add some overview
gavinsimpson committed Oct 14, 2022
1 parent 2c89ba4 commit af47247
Showing 2 changed files with 180 additions and 0 deletions.
90 changes: 90 additions & 0 deletions day-5/index.Rmd
@@ -581,6 +581,96 @@ class: inverse center middle subsection

# Example

---
class: inverse center middle subsection

# Overview

---

# Overview

* We choose to use GAMs when we expect non-linear relationships between covariates and $y$

* GAMs represent non-linear functions $f_j(x_{ij})$ using splines

* Splines are big functions made up of little functions: *basis functions*

* Estimate a coefficient $\beta_k$ for each basis function $b_k$

* As a user we need to set `k`, the upper limit on the wiggliness of each $f_j()$

* Avoid overfitting through a wiggliness penalty — curvature or 2nd derivative
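
The points above can be sketched with a small simulated example. This is only an illustration under stated assumptions: the data, the seed, and the object names `dat` and `m` are made up here and are not part of the course material.

```r
library(mgcv)

# Simulated data (an assumption for illustration): a smooth signal plus noise
set.seed(1)
n <- 200
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

# k = 10 sets the basis dimension, the upper limit on wiggliness of f(x)
m <- gam(y ~ s(x, k = 10), data = dat, method = "REML")

# One coefficient per basis function: the intercept plus 9 for s(x)
# (one basis function is lost to the identifiability constraint)
n_coef <- length(coef(m))

# The wiggliness penalty shrinks the fit, so the effective degrees of
# freedom cannot exceed the number of coefficients
total_edf <- sum(m$edf)
```

Comparing `total_edf` with `n_coef` shows how much of the available flexibility the penalty has removed.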

---

# Overview

* GAMs are just fancy GLMs — usual diagnostics apply `gam.check()` or `appraise()`

* Check you have the right distribution `family` using QQ plot, plot of residuals vs $\eta_i$, DHARMa residuals

* But have to check that the value(s) of `k` were large enough with `k.check()`

* Model selection can be done with `select = TRUE` or `bs = "ts"` or `bs = "cs"`

* Plot your fitted smooths using `plot.gam()` or `draw()`

* Produce hypotheticals using `data_slice()` and `fitted_values()` or `predict()`
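
A minimal sketch of this workflow, using only {mgcv} so that it runs without extra packages. The simulated data from `gamSim()` and the names `kc` and `new_dat` are assumptions made for illustration; `appraise()`, `draw()`, `data_slice()` and `fitted_values()` from {gratia} cover the same steps with less code.

```r
library(mgcv)

# Simulated example data shipped with mgcv (an assumption for illustration)
set.seed(2)
dat <- gamSim(1, n = 300, verbose = FALSE)

# select = TRUE adds an extra penalty so whole terms can shrink out of the model
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
         data = dat, method = "REML", select = TRUE)

# Check whether each k was large enough: one row per smooth
kc <- k.check(m)
print(kc)

# A hypothetical: vary x2 over its range, hold the other covariates at their medians
new_dat <- data.frame(x0 = median(dat$x0),
                      x1 = median(dat$x1),
                      x2 = seq(0, 1, length.out = 50),
                      x3 = median(dat$x3))
p <- predict(m, newdata = new_dat, se.fit = TRUE)
new_dat$fit   <- as.numeric(p$fit)
new_dat$lower <- new_dat$fit - 2 * as.numeric(p$se.fit)
new_dat$upper <- new_dat$fit + 2 * as.numeric(p$se.fit)
```

`gam.check(m)` and `plot(m, pages = 1)` produce the residual diagnostics and the plots of the fitted smooths.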

---

# Overview

* Avoid fitting multiple models dropping terms in turn

* Can use AIC to select among models for prediction

* GAMs should be fitted with `method = "REML"` or `"ML"`

* Then they are an empirical Bayesian model

* Can explore uncertainty in estimates by sampling from the posterior of smooths or the model
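
One way to sample from the posterior, sketched with {mgcv} alone. It uses a Gaussian approximation to the posterior of the coefficients; the simulated data and the names `Xp`, `beta_draws` and `fits` are assumptions made for illustration.

```r
library(mgcv)

# Simulated data (an assumption for illustration)
set.seed(3)
n <- 200
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

m <- gam(y ~ s(x), data = dat, method = "REML")

# Matrix that maps coefficients to fitted values on a grid of x
new_dat <- data.frame(x = seq(0, 1, length.out = 100))
Xp <- predict(m, newdata = new_dat, type = "lpmatrix")

# Draw coefficient vectors from the approximate posterior N(beta_hat, Vp);
# vcov() on a gam returns the Bayesian covariance matrix by default
beta_draws <- rmvn(500, coef(m), vcov(m))

# Each column of `fits` is one posterior draw of the fitted curve
fits <- Xp %*% t(beta_draws)

# Pointwise 95% interval from the draws
ci <- apply(fits, 1, quantile, probs = c(0.025, 0.975))
```

Any quantity derived from the curve (its maximum, a difference between two values of `x`) can be computed per draw and summarised in the same way.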

---

# Overview

* The default basis is the low-rank thin plate spline

* Good properties but can be slow to set up — use `bs = "cr"` with big data

* Other basis types are available — most aren't needed in general but do have specific uses

* Tensor product smooths allow us to add smooth interactions to our models with `te()` or `t2()`

* `s()` can be used for multivariate smooths, but assumes isotropy

* Use `s(x) + s(z) + ti(x,z)` to test for an interaction
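
The interaction test in the last bullet can be sketched as follows. The simulated data, which include an `x * z` interaction by construction, and the names `dat`, `m` and `s_tab` are assumptions made for illustration.

```r
library(mgcv)

# Simulated data with a built-in interaction (an assumption for illustration)
set.seed(4)
n <- 400
dat <- data.frame(x = runif(n), z = runif(n))
dat$y <- sin(2 * pi * dat$x) + (dat$z - 0.5)^2 +
  2 * dat$x * dat$z + rnorm(n, sd = 0.3)

# Main-effect smooths plus a tensor product interaction that excludes
# the main effects, so the ti() term isolates the interaction
m <- gam(y ~ s(x) + s(z) + ti(x, z), data = dat, method = "REML")

# The row for the ti() term gives the approximate test of the interaction
s_tab <- summary(m)$s.table
print(s_tab)
```

If the covariates are on the same scale and isotropy is reasonable, `s(x, z)` is an alternative for the full two-dimensional smooth; `te(x, z)` is the tensor product equivalent when they are not.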

---

# Overview

* Smoothing temporal or spatial data can be tricky due to autocorrelation

* In some cases we can fit separate smooth trends & autocorrelation processes

* But these models can fail often

* Including smooths of space and time in your model can remove other effects: **confounding**
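
A sketch of fitting a smooth trend and an autocorrelation process separately, using `gamm()` with an AR(1) structure from {nlme}. The simulated series and the names `dat`, `m` and `phi` are assumptions made for illustration, and, as noted above, fits of this kind do not always converge on real data.

```r
library(mgcv)
library(nlme)  # provides corAR1(); gamm() fits the model via lme()

# Simulated series: smooth trend plus AR(1) noise (an assumption for illustration)
set.seed(5)
n <- 200
dat <- data.frame(time = 1:n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.3))
dat$y <- sin(2 * pi * dat$time / n) + e

# Smooth trend in time with AR(1) errors
m <- gamm(y ~ s(time, k = 10), data = dat,
          correlation = corAR1(form = ~ time), method = "REML")

# The smooth lives in m$gam, the correlation parameter in m$lme
summary(m$gam)
phi <- coef(m$lme$modelStruct$corStruct, unconstrained = FALSE)
print(phi)
```

A large `k` combined with a strongly autocorrelated series makes the trend and the AR(1) process hard to tell apart, which is one reason these models can fail.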

---

# Overview

* {mgcv} smooths can be used in other software

* Bayesian GAMs are well catered for with {brms}

* Consider more than the mean parameter — distributional GAMs

* Consider modeling empirical quantiles using quantile GAMs
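
A distributional GAM can be sketched within {mgcv} using the `gaulss()` family, which models both the mean and the standard deviation of a Gaussian response. The simulated data and the names `dat`, `m` and `fv` are assumptions made for illustration; {brms} and quantile GAM software are not shown here.

```r
library(mgcv)

# Simulated data where the spread of y grows with x (an assumption for illustration)
set.seed(6)
n <- 500
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = exp(-1 + dat$x))

# One formula per parameter: the first for the mean, the second for the scale
m <- gam(list(y ~ s(x), ~ s(x)), data = dat, family = gaulss())

# Fitted values are a two-column matrix:
# column 1 is the mean, column 2 is the inverse of the standard deviation
fv <- fitted(m)
sd_hat <- 1 / fv[, 2]
```

Plotting `sd_hat` against `dat$x` shows how the estimated spread changes with the covariate.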

---

90 changes: 90 additions & 0 deletions day-5/index.html
@@ -520,6 +520,96 @@

# Example

---
class: inverse center middle subsection

# Overview

---

# Overview

* We choose to use GAMs when we expect non-linear relationships between covariates and `\(y\)`

* GAMs represent non-linear functions `\(f_j(x_{ij})\)` using splines

* Splines are big functions made up of little functions: *basis functions*

* Estimate a coefficient `\(\beta_k\)` for each basis function `\(b_k\)`

* As a user we need to set `k`, the upper limit on the wiggliness of each `\(f_j()\)`

* Avoid overfitting through a wiggliness penalty — curvature or 2nd derivative

---

# Overview

* GAMs are just fancy GLMs — usual diagnostics apply `gam.check()` or `appraise()`

* Check you have the right distribution `family` using QQ plot, plot of residuals vs `\(\eta_i\)`, DHARMa residuals

* But have to check that the value(s) of `k` were large enough with `k.check()`

* Model selection can be done with `select = TRUE` or `bs = "ts"` or `bs = "cs"`

* Plot your fitted smooths using `plot.gam()` or `draw()`

* Produce hypotheticals using `data_slice()` and `fitted_values()` or `predict()`

---

# Overview

* Avoid fitting multiple models dropping terms in turn

* Can use AIC to select among models for prediction

* GAMs should be fitted with `method = "REML"` or `"ML"`

* Then they are an empirical Bayesian model

* Can explore uncertainty in estimates by sampling from the posterior of smooths or the model

---

# Overview

* The default basis is the low-rank thin plate spline

* Good properties but can be slow to set up — use `bs = "cr"` with big data

* Other basis types are available — most aren't needed in general but do have specific uses

* Tensor product smooths allow us to add smooth interactions to our models with `te()` or `t2()`

* `s()` can be used for multivariate smooths, but assumes isotropy

* Use `s(x) + s(z) + ti(x,z)` to test for an interaction

---

# Overview

* Smoothing temporal or spatial data can be tricky due to autocorrelation

* In some cases we can fit separate smooth trends & autocorrelation processes

* But these models can fail often

* Including smooths of space and time in your model can remove other effects: **confounding**

---

# Overview

* {mgcv} smooths can be used in other software

* Bayesian GAMs are well catered for with {brms}

* Consider more than the mean parameter — distributional GAMs

* Consider modeling empirical quantiles using quantile GAMs

---
