diff --git a/chapters/means-and-contrasts.qmd b/chapters/means-and-contrasts.qmd index 416c04c..07d335d 100644 --- a/chapters/means-and-contrasts.qmd +++ b/chapters/means-and-contrasts.qmd @@ -6,13 +6,12 @@ execute: ## Background +To start off with, we need to define estimated marginal means (EMM). Estimated marginal means are defined as marginal means of a variable across all levels of other variables in a model, essentially giving a "population-level" average. -Estimated marginal means are defined as marginal means of a variable across all levels of other variables in a model, essentially giving a "population-level" average. -model predictions over the grid comprising all factor combinations – called the reference grid +The emmeans package is one of the most commonly used package in R in determine EMMs. This package provides methods for obtaining EMMs (also known as **least-squares means**) for factor combinations in a variety of models. The emmeans package is one of several alternatives to facilitate post hoc methods application and contrast analysis. It is a relatively recent replacement for the lsmeans package that some R users may be familiar with. It is intended for use with a wide variety of ANOVA models, including repeated measures and nested designs(mixed models). This is a flexible package has a set of detailed [**vignettes**](https://cran.r-project.org/web/packages/emmeans/vignettes/AQuickStart.html) and works with a lot of different model objects. In this chapter, we will demonstrate the extended use of the emmeans package to calculate estimated marginal means and contrasts. - To demonstrate the use of the emmeans() package. We will pull the model from split plot lesson ([**Chapter 6**](split-plot-design.qmd)), where we evaluated the effect of Nitrogen and Variety on Oat yield. This data contains 6 blocks, 3 main plots (Variety) and 4 subplots (Nitrogen). The primary outcome variable was oat yield. To read more about the experiment layout details please read RCBD split-plot section in [**Chapter 6**](split-plot-design.qmd). @@ -26,19 +25,17 @@ For demonstration of the `emmeans` package, we are fitting model with `nlme` pac Let's start the analysis by loading the required libraries for fitting linear mixed models using `nlme` package. ## Analysis Examples - ```{r, warning=FALSE, message=FALSE} library(nlme); library(performance); library(emmeans) library(dplyr); library(broom.mixed) ``` - ### Import data +Let's import oats data from the MASS package. ```{r} data1 <- MASS::oats ``` - ### Model fitting ```{r} model1 <- lme(Y ~ V + N + V:N , @@ -48,8 +45,6 @@ model1 <- lme(Y ~ V + N + V:N , tidy(model1) ``` - - ### Check Model Assumptions ```{r, fig.height=3} @@ -57,41 +52,35 @@ check_model(model1, check = c('normality', 'linearity')) ``` ### Model Inference - ```{r} anova(model1, type = "marginal") ``` ### Estimated Marginal Means +Now that we have a good model, let’s use the emmeans() function to obtain EMMs. + Now that we have fitted a linear mixed model (model1) and it meets the model assumption. Let's use the `emmeans()` function to obtain estimated marginal means for main (variety and nitrogen) and interaction (variety x nitrogen) effects. #### Main effects - ```{r} m1 <- emmeans(model1, ~V, level = 0.95) m1 +``` +```{r} m2 <- emmeans(model1, ~N) m2 ``` - #### Interaction effects - ```{r} m3 <- emmeans(model1, ~V*N) m3 - - m4 <- emmeans(model1, ~V|N) m4 - ``` - - - ```{r, eval=FALSE} One-way estimated marginal means and plot diff --git a/docs/search.json b/docs/search.json index d33f7ea..5f628e8 100644 --- a/docs/search.json +++ b/docs/search.json @@ -662,7 +662,7 @@ "href": "chapters/means-and-contrasts.html", "title": "12  Marginal Means & Contrasts", "section": "", - "text": "12.1 Estimated Marginal Means\nCompact letter display\n#library(multcomp); library(multcompView)\n#cld(m1, Letters= letters)\nThe letters indicating significant differences can be generated using cld() function from the ‘multcomp’ package”. In the output below, groups sharing a letter in the .group are not statistically different from each other.", + "text": "12.1 Background\nTo start off with, we need to define estimated marginal means (EMM). Estimated marginal means are defined as marginal means of a variable across all levels of other variables in a model, essentially giving a “population-level” average.\nThe emmeans package is one of the most commonly used package in R in determine EMMs. This package provides methods for obtaining EMMs (also known as least-squares means) for factor combinations in a variety of models. The emmeans package is one of several alternatives to facilitate post hoc methods application and contrast analysis. It is a relatively recent replacement for the lsmeans package that some R users may be familiar with. It is intended for use with a wide variety of ANOVA models, including repeated measures and nested designs(mixed models). This is a flexible package has a set of detailed vignettes and works with a lot of different model objects.\nIn this chapter, we will demonstrate the extended use of the emmeans package to calculate estimated marginal means and contrasts.\nTo demonstrate the use of the emmeans() package. We will pull the model from split plot lesson (Chapter 6), where we evaluated the effect of Nitrogen and Variety on Oat yield. This data contains 6 blocks, 3 main plots (Variety) and 4 subplots (Nitrogen). The primary outcome variable was oat yield. To read more about the experiment layout details please read RCBD split-plot section in Chapter 6.\nLet’s start the analysis by loading the required libraries for fitting linear mixed models using nlme package.", "crumbs": [ "12  Marginal Means and Contrasts" ] @@ -840,7 +840,7 @@ "href": "chapters/means-and-contrasts.html#analysis-examples", "title": "12  Marginal Means & Contrasts", "section": "12.2 Analysis Examples", - "text": "12.2 Analysis Examples\n\nlibrary(nlme); library(performance); library(emmeans)\nlibrary(dplyr); library(broom.mixed)\n\n\n12.2.1 Import data\n\ndata1 <- MASS::oats\n\n\n\n12.2.2 Model fitting\n\nmodel1 <- lme(Y ~ V + N + V:N ,\n random = ~1|B/V,\n data = data1, \n na.action = na.exclude)\ntidy(model1)\n\n\n\n12.2.3 Check Model Assumptions\n\ncheck_model(model1, check = c('normality', 'linearity'))\n\n\n\n12.2.4 Model Inference\n\nanova(model1, type = \"marginal\")\n\n\n\n12.2.5 Estimated Marginal Means\nNow that we have fitted a linear mixed model (model1) and it meets the model assumption. Let’s use the emmeans() function to obtain estimated marginal means for main (variety and nitrogen) and interaction (variety x nitrogen) effects.\n\n12.2.5.1 Main effects\n\nm1 <- emmeans(model1, ~V, level = 0.95)\nm1\n\nm2 <- emmeans(model1, ~N)\nm2\n\n\n\n12.2.5.2 Interaction effects\n\nm3 <- emmeans(model1, ~V*N)\nm3\n\n\nm4 <- emmeans(model1, ~V|N)\nm4\n\n\nOne-way estimated marginal means and plot\n\nlibrary(multcomp)\nlibrary(emmeans)\n\nmarginal = emmeans(model,\n ~ Location)\n\nCLD = cld(marginal,\n alpha=0.05,\n Letters=letters,\n adjust=\"tukey\")\n\nCLD\n\n\nLocation emmean SE df lower.CL upper.CL .group\n Olympia 8.333333 0.6718548 16 6.449596 10.21707 a \n Northampton 11.833333 0.6718548 16 9.949596 13.71707 b \n Ventura 13.333333 0.6718548 16 11.449596 15.21707 b \n Burlington 21.833333 0.6718548 16 19.949596 23.71707 c \n\n\n\n### Order the levels for printing\n\nCLD$Location = factor(CLD$Location,\n levels=c(\"Olympia\", \"Ventura\", \"Northampton\", \"Burlington\"))\n\n### Remove spaces in .group \n\nCLD$.group=gsub(\" \", \"\", CLD$.group)\n\n\n### Plot\n\nlibrary(ggplot2)\n\nggplot(CLD,\n aes(x = Location,\n y = emmean,\n label = .group)) +\n\n geom_point(shape = 15,\n size = 4) +\n\n geom_errorbar(aes(ymin = lower.CL,\n ymax = upper.CL),\n width = 0.2,\n size = 0.7) +\n\n theme_bw() +\n theme(axis.title = element_text(face = \"bold\"),\n axis.text = element_text(face = \"bold\"),\n plot.caption = element_text(hjust = 0)) +\n\n ylab(\"Estimated marginal mean\\nmidichlorian count\") +\n ggtitle (\"Midichlorian counts\",\n\n subtitle = \"In four U.S. cities\") +\n\n labs(caption = paste0(\"\\nMidichlorian counts for four locations. \",\n \"Boxes indicate the EM mean. \\n\",\n \"Error bars indicate the 95% \",\n \"confidence interval of the EM mean. \\n\",\n \"Means sharing a letter are not \",\n \"significantly different (Tukey-adjusted \\n\",\n \"comparisons).\"),\n hjust=0.5) +\n\n geom_text(nudge_x = c(0, 0, 0, 0),\n nudge_y = c(4, 4, 4, 4),\n color = \"black\")\n\n\n\n\n12.2.6 Interactions using Emmeans\n\nInteraction plot of estimated marginal means\n\nlibrary(multcomp)\nlibrary(emmeans)\n\nmarginal = emmeans(model,\n ~ Tribe:Location)\n\nCLD = cld(marginal,\n alpha=0.05,\n Letters=letters,\n adjust=\"tukey\")\n\nCLD\n\n\nTribe Location emmean SE df lower.CL upper.CL .group\n Sith Olympia 4.333333 0.9501462 16 1.354477 7.31219 a \n Jedi Northampton 8.666667 0.9501462 16 5.687810 11.64552 ab \n Sith Ventura 10.666667 0.9501462 16 7.687810 13.64552 bc \n Jedi Olympia 12.333333 0.9501462 16 9.354477 15.31219 bcd\n Sith Northampton 15.000000 0.9501462 16 12.021143 17.97886 cd\n Jedi Ventura 16.000000 0.9501462 16 13.021143 18.97886 d\n Jedi Burlington 20.666667 0.9501462 16 17.687810 23.64552 e\n Sith Burlington 23.000000 0.9501462 16 20.021143 25.97886 e\n\n\n\n### Order the levels for printing\n\nCLD$Location = factor(CLD$Location,\n levels=c(\"Olympia\", \"Ventura\", \"Northampton\", \"Burlington\"))\n\nCLD$Tribe = factor(CLD$Tribe,\n levels=c(\"Jedi\", \"Sith\"))\n\n### Remove spaces in .group \n\nCLD$.group=gsub(\" \", \"\", CLD$.group)\n\n\nCLD\n\n\n### Plot\n\nlibrary(ggplot2)\n\npd = position_dodge(0.4) ### How much to jitter the points on the plot\n\nggplot(CLD,\n aes(x = Location,\n y = emmean,\n color = Tribe,\n label = .group)) +\n\n geom_point(shape = 15,\n size = 4,\n position = pd) +\n\n geom_errorbar(aes(ymin = lower.CL,\n ymax = upper.CL),\n width = 0.2,\n size = 0.7,\n position = pd) +\n\n theme_bw() +\n theme(axis.title = element_text(face = \"bold\"),\n axis.text = element_text(face = \"bold\"),\n plot.caption = element_text(hjust = 0)) +\n\n ylab(\"Estimated marginal mean\\nmidichlorian count\") +\n ggtitle (\"Midichlorian counts for Jedi and Sith\",\n subtitle = \"In four U.S. cities\") +\n \n labs(caption = paste0(\"\\nMidichlorian counts for two tribes across \",\n \"four locations. Boxes indicate \\n\",\n \"the EM mean. \",\n \"Error bars indicate the 95% confidence \",\n \"interval \",\n \"of the EM \\n\",\n \"mean. Means sharing a letter are \",\n \"not significantly different \\n\",\n \"(Tukey-adjusted comparisons).\"),\n hjust=0.5) +\n \n geom_text(nudge_x = c(0.1, -0.1, 0.1, -0.1, 0.1, -0.1, -0.1, 0.1),\n nudge_y = c(4.5, 4.5, 4.5, 4.5, 4.5 , 4.5, 4.5, 4.5),\n color = \"black\") +\n \n scale_color_manual(values = c(\"blue\", \"red\"))\n\n\n\n12.2.7 Contrasts using Emmeans\n\n(warp.emm <- emmeans(warp.lm, ~ tension | wool))\n\ncontrast(warp.emm, \"poly\")\n\nCompact letter display\n\n#library(multcomp); library(multcompView)\n\n\n#cld(m1, Letters= letters)\n\nThe letters indicating significant differences can be generated using cld() function from the ‘multcomp’ package”. In the output below, groups sharing a letter in the .group are not statistically different from each other.\nP values, “significance”, and recommendations : https://cran.r-project.org/web/packages/emmeans/vignettes/basics.html#emms\nSummary of main points EMMs are derived from a model. A different model for the same data may lead to different EMMs. EMMs are based on a reference grid consisting of all combinations of factor levels, with each covariate set to its average (by default). For purposes of defining the reference grid, dimensions of a multivariate response are treated as levels of a factor. EMMs are then predictions on this reference grid, or marginal averages thereof (equally weighted by default). Reference grids may be modified using at or other arguments for ref_grid() Reference grids and emmeans() results may be plotted via plot() (for parallel confidence intervals) or emmip() (for an interaction-style plot). Be cautious with the terms “significant” and “nonsignificant”, and don’t ever interpret a “nonsignificant” result as saying that there is no effect. Follow good statistical practices such as getting the model right first, and using adjusted P values for appropriately chosen families of comparisons or contrasts.\n\n\n\n\n\n\nCautionary Note about CLD\n\n\n\nIt’s important to note that we cannot conclude that treatment levels with the same letter are the same. We can only conclude that they are not different.\nThere is a separate branch of statistics, “equivalence testing” that is for ascertaining if things are sufficiently similar to conclude they are equivalent.\nSee ?sec-cld_warning for additional warnings about problems with using compact letter display.\n\n\n\n\n\n\n\n\nMore details on emmeans\n\n\n\nIf you want to read more about emmeans, please refer to vignettes on this CRAN page.", + "text": "12.2 Analysis Examples\n\nlibrary(nlme); library(performance); library(emmeans)\nlibrary(dplyr); library(broom.mixed)\n\n\n12.2.1 Import data\nLet’s import oats data from the MASS package.\n\ndata1 <- MASS::oats\n\n\n\n12.2.2 Model fitting\n\nmodel1 <- lme(Y ~ V + N + V:N ,\n random = ~1|B/V,\n data = data1, \n na.action = na.exclude)\ntidy(model1)\n\n\n\n12.2.3 Check Model Assumptions\n\ncheck_model(model1, check = c('normality', 'linearity'))\n\n\n\n12.2.4 Model Inference\n\nanova(model1, type = \"marginal\")\n\n\n\n12.2.5 Estimated Marginal Means\nNow that we have a good model, let’s use the emmeans() function to obtain EMMs.\nNow that we have fitted a linear mixed model (model1) and it meets the model assumption. Let’s use the emmeans() function to obtain estimated marginal means for main (variety and nitrogen) and interaction (variety x nitrogen) effects.\n\n12.2.5.1 Main effects\n\nm1 <- emmeans(model1, ~V, level = 0.95)\nm1\n\n\nm2 <- emmeans(model1, ~N)\nm2\n\n\n\n12.2.5.2 Interaction effects\n\nm3 <- emmeans(model1, ~V*N)\nm3\nm4 <- emmeans(model1, ~V|N)\nm4\n\n\nOne-way estimated marginal means and plot\n\nlibrary(multcomp)\nlibrary(emmeans)\n\nmarginal = emmeans(model,\n ~ Location)\n\nCLD = cld(marginal,\n alpha=0.05,\n Letters=letters,\n adjust=\"tukey\")\n\nCLD\n\n\nLocation emmean SE df lower.CL upper.CL .group\n Olympia 8.333333 0.6718548 16 6.449596 10.21707 a \n Northampton 11.833333 0.6718548 16 9.949596 13.71707 b \n Ventura 13.333333 0.6718548 16 11.449596 15.21707 b \n Burlington 21.833333 0.6718548 16 19.949596 23.71707 c \n\n\n\n### Order the levels for printing\n\nCLD$Location = factor(CLD$Location,\n levels=c(\"Olympia\", \"Ventura\", \"Northampton\", \"Burlington\"))\n\n### Remove spaces in .group \n\nCLD$.group=gsub(\" \", \"\", CLD$.group)\n\n\n### Plot\n\nlibrary(ggplot2)\n\nggplot(CLD,\n aes(x = Location,\n y = emmean,\n label = .group)) +\n\n geom_point(shape = 15,\n size = 4) +\n\n geom_errorbar(aes(ymin = lower.CL,\n ymax = upper.CL),\n width = 0.2,\n size = 0.7) +\n\n theme_bw() +\n theme(axis.title = element_text(face = \"bold\"),\n axis.text = element_text(face = \"bold\"),\n plot.caption = element_text(hjust = 0)) +\n\n ylab(\"Estimated marginal mean\\nmidichlorian count\") +\n ggtitle (\"Midichlorian counts\",\n\n subtitle = \"In four U.S. cities\") +\n\n labs(caption = paste0(\"\\nMidichlorian counts for four locations. \",\n \"Boxes indicate the EM mean. \\n\",\n \"Error bars indicate the 95% \",\n \"confidence interval of the EM mean. \\n\",\n \"Means sharing a letter are not \",\n \"significantly different (Tukey-adjusted \\n\",\n \"comparisons).\"),\n hjust=0.5) +\n\n geom_text(nudge_x = c(0, 0, 0, 0),\n nudge_y = c(4, 4, 4, 4),\n color = \"black\")\n\n\n\n\n12.2.6 Interactions using Emmeans\n\nInteraction plot of estimated marginal means\n\nlibrary(multcomp)\nlibrary(emmeans)\n\nmarginal = emmeans(model,\n ~ Tribe:Location)\n\nCLD = cld(marginal,\n alpha=0.05,\n Letters=letters,\n adjust=\"tukey\")\n\nCLD\n\n\nTribe Location emmean SE df lower.CL upper.CL .group\n Sith Olympia 4.333333 0.9501462 16 1.354477 7.31219 a \n Jedi Northampton 8.666667 0.9501462 16 5.687810 11.64552 ab \n Sith Ventura 10.666667 0.9501462 16 7.687810 13.64552 bc \n Jedi Olympia 12.333333 0.9501462 16 9.354477 15.31219 bcd\n Sith Northampton 15.000000 0.9501462 16 12.021143 17.97886 cd\n Jedi Ventura 16.000000 0.9501462 16 13.021143 18.97886 d\n Jedi Burlington 20.666667 0.9501462 16 17.687810 23.64552 e\n Sith Burlington 23.000000 0.9501462 16 20.021143 25.97886 e\n\n\n\n### Order the levels for printing\n\nCLD$Location = factor(CLD$Location,\n levels=c(\"Olympia\", \"Ventura\", \"Northampton\", \"Burlington\"))\n\nCLD$Tribe = factor(CLD$Tribe,\n levels=c(\"Jedi\", \"Sith\"))\n\n### Remove spaces in .group \n\nCLD$.group=gsub(\" \", \"\", CLD$.group)\n\n\nCLD\n\n\n### Plot\n\nlibrary(ggplot2)\n\npd = position_dodge(0.4) ### How much to jitter the points on the plot\n\nggplot(CLD,\n aes(x = Location,\n y = emmean,\n color = Tribe,\n label = .group)) +\n\n geom_point(shape = 15,\n size = 4,\n position = pd) +\n\n geom_errorbar(aes(ymin = lower.CL,\n ymax = upper.CL),\n width = 0.2,\n size = 0.7,\n position = pd) +\n\n theme_bw() +\n theme(axis.title = element_text(face = \"bold\"),\n axis.text = element_text(face = \"bold\"),\n plot.caption = element_text(hjust = 0)) +\n\n ylab(\"Estimated marginal mean\\nmidichlorian count\") +\n ggtitle (\"Midichlorian counts for Jedi and Sith\",\n subtitle = \"In four U.S. cities\") +\n \n labs(caption = paste0(\"\\nMidichlorian counts for two tribes across \",\n \"four locations. Boxes indicate \\n\",\n \"the EM mean. \",\n \"Error bars indicate the 95% confidence \",\n \"interval \",\n \"of the EM \\n\",\n \"mean. Means sharing a letter are \",\n \"not significantly different \\n\",\n \"(Tukey-adjusted comparisons).\"),\n hjust=0.5) +\n \n geom_text(nudge_x = c(0.1, -0.1, 0.1, -0.1, 0.1, -0.1, -0.1, 0.1),\n nudge_y = c(4.5, 4.5, 4.5, 4.5, 4.5 , 4.5, 4.5, 4.5),\n color = \"black\") +\n \n scale_color_manual(values = c(\"blue\", \"red\"))\n\n\n\n12.2.7 Contrasts using Emmeans\n\n(warp.emm <- emmeans(warp.lm, ~ tension | wool))\n\ncontrast(warp.emm, \"poly\")\n\nCompact letter display\n\n#library(multcomp); library(multcompView)\n\n\n#cld(m1, Letters= letters)\n\nThe letters indicating significant differences can be generated using cld() function from the ‘multcomp’ package”. In the output below, groups sharing a letter in the .group are not statistically different from each other.", "crumbs": [ "12  Marginal Means and Contrasts" ] @@ -854,5 +854,15 @@ "crumbs": [ "12  Marginal Means and Contrasts" ] + }, + { + "objectID": "chapters/means-and-contrasts.html#using-adjusted-p-values", + "href": "chapters/means-and-contrasts.html#using-adjusted-p-values", + "title": "12  Marginal Means & Contrasts", + "section": "12.3 using adjusted P-values", + "text": "12.3 using adjusted P-values\nP values, “significance”, and recommendations : https://cran.r-project.org/web/packages/emmeans/vignettes/basics.html#emms\nSummary of main points EMMs are derived from a model. A different model for the same data may lead to different EMMs. EMMs are based on a reference grid consisting of all combinations of factor levels, with each covariate set to its average (by default). For purposes of defining the reference grid, dimensions of a multivariate response are treated as levels of a factor. EMMs are then predictions on this reference grid, or marginal averages thereof (equally weighted by default). Reference grids may be modified using at or other arguments for ref_grid() Reference grids and emmeans() results may be plotted via plot() (for parallel confidence intervals) or emmip() (for an interaction-style plot). Be cautious with the terms “significant” and “nonsignificant”, and don’t ever interpret a “nonsignificant” result as saying that there is no effect. Follow good statistical practices such as getting the model right first, and using adjusted P values for appropriately chosen families of comparisons or contrasts.\n\n\n\n\n\n\nCautionary Note about CLD\n\n\n\nIt’s important to note that we cannot conclude that treatment levels with the same letter are the same. We can only conclude that they are not different.\nThere is a separate branch of statistics, “equivalence testing” that is for ascertaining if things are sufficiently similar to conclude they are equivalent.\nSee ?sec-cld_warning for additional warnings about problems with using compact letter display.\n\n\n\n\n\n\n\n\nMore details on emmeans\n\n\n\nIf you want to read more about emmeans, please refer to vignettes on this CRAN page.", + "crumbs": [ + "12  Marginal Means and Contrasts" + ] } ] \ No newline at end of file diff --git a/renv.lock b/renv.lock index 6be96c7..ae3a180 100644 --- a/renv.lock +++ b/renv.lock @@ -42,7 +42,7 @@ }, "MASS": { "Package": "MASS", - "Version": "7.3-60.2", + "Version": "7.3-61", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -53,11 +53,11 @@ "stats", "utils" ], - "Hash": "2f342c46163b0b54d7b64d1f798e2c78" + "Hash": "0cafd6f0500e5deba33be22c46bf6055" }, "Matrix": { "Package": "Matrix", - "Version": "1.7-0", + "Version": "1.7-1", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -70,7 +70,7 @@ "stats", "utils" ], - "Hash": "1920b2f11133b12350024297d8a4ff4a" + "Hash": "5122bb14d8736372411f955e1b16bc8a" }, "MatrixModels": { "Package": "MatrixModels",