From 9ab38f8edc5354d8f4188cff83d0fae761820baa Mon Sep 17 00:00:00 2001 From: Harpreet Kaur Date: Tue, 23 Jul 2024 12:57:35 -0700 Subject: [PATCH] incomplete block design --- chapters/Factorial_design.qmd | 2 + chapters/Incomplete_block.qmd | 103 ++++++++++++++++++++++++++++++++++ chapters/background.qmd | 12 ++++ chapters/split_plot.qmd | 42 +++++++------- 4 files changed, 137 insertions(+), 22 deletions(-) create mode 100644 chapters/Incomplete_block.qmd diff --git a/chapters/Factorial_design.qmd b/chapters/Factorial_design.qmd index 43e3c9f..e77c488 100644 --- a/chapters/Factorial_design.qmd +++ b/chapters/Factorial_design.qmd @@ -4,6 +4,8 @@ title: "Factorial_design" ## Background +add description of factorial design + ::: callout-note ## A note ::: diff --git a/chapters/Incomplete_block.qmd b/chapters/Incomplete_block.qmd new file mode 100644 index 0000000..05fd25b --- /dev/null +++ b/chapters/Incomplete_block.qmd @@ -0,0 +1,103 @@ +--- +title: "Balanced incomplete block experiment" +--- + +# Balanced incomplete block experiment + +## Background + +The block design in **link RCBD guide** was complete, meaning that every block contained all the treatments. In practice, it may not be possible to have too many treatments in each block. Sometimes, there are also situations where it is advisable not to have many treatments in each block. + +In such cases, incomplete block designs are used, where we have to decide which subset of treatments to use in each individual block. This works well if we have enough blocks. However, if we only have a small number of blocks, there is a risk that certain quantities are no longer estimable. 
+ +To avoid having a disconnected design, a balanced incomplete block design can be used. + +::: callout-note +## A note + + +::: + + +## Example Analysis + +https://kwstat.github.io/agridat/reference/weiss.incblock.html + +```{r} + library(agridat) + data(weiss.incblock) + dat <- weiss.incblock +``` + +```{r} + library(desplot) + desplot(dat, yield~col*row, + text=gen, shorten='none', cex=.6, out1=block, + aspect=252/96, # true aspect + main="weiss.incblock") + +``` + +::: {.panel-tabset} + +### lme4 + +```{r, message=FALSE} + +``` + +### tidymodels + +```{r, message=FALSE} + +``` + + +::: + + + + +| | | +|----------|----------------------------------------| + + +### *Data integrity checks* + + + +### Model Building + + +::: {.column-margin} + +Recall the model: + +$$ $$ + +::: + +Here is the R syntax for that statistical model: + +::: {.panel-tabset} + +### lme4 + +```{r} + +``` + +### tidymodels + +```{r} + +``` + +::: + + + +### Check Model Assumptions + + +### Inference diff --git a/chapters/background.qmd b/chapters/background.qmd index 2f75909..4900a97 100644 --- a/chapters/background.qmd +++ b/chapters/background.qmd @@ -31,3 +31,15 @@ $$ Y = (\beta_0 + ai) + (\beta_1 + b_i)(X) + 𝜺$$ In this model, a*i* and b*i* are random effects for subject *i* applied to the intercept and slope, respectively. Predictions would vary depending on each subject’s slope and intercept terms: ![Mixed Model with random intercept and slope](/img/random_intercept_and_slope.png) + +### Random-effect syntax + +- (1\| group): random intercept with fixed mean. + +- (1\| g1/g2): intercepts vary among g1 and g2 within g1. + +- (1 \| g1) + (1 \| g2): random intercepts for 2 variables. + +- x + (x \| g): correlated random slope and intercept. + +- x + (x \|\| g): uncorrelated random slope and intercept. 
diff --git a/chapters/split_plot.qmd b/chapters/split_plot.qmd index effd6d4..6907188 100644 --- a/chapters/split_plot.qmd +++ b/chapters/split_plot.qmd @@ -30,12 +30,17 @@ $$ \epsilon \sim N(0, \sigma_\epsilon)$$ $$\ \delta \sim N(0, \sigma_\delta)$$ -Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma$ and $sigma_B$, respectively. +Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma_\epsilon$ and $\sigma_\delta$, respectively. + +#### 'iid' assumption for error terms + +In these models, the error terms, $\epsilon$, are assumed to be "iid", that is, independent and identically distributed. This means they have constant variance and each individual error term is independent of the others. ![Split Plot CRD Design](images/split_plot_CRD-01.jpeg){fig-align="center" width="341"} ![Split Plot RCBD Design](images/Split_plot_RCBD.jpeg){fig-align="center" width="348"} -## Analysis Examples + +### Analysis Examples ### Example model for CRD Split Plot Designs @@ -51,7 +56,6 @@ library(ggplot2) library(emmeans) library(lme4) library(multcompView) -install.packages("performance") library(performance) ``` @@ -64,7 +68,6 @@ height_data <- read_excel(here::here("data", "height_data.xlsx")) table(height_data$time, height_data$manage) str(height_data) - ``` 3. Explore data @@ -83,22 +86,18 @@ data(gomez.splitplot.subsample.txt) The statistical model structure for split plot design: -$$y_{ijk} = \mu + \rho_i + \alpha_j + \beta_k + (\alpha_j\beta_k) + \epsilon_{ijk}$$ +$$y_{ijk} = \mu + \gamma_i + \alpha_j + \beta_k + (\alpha_j\beta_k) + \epsilon_{ijk}$$ Where: -$\mu$ = overall experimental mean, $\rho$ = block/rep effect (random), $\alpha$ = main effect of whole plot (fixed), $\beta$ = main effect of subplot (fixed), $\alpha$$\beta$ = interaction between factors A and B, $\epsilon$ = error. 
+$\mu$ = overall experimental mean, $\gamma$ = block/rep effect (random), $\alpha$ = main effect of whole plot (fixed), $\beta$ = main effect of subplot (fixed), $\alpha$$\beta$ = interaction between factors A and B, $\epsilon$ = error. $$ \epsilon \sim N(0, \sigma)$$ -$$ \rho \sim N(0, \sigma_b)$$ +$$ \gamma \sim N(0, \sigma_b)$$ Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma$ and $sigma_B$, respectively. -#### 'iid' assumption for error terms - -In this model, the error terms, $\epsilon$ are assumed to be "iid", that is, independently and identically distributed. This means they have constant variance and they each individual error term is independent from the others. - ```{r} model1<-lmer(height ~ time*manage + (1|rep/time), data=height_data) @@ -189,9 +188,15 @@ Post-Hoc analysis emm <- emmeans(model2, ~ variety * nutrient) comparison <- cld(emm, Letters = LETTERS, reversed = T) comparison ``` +::: callout-note +## `na.action = na.exclude` + +The RCBD split-plot design is also commonly called the split-block design or the strip-plot design. +::: + ### Split-split plot design in R -In this example, we have a yield data of 3 different rice varieties grown under 3 management practices and 5 Nitrogen levels. In this spliy-split design: +In this example, we have yield data for 3 different rice varieties grown under 3 management practices and 5 nitrogen levels. In this split-split design: blocks = block (3 blocks), @@ -203,6 +208,8 @@ sub-subplot = variety (3 levels) Statistical model: +**add model equation here** + Here, we are extracting the rice yield data from `agricolae` package. 
```{r} @@ -226,16 +233,7 @@ rice$management<-factor(rice$management) rice$variety<-factor(rice$variety) ``` -Statistical model - -```{r} -library(ggplot2) -library(emmeans) -library(lme4) -library(multcompView) -install.packages("performance") -library(performance) -``` +Statistical model: Here is the basic split-split plot analysis. We can use the nesting notation in the random part because nitrogen and management are nested in blocks. We can do blocks as fixed or random.