incomplete block design

IdahoAgStats · Jul 23, 2024 · 9ab38f8
1 parent 5b913bb
commit 9ab38f8
Showing 4 changed files with 137 additions and 22 deletions.
diff --git a/chapters/Factorial_design.qmd b/chapters/Factorial_design.qmd
@@ -4,6 +4,8 @@ title: "Factorial_design"
 
 ## Background
 
+add description of factorial design
+
 ::: callout-note
 ## A note
 :::

diff --git a/chapters/Incomplete_block.qmd b/chapters/Incomplete_block.qmd
@@ -0,0 +1,103 @@
+---
+title: "Balanced incomplete block experiment"
+---
+
+# Balanced incomplete block experiment
+
+## Background
+
+The block design in **link RCBD guide** was complete, meaning that every block contained all the treatments. In practice, it may not be possible to fit all the treatments in each block. There are also situations where it is advisable to limit the number of treatments in each block.
+
+In such cases, incomplete block designs are used, where we have to decide which subset of treatments is used in each individual block. This works well if we have enough blocks. However, if we only have a small number of blocks, there is a risk that certain quantities are no longer estimable.
+
+To avoid a disconnected design, a balanced incomplete block design can be used, in which every pair of treatments appears together in the same number of blocks.
+
+::: callout-note
+## A note
+
+
+:::
+
+
+## Example Analysis
+
+https://kwstat.github.io/agridat/reference/weiss.incblock.html
+
+```{r}
+library(agridat)
+data(weiss.incblock)
+dat <- weiss.incblock
+```
+
+```{r}
+library(desplot)
+desplot(dat, yield ~ col*row,
+        text = gen, shorten = "none", cex = 0.6, out1 = block,
+        aspect = 252/96, # true aspect
+        main = "weiss.incblock")
+
+```
+
+::: {.panel-tabset}
+
+### lme4
+
+```{r, message=FALSE}
+
+```
+
+### tidymodels
+
+```{r, message=FALSE}
+
+```
+
+
+:::
+
+
+
+
+|   |   |
+|----------|----------------------------------------|    
+
+
+### *Data integrity checks*
+
+
+
+### Model Building
+
+
+::: {.column-margin}
+
+Recall the model:
+
+$$ $$ 
+
+:::
+
+Here is the R syntax for that statistical model:
+
+::: {.panel-tabset}
+
+### lme4
+
+```{r}
+
+```
+
+### tidymodels
+
+```{r}
+
+```
+
+:::
+
+
+
+### Check Model Assumptions
+
+
+### Inference
diff --git a/chapters/background.qmd b/chapters/background.qmd
@@ -31,3 +31,15 @@ $$  Y = (\beta_0 + ai) + (\beta_1 + b_i)(X) + 𝜺$$
 In this model, a*i* and b*i* are random effects for subject *i* applied to the intercept and slope, respectively. Predictions would vary depending on each subject’s slope and intercept terms:
 
 ![Mixed Model with random intercept and slope](/img/random_intercept_and_slope.png)
+
+### Random-effect syntax
+
+-   (1 \| group): random intercept for each level of group, varying around a fixed mean.
+
+-   (1 \| g1/g2): random intercepts for g1 and for g2 nested within g1.
+
+-   (1 \| g1) + (1 \| g2): separate random intercepts for two grouping variables.
+
+-   x + (x \| g): correlated random slope and intercept.
+
+-   x + (x \|\| g): uncorrelated random slope and intercept.
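The random-effect syntax listed in this hunk can be tried out directly. The following is a minimal sketch, not part of the commit, assuming the `lme4` package is installed. It uses the `sleepstudy`, `Pastes`, and `Penicillin` example data sets that ship with `lme4`; none of them are used in the chapter itself, and the model names are illustrative only.

```r
library(lme4)

# (1 | group): random intercept for each subject
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# (1 | g1/g2): casks nested within batches;
# expands to (1 | batch) + (1 | batch:cask)
m_nested <- lmer(strength ~ 1 + (1 | batch/cask), data = Pastes)

# (1 | g1) + (1 | g2): separate random intercepts for plate and sample
m_crossed <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample), data = Penicillin)

# x + (x | g): random intercept and slope, correlation estimated
m_corr <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# x + (x || g): random intercept and slope, correlation fixed at zero
m_uncorr <- lmer(Reaction ~ Days + (Days || Subject), data = sleepstudy)

# Compare the variance components of the last two fits
VarCorr(m_corr)
VarCorr(m_uncorr)
```

Comparing the two `VarCorr()` outputs shows the practical difference between `|` and `||`: the first reports an intercept-slope correlation for `Subject`, the second does not.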
diff --git a/chapters/split_plot.qmd b/chapters/split_plot.qmd
@@ -30,12 +30,17 @@ $$ \epsilon \sim N(0, \sigma_\epsilon)$$
 
 $$\ \delta  \sim N(0, \sigma_\delta)$$
 
-Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma$ and $sigma_B$, respectively.
+Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma_\epsilon$ and $\sigma_\delta$, respectively.
+
+#### 'iid' assumption for error terms
+
+In these models, the error terms, $\epsilon$, are assumed to be "iid", that is, independently and identically distributed. This means they have constant variance and each individual error term is independent of the others.
 
 ![Split Plot CRD Design](images/split_plot_CRD-01.jpeg){fig-align="center" width="341"}
 
 ![Split Plot RCBD Design](images/Split_plot_RCBD.jpeg){fig-align="center" width="348"}
-## Analysis Examples
+
+### Analysis Examples
 
 ### Example model for CRD Split Plot Designs
 
@@ -51,7 +56,6 @@ library(ggplot2)
 library(emmeans)
 library(lme4)
 library(multcompView)
-install.packages("performance")
 library(performance)
 ```
 
@@ -64,7 +68,6 @@ height_data <- read_excel(here::here("data", "height_data.xlsx"))
 table(height_data$time, height_data$manage)
 
 str(height_data)
-
 ```
 
 3.  Explore data
@@ -83,22 +86,18 @@ data(gomez.splitplot.subsample.txt)
 
 The statistical model structure for split plot design:
 
-$$y_{ijk} = \mu + \rho_i +  \alpha_j + \beta_k + (\alpha_j\beta_k) + \epsilon_{ijk}$$ 
+$$y_{ijk} = \mu + \gamma_i +  \alpha_j + \beta_k + (\alpha_j\beta_k) + \epsilon_{ijk}$$
 
 Where:
 
-$\mu$ = overall experimental mean, $\rho$ = block/rep effect (random), $\alpha$ = main effect of whole plot (fixed), $\beta$ = main effect of subplot (fixed), $\alpha$$\beta$ = interaction between factors A and B, $\epsilon$ = error.
+$\mu$ = overall experimental mean, $\gamma$ = block/rep effect (random), $\alpha$ = main effect of whole plot (fixed), $\beta$ = main effect of subplot (fixed), $\alpha\beta$ = interaction between factors A and B, $\epsilon$ = error.
 
 $$ \epsilon \sim N(0, \sigma)$$
 
-$$ \rho \sim N(0, \sigma_b)$$
+$$ \gamma \sim N(0, \sigma_b)$$
 
 Both the overall error and the rep effects are assumed to be normally distributed with a mean of zero and standard deviations of $\sigma$ and $\sigma_b$, respectively.
 
-#### 'iid' assumption for error terms
-
-In this model, the error terms, $\epsilon$ are assumed to be "iid", that is, independently and identically distributed. This means they have constant variance and they each individual error term is independent from the others.
-
 ```{r}
 model1<-lmer(height ~ time*manage + (1|rep/time), data=height_data)
 
@@ -189,9 +188,15 @@ Post-Hoc analysis
 emm <- emmeans(model2, ~ variety * nutrient); comparison <- cld(emm, Letters = LETTERS, reversed = TRUE); comparison
 ```
 
+::: callout-note
+## `na.action = na.exclude`
+
+The RCBD split-plot design is also commonly called the split-block design or the strip-plot design.
+:::
+
 ### Split-split plot design in R
 
-In this example, we have a yield data of 3 different rice varieties grown under 3 management practices and 5 Nitrogen levels. In this spliy-split design:
+In this example, we have yield data for 3 different rice varieties grown under 3 management practices and 5 nitrogen levels. In this split-split design:
 
 blocks = block (3 blocks),
 
@@ -203,6 +208,8 @@ sub-subplot = variety (3 levels)
 
 Statistical model:
 
+**add model equation here**
+
 Here, we are extracting the rice yield data from `agricolae` package.
 
 ```{r}
@@ -226,16 +233,7 @@ rice$management<-factor(rice$management)
 rice$variety<-factor(rice$variety)
 ```
 
-Statistical model
-
-```{r}
-library(ggplot2)
-library(emmeans)
-library(lme4)
-library(multcompView)
-install.packages("performance")
-library(performance)
-```
+Statistical model:
 
 Here is the basic split-split plot analysis. We can use the nesting notation in the random part because nitrogen and management are nested in blocks. We can treat blocks as fixed or random.
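The nesting notation mentioned in the last context line can be sketched as follows. This is an illustration under stated assumptions, not the chapter's own code: the data are simulated rather than taken from `agricolae`, the data frame `sim` and all effect sizes are hypothetical, and nitrogen is assumed to be the whole-plot factor with management as the subplot factor and blocks treated as random.

```r
library(lme4)

set.seed(1)  # simulated data only; not the rice data set used in the chapter

# 3 blocks x 5 nitrogen levels x 3 management practices x 3 varieties
sim <- expand.grid(
  block      = factor(1:3),
  nitrogen   = factor(1:5),
  management = factor(1:3),
  variety    = factor(1:3)
)

# Identify whole plots (block x nitrogen) and subplots (block x nitrogen x management)
whole_plot <- interaction(sim$block, sim$nitrogen, drop = TRUE)
sub_plot   <- interaction(sim$block, sim$nitrogen, sim$management, drop = TRUE)

# Hypothetical yield: one random shift per block, whole plot, and subplot, plus noise
sim$yield <- 5 +
  rnorm(nlevels(sim$block), sd = 0.5)[as.integer(sim$block)] +
  rnorm(nlevels(whole_plot), sd = 0.4)[as.integer(whole_plot)] +
  rnorm(nlevels(sub_plot), sd = 0.3)[as.integer(sub_plot)] +
  rnorm(nrow(sim), sd = 0.2)

# block/nitrogen/management expands to random intercepts for
# block, nitrogen within block, and management within nitrogen within block
fit <- lmer(yield ~ nitrogen * management * variety +
              (1 | block / nitrogen / management), data = sim)

# Number of levels for each random-effect grouping factor
ngrps(fit)
```

To treat blocks as fixed instead, one option is to move `block` into the fixed part of the formula and keep only the nested terms, `(1 | block:nitrogen) + (1 | block:nitrogen:management)`, in the random part.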