Commit

Sidhuharp97 committed Jan 13, 2025
1 parent 6a51ee8 commit 6e011b8
Showing 4 changed files with 149 additions and 62 deletions.
13 changes: 6 additions & 7 deletions chapters/factorial-design.qmd
@@ -32,7 +32,7 @@ library(dplyr); library(performance)
```
:::

Next, we will load the dataset named 'cochran.factorial' from the '**agridat**' package. This data comprises a yield response of beans to different levels of manure (d), nitrogen (n), phosphorus The goal of this analysis is the estimate the effect on d, n, p, k, and their interactions on bean yield.
Next, we will load the dataset named 'cochran.factorial' from the '**agridat**' package. This data comprises a yield response of beans to different levels of manure (d), nitrogen (n), phosphorus (p), and potassium (k). The goal of this analysis is to estimate the effect of d, n, p, k, and their interactions on bean yield.

Note that while importing the data, d, n, p, and k were converted into factor variables using the `mutate()` function from the dplyr package. This avoids the extra steps of converting each variable to a factor manually.
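As a minimal sketch of that conversion step, the snippet below converts several columns to factors in a single `mutate()` call. The data frame `toy` and its values are hypothetical stand-ins for the real dataset, and the use of `across()` is an assumption about how the conversion could be written, not necessarily how the chapter does it.

```r
library(dplyr)

# Hypothetical toy data standing in for the real dataset;
# treatment levels are coded 0/1 and start out as numeric columns
toy <- data.frame(
  d = c(0, 1, 0, 1),
  n = c(0, 0, 1, 1),
  p = c(1, 0, 1, 0),
  k = c(0, 1, 1, 0),
  yield = c(45.2, 50.1, 48.7, 52.3)
)

# Convert all four treatment columns to factors in one step
toy <- mutate(toy, across(c(d, n, p, k), as.factor))

# The treatment columns are now factors; yield stays numeric
str(toy)
```

Listing the columns once inside `across()` keeps the conversion in one place, which is convenient when more treatment factors are added later.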

@@ -62,7 +62,7 @@ The objective of this example is evaluate the individual and interactive effect

### Data Integrity Checks

Verify the class of variables, where rep, block, d, n, p, and k are supposed to be a factor/character and yield should be numeric/integer.
The first step is to verify the class of variables, where rep, block, d, n, p, and k are supposed to be a factor/character and yield should be numeric/integer.

```{r}
str(data1)
@@ -91,8 +91,7 @@ hist(data1$yield, main = "", xlab = "yield", cex.lab = 1.8, cex.axis = 1.5)
```{r, eval=FALSE}
hist(data1$yield)
```

The range is roughly falling into the expected range. I didn't observe any extreme observations (too high/low), indicating no issues with data. don't see
No extreme (low or high) yield values were observed in the data.
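The histogram gives a visual check for extreme values. As a complementary numeric check (a swapped-in technique, not the one used in this chapter), the sketch below flags observations outside Tukey's 1.5 x IQR fences with base R's `boxplot.stats()`. The vector `toy_yield` is hypothetical and includes one deliberately extreme value.

```r
# Hypothetical yield values; the last one is deliberately extreme
toy_yield <- c(10, 12, 11, 13, 12, 11, 40)

# boxplot.stats() returns the points lying beyond the 1.5 x IQR whiskers in $out
extreme_values <- boxplot.stats(toy_yield)$out

# Basic numeric summary to read alongside the histogram
summary(toy_yield)
extreme_values
```

An empty `extreme_values` vector would agree with a histogram that shows no isolated bars far from the bulk of the data.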

### Model fitting

@@ -125,7 +124,7 @@ tidy(model2_lme)

:::: column-margin
::: callout-note
Instead of `summary()` function, we used `tidy()` function from 'broom.mixed' package to get a short summary output of the model.
Instead of `summary()` function, we used `tidy()` function from the 'broom.mixed' package to get a short summary output of the model.
:::
::::

@@ -163,7 +162,7 @@ anova(model2_lme, type = "marginal")
```
:::

Let’s find estimates for some of the factors such as n, p, and n:k interaction. We will try the random intercept model first.
Let’s find estimates for some of the factors, such as n, p, and the n:k interaction effect. This will help us look at the combined effect of n and k on bean yield.

::: panel-tabset
### lme4
@@ -181,6 +180,6 @@ emmeans(model2_lme, specs = ~ n:k)
```
:::
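To make the idea of an n:k interaction concrete, here is a small base R sketch that computes raw cell means and an interaction contrast by hand. The data frame `toy` is hypothetical, and these are plain cell averages, not the model-based estimates that `emmeans()` returns from a fitted mixed model.

```r
# Hypothetical balanced 2 x 2 layout with two observations per cell
toy <- data.frame(
  n = factor(rep(c("0", "1"), each = 4)),
  k = factor(rep(rep(c("0", "1"), each = 2), times = 2)),
  yield = c(10, 12, 14, 16, 20, 22, 30, 32)
)

# Mean yield for every n-by-k combination (rows = n, columns = k)
cell_means <- tapply(toy$yield, list(n = toy$n, k = toy$k), mean)

# Effect of k at each level of n
effect_of_k_at_n0 <- cell_means["0", "1"] - cell_means["0", "0"]
effect_of_k_at_n1 <- cell_means["1", "1"] - cell_means["1", "0"]

# A non-zero difference between the two effects indicates an n:k interaction
interaction_contrast <- effect_of_k_at_n1 - effect_of_k_at_n0

cell_means
interaction_contrast
```

In this toy data the effect of k is larger when n is applied, which is the kind of pattern the `~ n:k` specification is meant to reveal.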

2. Unbalanced factorial design
In summary, when working with factorial designs, make sure to carefully interpret the ANOVA and estimated marginal means for main and interaction effects.


46 changes: 42 additions & 4 deletions chapters/split-plot-design.qmd
@@ -4,7 +4,7 @@
source(here::here("settings.r"))
```

Split-plot design is frequently used for factorial experiments. Such design may incorporate one or more of the completely randomized (CRD), completely randomized block (RCBD), and Latin square designs. The main principle is that there are whole plots or whole units, to which the levels of one or more factors are applied. Thus each whole plot becomes a block for the subplot treatments.
Split-plot design is frequently used for factorial experiments. Such a design may incorporate a completely randomized design (CRD) or a randomized complete block design (RCBD). The main principle is that there are whole plots or whole units, to which the levels of one or more factors are applied. Thus each whole plot becomes a block for the subplot treatments.
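The two-stage randomization behind this principle can be sketched in base R. The example below assumes an RCBD-style layout in which whole-plot levels are randomized within each replicate and subplot levels are randomized within each whole plot. All factor names and levels (`whole_levels`, `sub_levels`, `reps`) are hypothetical.

```r
set.seed(42)  # arbitrary seed, used only to make this sketch reproducible

whole_levels <- c("A1", "A2", "A3")  # hypothetical whole-plot factor levels
sub_levels   <- c("B1", "B2")        # hypothetical subplot factor levels
reps         <- 1:2                  # hypothetical number of replicates

layout <- do.call(rbind, lapply(reps, function(r) {
  # Stage 1: randomize whole-plot levels across the whole plots of this replicate
  wp_order <- sample(whole_levels)
  do.call(rbind, lapply(seq_along(wp_order), function(i) {
    # Stage 2: randomize subplot levels separately inside each whole plot
    data.frame(
      rep        = r,
      whole_plot = i,
      whole_trt  = wp_order[i],
      sub_plot   = seq_along(sub_levels),
      sub_trt    = sample(sub_levels)
    )
  }))
}))

layout
```

Each whole plot receives exactly one whole-plot level and every subplot level once, which is why the whole plot acts as a block for the subplot treatments.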

## Details for Split Plot Designs

@@ -126,7 +126,9 @@ The levels of whole plots and subplots are balanced.
str(height_data)
```

The 'time', 'manage', and 'rep' are character and variable height is numeric. The structure of the data is in format as needed. - Check the number of missing values in each column.
The 'time', 'manage', and 'rep' variables are character and 'height' is numeric. The data structure is in the required format.

- Check the number of missing values in each column.

```{r}
apply(height_data, 2, function(x) sum(is.na(x)))
@@ -139,6 +141,21 @@ ggplot(data = height_data, aes(y = height, x = time)) +
geom_boxplot(aes(fill = manage), alpha = 0.6)
```

Last, check the dependent variable by plotting a histogram of height data.
```{r, echo=FALSE}
#| label: fig-split_hist
#| fig-cap: "Histogram of the dependent variable."
#| column: margin
par(mar=c(5.1, 5, 2.1, 2.1))
hist(height_data$height, main = "", xlab = "height", cex.lab = 1.8, cex.axis = 1.5)
```

```{r, eval=FALSE}
hist(height_data$height, main = "", xlab = "height")
```

The distribution of height data looks close to normal.

#### Model building

::: column-margin
@@ -250,7 +267,7 @@ pairs(m2)
::: callout-note
## `pairs()`

The default p-value adjustment in `pairs()` function is "tukey", other options include “holm”, “hochberg”, “BH”, “BY”, and “none”. In addition, it's okay to use this function when independent variable has few factors (2-4). For variable with multiple levels, it's better to use custom contrasts. For more information on custom contrasts **please check this link**.
The default p-value adjustment in the `pairs()` function is "tukey"; other options include “holm”, “hochberg”, “BH”, “BY”, and “none”. In addition, this function works well when the independent variable has few levels (2-4). For a variable with many levels, it's better to use custom contrasts. For more information on custom contrasts, please visit [**Chapter 12**](means-and-contrasts.qmd).
:::
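To get a feel for how the non-Tukey adjustments differ, the sketch below applies them to a hypothetical set of unadjusted p-values using base R's `p.adjust()`. This is only an illustration of the adjustment methods themselves; the Tukey adjustment is not available in `p.adjust()`, and the p-values here do not come from the models in this chapter.

```r
# Hypothetical unadjusted p-values from three pairwise comparisons
p_raw <- c(0.01, 0.02, 0.03)

# Adjustment methods named in the note above (all except "tukey")
methods <- c("holm", "hochberg", "BH", "BY", "none")

# One column per method, one row per comparison
adjusted <- sapply(methods, function(m) p.adjust(p_raw, method = m))

round(adjusted, 3)
```

With emmeans, the method is selected through the `adjust` argument, for example `pairs(m2, adjust = "holm")`.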

### Example model for RCBD Split Plot Designs
@@ -290,6 +307,27 @@ Next, run the table() command to verify the levels of main-plots and sub-plots.
table(oats$V, oats$N)
```

- Check the number of missing values in each column.

```{r}
apply(oats, 2, function(x) sum(is.na(x)))
```
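An equivalent way to count missing values, shown here as a sketch on a hypothetical data frame `toy`, is `colSums(is.na(...))`. It works directly on the logical matrix returned by `is.na()` and avoids the anonymous function.

```r
# Hypothetical data with a few missing values
toy <- data.frame(
  B = c("I", "I", "II", "II"),
  V = c("Victory", "Marvellous", NA, "Victory"),
  Y = c(111, NA, 117, NA)
)

# Number of missing values per column
na_counts <- colSums(is.na(toy))
na_counts
```

Both approaches return a named vector with one count per column, so either can be used for this check.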

Last, check the dependent variable by plotting a histogram of yield data.
```{r, echo=FALSE}
#| label: fig-split-rcbd_hist
#| fig-cap: "Histogram of the dependent variable."
#| column: margin
par(mar=c(5.1, 5, 2.1, 2.1))
hist(oats$Y, main = "", xlab = "yield", cex.lab = 1.8, cex.axis = 1.5)
```

```{r, eval=FALSE}
hist(oats$Y, main = "", xlab = "yield")
```



#### Model Building

We are evaluating the effect of V, N and their interaction on yield. The `1|B/V` implies that random intercepts vary with block and V within each block.
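The nesting shorthand can be unpacked with a short base R sketch. In R formula syntax, `B/V` expands to `B + B:V`, so the model has one random intercept per block and one per variety-within-block combination (the whole plot). The design grid below is hypothetical; its level names and counts are assumptions made for illustration.

```r
# Hypothetical design grid: 6 blocks, 3 varieties, 4 nitrogen levels
design <- expand.grid(
  B = factor(paste0("B", 1:6)),
  V = factor(c("V1", "V2", "V3")),
  N = factor(c("N1", "N2", "N3", "N4"))
)

# R expands the nesting operator: B/V is shorthand for B + B:V
nesting_terms <- attr(terms(~ B / V), "term.labels")
nesting_terms

# Each block-by-variety combination is one whole plot
whole_plot <- interaction(design$B, design$V, drop = TRUE)

nlevels(design$B)    # number of block-level random intercepts
nlevels(whole_plot)  # number of whole-plot-level random intercepts
table(whole_plot)    # subplots (N levels) observed in each whole plot
```

Seeing the whole plot as its own grouping factor makes it clearer why V is tested against whole-plot variation while N is tested against subplot variation.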
Expand Down Expand Up @@ -377,4 +415,4 @@ emm1
```
:::

In the next chapter we will continue with extension of split plot design called split-split plot design.
In the next chapter, we will continue with an extension of the split plot design called the split-split plot design.