Chapter 8 Split-Plot Designs

In this chapter we are going to learn something about experimental designs that contain experimental units of different “size”.

8.1 Introduction

We start with a fictional example, similar to the example in Section 16.1 of Oehlert (2010). Farmer John has eight different plots of land. He randomizes and applies two fertilization “schemes” (“control” and “new”) to the eight plots. In addition, each plot is divided into four subplots. In each plot, four different strawberry varieties are randomized to the subplots. John is interested in the effect of fertilization scheme and strawberry variety on fruit mass. Per subplot, he records the fruit mass after a certain amount of time. This means we have a total of \(8 \cdot 4 = 32\) observations (1 per subplot).

First we read the data.

john <- read.table("http://stat.ethz.ch/~meier/teaching/data/john.dat", header = TRUE)
john[, "plot"] <- factor(john[, "plot"])
str(john)
## 'data.frame':    32 obs. of  4 variables:
##  $ plot      : Factor w/ 8 levels "1","2","3","4",..: 7 7 7 7 5 5 5 5 6 6 ...
##  $ fertilizer: chr  "control" "control" "control" "control" ...
##  $ variety   : chr  "A" "B" "C" "D" ...
##  $ mass      : num  11.6 7.7 12 14 8.9 9.5 11.7 15 10.8 11 ...

Besides the information about the two treatment variables (fertilizer and variety) and the response mass we also have information about the corresponding plot of land in the variable plot. Note that we could not recover the randomization procedure from the data-frame alone. We really need the “story” from above.

We can (for example) visualize the data with an interaction plot.

with(john, interaction.plot(x.factor = variety, trace.factor = fertilizer, 
                            response = mass))

We observe that mass is larger on average with the “new” fertilization scheme. In addition, there might be a small variety effect. The interaction doesn’t seem to be very pronounced.

How can we model such data? To set up a correct model we have to follow the randomization procedure that was applied. There were two randomizations involved here:

  • fertilization schemes were randomized and applied to plots of land
  • strawberry varieties were randomized and applied to subplots.

Hence, an experimental unit for fertilizer is given by a plot of land, while for strawberry variety, the experimental unit is given by a subplot.

This design is an example of a split-plot design. Fertilizer is the so-called whole-plot factor and strawberry variety the split-plot factor. A whole-plot is given by a plot of land and a split-plot by a subplot of land.

As we have two different sizes of experimental units, we also need two error terms to model the corresponding experimental errors. We need one error term “acting” on the plot level and another one on the subplot level. Let \(Y_{ijk}\) be the mass of the \(k\)th replicate of a plot with fertilization scheme \(i\) and strawberry variety \(j\). We use the model \[ Y_{ijk} = \mu + \alpha_i + \eta_{k(i)} + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}, \] where \(\alpha_i\) is the fixed effect of fertilization scheme, \(\beta_j\) is the fixed effect of strawberry variety and \((\alpha\beta)_{ij}\) is the corresponding interaction term. We call \(\eta_{k(i)}\) the whole-plot error. The nesting notation ensures that we get a whole-plot error per plot of land (you can also think that the combination of fertilization scheme and replicate number identifies a plot). The mathematical notation is a bit cumbersome here, the model specification in R will be easier. At the end we have \(\epsilon_{ijk}\) which is the so-called split-plot error (the “usual” error). We have the assumptions \[ \eta_{k(i)} \, \textrm{ i.i.d.} \sim N(0, \sigma_{\eta}^2), \quad \epsilon_{ijk} \, \textrm{ i.i.d.} \sim N(0, \sigma^2). \] The part \[ \alpha_i + \eta_{k(i)} \] of the model formula can be thought of as the “reaction” of an individual plot on the \(i\)th fertilization scheme. All plot specific properties are included in the whole-plot error \(\eta_{k(i)}\). The fact that all subplots on the same plot share the same whole-plot error has the side-effect that observations from the same plot are modelled as correlated data. The following part \[ \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk} \] is the “reaction” of the subplot on the \(j\)th variety (including a potential interaction with the \(i\)th fertilization scheme). All subplot specific properties can now be found in the split-plot error \(\epsilon_{ijk}\).

If we only consider fertilization scheme, we do a completely randomized design here (with plots as experimental units). The first part of the model formula is actually the corresponding model equation. On the other side, if we only consider variety, we could treat the plots as blocks and would have a randomized complete block design on this “level” including an interaction term (this is what we see in the second part of the model formula).

To fit such a model in R we use a mixed model approach. The whole-plot error (acting on plots) can easily be incorporated with (1 | plot). The split-plot error (acting on the subplot level) is automatically included as it is on the level of individual observations. Hence, we end up with the following function call.

library(lmerTest)
fit <- lmer(mass ~ fertilizer * variety + (1 | plot), data = john)

Let us now have a look at the \(F\)-tests.

anova(fit)
## Type III Analysis of Variance Table with Satterthwaite's method
##                     Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## fertilizer         137.413 137.413     1     6 68.2395 0.0001702 ***
## variety             96.431  32.144     3    18 15.9627 2.594e-05 ***
## fertilizer:variety   4.173   1.391     3    18  0.6907 0.5695061    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The interaction is not significant while the two main effects are. Let us have a look at the denominator degrees of freedom for a better understanding. We see that for the test of fertilizer we only have six degrees of freedom. Why? Because as described above, we basically performed a completely randomized design with eight experimental units (the eight plots of land) and a treatment factor having two levels (“control” and “new”). Hence, the error has \(8 - 1 - 1 = 6\) degrees of freedom. Alternatively, we could also argue that we should check whether the variation between different fertilizer schemes is larger than the variation between plots getting the same fertilizer scheme. The latter is given by the whole-plot error having \(2 \cdot (4 - 1) = 6\) degrees of freedom.

8.2 Properties of Split-Plot Designs

Why should we do split-plot designs? Typically, split-plot designs are suitable for situations where one of the factors can only be varied on a “large” scale. E.g., fertilizer or irrigation on (large) plots of land. While “large” was literally large in the previous example, this is not always the case. Let us consider an example with a machine running under different settings using different source material. While it is easy to change the source material it is much more tedious to change the machine settings. Hence, we don’t want to change the machine settings too often. We could think of an experimental design where we change the machine setting and keep using the same setting for different source materials. This means we are not completely randomizing machine setting and source material. This would be another example of a split-plot design where machine settings is the whole-plot factor and source material is the split-plot factor.

The price that we pay for this “lazyness” on the whole-plot level is less precision (or less power) because we have much less observations on this level. We did not apply the (whole-plot) treatment very often, therefore we cannot expect our results to be very precise. This can also be observed in the ANOVA table above. The denominator degrees of freedom of fertilizer (the whole-plot factor) are only 6.

Split-plot designs can be found quite often in practice. Identifying a split-plot needs some experience. Often, a split-plot was not designed on purpose and hence the analysis does not take into account the special design structure (and is therefore wrong).

Split-plot designs can of course arise in much more complex situations. We could for example extend the original design in the sense that we do a randomized complete block design or a two-way factorial on the whole-plot level. If we have more than two factors we could also do a so-called split-split plot design having one additional “layer”, meaning that we would have three “sizes” of experimental units: whole plots, split plots and split-split plots. To specify the correct model we “simply” have to follow the randomization process. For every “size” of experimental unit we use a random effect as error term to model the experimental error. From a technical point of view it is often helpful to define “helper variables” which define the corresponding experimental units. E.g., we could have an extra variable PlotID which “enumerates” the different plots (as was the case with the original example).

Bibliography

Oehlert, Gary. 2010. A First Course in Design and Analysis of Experiments. http://users.stat.umn.edu/~gary/book/fcdae.pdf.