Chapter 6 Block Designs

Quite often we already know that experimental units are not homogeneous. Using a completely randomized design in such a situation would still be a valid procedure. However, making explicit use of the special “structure” of the experimental units typically helps reduce variance (“getting a more precise picture”). In your introductory course you have learned how to apply the paired \(t\)-test. It is used in situations where two treatments are applied to the same “object” or “subject”. Think for example of applying two treatments (in parallel) to human beings. We know that people can be (very) different. Because we apply both treatments to the same subject, we get a “clear picture” within every subject (the difference between the two treatments). By taking the difference, the person-to-person variation automatically disappears. We also say that we “block” on persons.
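The following sketch illustrates the point about differences with simulated data. All numbers (10 subjects, a true shift of 1.5, the standard deviations) are made up for illustration and are not taken from any real study. It shows that the paired \(t\)-test is a one-sample \(t\)-test on the within-subject differences, and that the same analysis can be written as a block design with two treatments.

```r
## Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 10                                          # number of subjects (blocks)
subject_level <- rnorm(n, mean = 50, sd = 10)    # large person-to-person variation
y_A <- subject_level + rnorm(n, sd = 1)          # response under treatment A
y_B <- subject_level + 1.5 + rnorm(n, sd = 1)    # treatment B, true shift of 1.5

## Paired t-test and one-sample t-test on the differences give the same result
t_paired <- t.test(y_B, y_A, paired = TRUE)
t_diff   <- t.test(y_B - y_A)

## Ignoring the pairing leaves the subject variation in the error term
t_unpaired <- t.test(y_B, y_A, var.equal = TRUE)

## The same data analyzed as a block design with g = 2 treatments
d <- data.frame(y       = c(y_A, y_B),
                treat   = factor(rep(c("A", "B"), each = n)),
                subject = factor(rep(1:n, times = 2)))
tab <- anova(lm(y ~ subject + treat, data = d))

## The F value of treat equals the squared paired t statistic
c(t_squared = unname(t_paired$statistic)^2,
  F_treat   = tab["treat", "F value"])
```

Comparing `t_paired` with `t_unpaired` shows how much the subject-to-subject variation can inflate the standard error when the pairing is ignored.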

We will now extend this to the \(g > 2\) situation where \(g\) is the number of levels of our treatment factor (as in Chapter 3).

6.1 Randomized Complete Block Designs

Assume that we can divide our experimental units into \(r\) groups, also known as blocks, containing \(g\) experimental units each. Think for example of an agricultural experiment at \(r\) different locations having \(g\) different plots of land each. Hence, a block is given by a location and an experimental unit by a plot of land.

The randomized complete block design (RCBD) uses a restricted randomization scheme: Within every block (e.g., location), the treatments are randomized to the experimental units (e.g., plots of land). The design is called complete because we see the complete set of treatments within every block (we will later also learn about incomplete block designs where this is not the case anymore). Note that blocking already exists at the time of randomization (and not only at the time of the analysis).

In the most basic form, we assume that we do not have replicates within a block. This means that we only see every treatment once in each block.
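The restricted randomization described above can be sketched in a few lines of base R. The treatment labels, the number of blocks and the object names (`treatments`, `plan`) are assumptions chosen for illustration only.

```r
set.seed(42)                           # only to make the sketch reproducible
treatments <- c("A", "B", "C", "D")    # g = 4 hypothetical treatment labels
r <- 5                                 # number of blocks (e.g., locations)

## One independent random permutation of the treatments per block
plan <- do.call(rbind, lapply(seq_len(r), function(b) {
  data.frame(block = b,
             plot  = seq_along(treatments),
             treat = sample(treatments))
}))
plan$block <- factor(plan$block)

## Completeness check: every treatment appears exactly once in every block
table(plan$block, plan$treat)
```

In a completely randomized design we would instead permute all \(g \cdot r\) treatment labels at once, which does not guarantee that every block contains every treatment.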

The analysis of a randomized complete block design is straightforward. We treat the block factor as “another” factor in our model. As we have no replicates within blocks, we can “only” fit a main effects model of the form \[ Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}, \] where the \(\alpha_i\) are the treatment effects and the \(\beta_j\) are the block effects, both with the usual side-constraints. In addition, we make the usual assumptions on the error term \(\epsilon_{ij}\). According to this model, we implicitly assume that blocks only cause additive shifts, i.e., that there is no interaction between block and treatment.
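To make this model more concrete, here is a short sketch that assumes the sum-to-zero side-constraint (an assumption for illustration; other side-constraints lead to the same fitted values), with \(i = 1, \ldots, g\) for treatments and \(j = 1, \ldots, r\) for blocks. A dot in an index denotes the average over that index. Under this constraint the estimates are simple averages: \[ \sum_{i=1}^g \alpha_i = 0, \quad \sum_{j=1}^r \beta_j = 0, \quad \widehat{\mu} = \overline{Y}_{\cdot\cdot}, \quad \widehat{\alpha}_i = \overline{Y}_{i\cdot} - \overline{Y}_{\cdot\cdot}, \quad \widehat{\beta}_j = \overline{Y}_{\cdot j} - \overline{Y}_{\cdot\cdot}. \] The total sum of squares splits into a treatment, a block and an error part, and the degrees of freedom split accordingly: \[ SS_T = SS_{Trt} + SS_{Block} + SS_E, \qquad gr - 1 = (g - 1) + (r - 1) + (g - 1)(r - 1). \] For \(g = r = 4\) this gives \(15 = 3 + 3 + 9\) degrees of freedom.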

Let us now consider the hardness testing experiment from Montgomery (2012):

“For example, consider a hardness testing machine that presses a rod with a pointed tip into a metal specimen with a known force. By measuring the depth of the depression caused by the tip, the hardness of the specimen is determined. […] Suppose we wish to determine whether or not four different tips produce different readings on a hardness testing machine. The experimenter has decided to obtain four observations on Rockwell C-scale hardness for each tip. There is only one factor - tip type - and a completely randomized single-factor design would consist of randomly assigning each one of the \(4 \times 4 = 16\) runs to an experimental unit, that is, a metal coupon, and observing the hardness reading that results. Thus, 16 different metal test coupons would be required in this experiment, one for each run in the design. There is a potentially serious problem with a completely randomized experiment in this design situation. If the metal coupons differ slightly in their hardness, as might happen if they are taken from ingots that are produced in different heats, the experimental units (the coupons) will contribute to the variability observed in the hardness data. As a result, the experimental error will reflect both random error and variability between coupons. We would like to make the experimental error as small as possible; that is, we would like to remove the variability between coupons from the experimental error. A design that would accomplish this requires the experimenter to test each tip once on each of four coupons.”

This is a randomized complete block design. We now fit a main effects only model to this data in R and get the “usual” ANOVA table.

## Create data (skip if not interested) ####
tip    <- factor(rep(1:4, each = 4))
coupon <- factor(rep(1:4, times = 4))
y <- c(9.3, 9.4, 9.6, 10,
       9.4, 9.3, 9.8, 9.9,
       9.2, 9.4, 9.5, 9.7,
       9.7, 9.6, 10, 10.2)
hardness <- data.frame(y, tip, coupon)

## Analyze data ####
fit <- aov(y ~ coupon + tip, data = hardness)
summary(fit)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## coupon       3  0.825 0.27500   30.94 4.52e-05 ***
## tip          3  0.385 0.12833   14.44 0.000871 ***
## Residuals    9  0.080 0.00889                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We first focus on the p-value of tip. Clearly, we can reject the null hypothesis that there is no overall effect of tip type. Typically, we do not inspect the p-value of the block factor coupon. There is some historical debate about why we should not do this, mainly because we did not randomize the blocks and because we already knew beforehand that blocks would show an effect. However, we can do a quick check to verify whether blocking was efficient or not. We would like the block factor to explain a lot of variation. Hence, if the mean square of the block factor is much larger than the error mean square \(MS_E\), we conclude that blocking was efficient. Here, this is the case as \(0.275 \gg 0.00889\).
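The quick check can also be done in code. The sketch below rebuilds the hardness data from above, extracts the two mean squares and, for comparison only, fits the model without the block factor. The object names (`tab_rcbd`, `tab_crd`, `ms_block`, `ms_error`) are chosen here for illustration.

```r
## Hardness data as given above (Montgomery 2012)
tip    <- factor(rep(1:4, each = 4))
coupon <- factor(rep(1:4, times = 4))
y <- c(9.3, 9.4, 9.6, 10,
       9.4, 9.3, 9.8, 9.9,
       9.2, 9.4, 9.5, 9.7,
       9.7, 9.6, 10, 10.2)
hardness <- data.frame(y, tip, coupon)

## ANOVA table of the block design as a data frame
tab_rcbd <- anova(lm(y ~ coupon + tip, data = hardness))
ms_block <- tab_rcbd["coupon", "Mean Sq"]
ms_error <- tab_rcbd["Residuals", "Mean Sq"]
ms_block / ms_error    # this ratio is the F value of coupon in the table above

## For comparison only: the same data analyzed while ignoring the coupons
tab_crd <- anova(lm(y ~ tip, data = hardness))
tab_crd
```

Without the block factor, the sum of squares of coupon moves into the error term: the error mean square grows from about 0.0089 to about 0.075, and the F value of tip drops from 14.44 to about 1.7. Note that the second fit is only an illustration of what blocking buys us. As the experiment was randomized within coupons, the model with the block factor is the appropriate analysis.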

Instead of a single treatment factor we can also have a factorial treatment structure within every block. Think for example of a two-factor factorial which we would model as Y ~ Block + A * B. Here, we could actually test the interaction between A and B even if every level combination of A and B appears only once in every block. As we have multiple blocks, we have multiple observations for every level combination of A and B!
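A minimal sketch of this situation, using pure noise as a hypothetical response: factor A has 2 levels, factor B has 3 levels, and each of the 6 combinations appears exactly once in each of 4 blocks. The sizes and names are assumptions for illustration.

```r
set.seed(7)
## Every combination of A and B appears exactly once in every block
d <- expand.grid(A = factor(1:2), B = factor(1:3), Block = factor(1:4))
d$Y <- rnorm(nrow(d))    # simulated response (pure noise, illustration only)

## Same table as summary(aov(Y ~ Block + A * B, data = d))
tab <- anova(lm(Y ~ Block + A * B, data = d))

## Degrees of freedom per term; the interaction A:B is testable because
## 15 degrees of freedom remain for the error term
setNames(tab[, "Df"], rownames(tab))
```

The residual degrees of freedom are \((2 \cdot 3 - 1) \cdot (4 - 1) = 15\). They come from the interaction between the block factor and the treatment combinations, which the model assumes to be absent.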

6.2 Multiple Block Factors

We can also block on more than one factor. A special case is the so-called Latin Square Design where we have two block factors and one treatment factor having \(g\) levels each (yes, all!). This is very restrictive. Consider the following layout where we have a block factor with levels \(R_1\) to \(R_4\) (“rows”), another block factor with levels \(C_1\) to \(C_4\) (“columns”) and a treatment factor with levels \(A\) to \(D\).

In a Latin Square Design each treatment (the Latin letters) appears exactly once in each row and once in each column. We also call this a row-column design.

          \(C_1\)   \(C_2\)   \(C_3\)   \(C_4\)
\(R_1\)   \(A\)     \(B\)     \(C\)     \(D\)
\(R_2\)   \(B\)     \(C\)     \(D\)     \(A\)
\(R_3\)   \(C\)     \(D\)     \(A\)     \(B\)
\(R_4\)   \(D\)     \(A\)     \(B\)     \(C\)

We can create a random Latin Square Design in R, for example, with the function design.lsd of the add-on package agricolae (de Mendiburu 2020). One such randomly generated layout is:

##      [,1] [,2] [,3] [,4]
## [1,] "A"  "C"  "B"  "D" 
## [2,] "C"  "A"  "D"  "B" 
## [3,] "D"  "B"  "A"  "C" 
## [4,] "B"  "D"  "C"  "A"

A Latin Square blocks on both rows and columns simultaneously. We can use the model \[ Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \epsilon_{ijk} \] to analyze data from a Latin Square Design. Here, the \(\alpha_i\) are the treatment effects and the \(\beta_j\) and \(\gamma_k\) are the effects of the two block factors, all with the usual side-constraints.

The design is balanced, which means that our usual estimators and sums of squares still “work”. In R we would use the model formula Y ~ Block1 + Block2 + Treat.
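As a sketch of such an analysis, the following code builds a \(4 \times 4\) Latin square (the cyclic layout, without randomization, to keep the example short), attaches a simulated response consisting of pure noise, and fits the model. The data are hypothetical and the column names follow the model formula above.

```r
set.seed(99)
g   <- 4
idx <- outer(seq_len(g), seq_len(g), function(i, j) (i + j - 2) %% g + 1)

## One row per cell of the square: row block, column block, treatment
d <- data.frame(Block1 = factor(as.vector(row(idx))),
                Block2 = factor(as.vector(col(idx))),
                Treat  = factor(LETTERS[as.vector(idx)]))
d$Y <- rnorm(nrow(d))    # simulated response (pure noise, illustration only)

tab <- anova(lm(Y ~ Block1 + Block2 + Treat, data = d))
setNames(tab[, "Df"], rownames(tab))    # 3, 3, 3 and 6 for the error term

## Balance: the treatment sum of squares does not depend on the term order
tab_reordered <- anova(lm(Y ~ Treat + Block1 + Block2, data = d))
all.equal(tab["Treat", "Sum Sq"], tab_reordered["Treat", "Sum Sq"])
```

With \(g = 4\) only \((g - 1)(g - 2) = 6\) degrees of freedom are left for the error term, which is one reason why small Latin squares yield rather imprecise variance estimates.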


de Mendiburu, Felipe. 2020. Agricolae: Statistical Procedures for Agricultural Research.

Montgomery, D. C. 2012. Design and Analysis of Experiments. John Wiley & Sons, Incorporated.