Chapter 9 Incomplete Block Designs

9.1 Introduction

In this chapter we are going to learn more about block designs. The block designs we have seen so far were complete, meaning that every block contained all treatments. In practice, this is not always possible. For example, the physical size of a block might be too small. There are also situations where it is not advisable to have too many treatments in each block. In a food tasting experiment, we typically want to restrict the number of different “recipes” (treatments) shown to an individual rater. Otherwise, the responses become unreliable (the rater might get tired or bored after some time).

If we cannot “afford” to do a complete design, we get a so-called incomplete block design (IBD). We then have to decide which subset of treatments we use in each block. If we do not make a wise decision, the design can be “destroyed” in the sense that certain quantities are no longer estimable. This would be a worst-case scenario.

Consider the following design with treatments \(A\) to \(F\) and blocks \(1\) to \(6\) (each column corresponds to a block).

Block    1       2       3       4       5       6
         \(A\)   \(A\)   \(B\)   \(D\)   \(D\)   \(E\)
         \(B\)   \(C\)   \(C\)   \(E\)   \(F\)   \(F\)

For example, in block 5 we apply the two treatments \(D\) and \(F\). First, we create a fictional data set.

set.seed(25) ## for reproducibility
dat <- data.frame(block = factor(rep(1:6, times = 2)),
                  treat = factor(c("A", "A", "B", "D", "D", "E",
                                   "B", "C", "C", "E", "F", "F")),
                  y     = rnorm(12)) ## no signal, just noise
str(dat)
## 'data.frame':    12 obs. of  3 variables:
##  $ block: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6 1 2 3 4 ...
##  $ treat: Factor w/ 6 levels "A","B","C","D",..: 1 1 2 4 4 5 2 3 3 5 ...
##  $ y    : num  0.1217 0.2135 -0.0477 0.5461 -1.5689 ...

We can get an overview of the treatment-block combinations used with xtabs.

xtabs(~ treat + block, data = dat)
##      block
## treat 1 2 3 4 5 6
##     A 1 1 0 0 0 0
##     B 1 0 1 0 0 0
##     C 0 1 1 0 0 0
##     D 0 0 0 1 1 0
##     E 0 0 0 1 0 1
##     F 0 0 0 0 1 1

What happens if we fit our standard main effects model here?

options(contrasts = c("contr.treatment", "contr.poly")) ## make sure treatment contrasts are used
fit <- aov(y ~ block + treat, data = dat)
dummy.coef(fit)
## Full coefficients are 
##                                                                       
## (Intercept):      0.307025                                            
## block:                   1          2          3          4          5
##                  0.0000000 -0.2788532 -1.1683252  1.8724763  0.2491680
## treat:                   A          B          C          D          E
##                  0.0000000  0.9989451  1.6586102 -1.8792448 -1.8273439
##                           
## (Intercept):              
## block:                   6
##                  2.2892593
## treat:                   F
##                  0.0000000

At first sight, it looks like we were able to fit the specified model. However, it is strange that two of the treatment coefficients are set to zero, and not just one as is usual with contr.treatment.

What happens if we want to investigate a certain contrast, say treatment \(A\) vs. \(F\)?

library(multcomp)
summary(glht(fit, linfct = mcp(treat = c(1, 0, 0, 0, 0, -1))))
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'summary': some linear functions are not estimable

We get an error message! The reason is that we cannot estimate the difference between treatments \(A\) and \(F\) with this data set (at least not with a fixed effects model). The problem lies in the “structure” of the incomplete design: it is an example of a so-called disconnected design. We have the treatment set \(\{A, B, C\}\) in the first three blocks and \(\{D, E, F\}\) in the last three blocks (check the output of xtabs above). As the two sets are disjoint, we know nothing about any difference involving treatments from both sets, because we never observe such a pair together in the same block. This also explains the strange output of dummy.coef above: because of this dependency between treatment and block, we cannot distinguish block and treatment effects anymore (confounding!). Here, R automatically drops one of the treatment levels from the model (which can be dangerous if you do not inspect the output carefully).

Hence, using such a disconnected design was obviously not a good idea from a design of experiments point of view, as we are no longer able to estimate differences involving treatments from the two disjoint sets. Intuitively, we should have a good “mix” of treatments in every block.
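
We can also check connectedness directly from the data. The following is only a minimal sketch using base R (it assumes the data frame dat from above and starts a simple reachability search at treatment \(A\); the variable names are ours for illustration).

inc <- unclass(xtabs(~ treat + block, data = dat)) ## treatment-by-block incidence matrix
conc <- inc %*% t(inc)                             ## concurrence matrix of the treatments
adj <- conc > 0                                    ## TRUE if two treatments share a block
reach <- adj[1, ]                                  ## treatments reachable from A in one step
repeat {                                           ## expand until nothing new is reachable
  new <- colSums(adj[reach, , drop = FALSE]) > 0
  if (all(new == reach)) break
  reach <- new
}
all(reach) ## TRUE for a connected design; FALSE here, i.e., disconnected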

9.2 Balanced Incomplete Block Designs

We could simply randomize subsets of treatments to the different blocks. This works well if we have enough blocks. If we only have a small number of blocks, however, we might be unlucky and end up with a disconnected design.

However, we can also try to fulfill some optimality criterion. One candidate would be the criterion that we can estimate all treatment differences with the same precision. For example, this would mean that all confidence intervals for \(\alpha_i - \alpha_j\) have the same width (for any pair \(i,j\)). The design which fulfills this criterion is the so-called balanced incomplete block design.

A balanced incomplete block design (BIBD) is an incomplete block design where all pairs of treatments occur together in the same block equally often (\(= \lambda\)).

We use the following notation:

  • \(g\): number of treatments (“number of different chocolate chip cookie brands”)
  • \(b\): number of blocks (“number of raters”)
  • \(k\): number of units per block (\(k < g\)) (“number of cookie brands each rater gets to see”)
  • \(r\): number of replicates per treatment (“how often do we see a certain cookie brand across all raters?”)
  • \(N\): total number of units

We have \(N = b \cdot k = g \cdot r\).

For every setting with \(k < g\) we can find a BIBD by taking all possible subsets of size \(k\), of which there are \({g \choose k}\) (the binomial coefficient, implemented in the function choose in R). The corresponding design is called an unreduced balanced incomplete block design. For example, if we have \(g = 6\) treatments and \(k = 3\) units per block, we get \({6 \choose 3} = 20\) blocks. In R we can easily list them with the function combn.

combn(x = 6, m = 3)
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,]    1    1    1    1    1    1    1    1    1     1     2     2     2     2
## [2,]    2    2    2    2    3    3    3    4    4     5     3     3     3     4
## [3,]    3    4    5    6    4    5    6    5    6     6     4     5     6     5
##      [,15] [,16] [,17] [,18] [,19] [,20]
## [1,]     2     2     3     3     3     4
## [2,]     4     5     4     4     5     5
## [3,]     6     6     5     6     6     6

Every column in the matrix above defines a block; e.g., block 10 would get treatments 1, 5 and 6. Typically, we would randomize the order or the placement of the treatments within a block (as we should always do).
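
A minimal sketch of such a randomization (as an additional assumption, we also shuffle which treatment subset goes to which block):

set.seed(1) ## for reproducibility
des <- combn(x = 6, m = 3)
des <- des[, sample(ncol(des))] ## randomly assign the 20 subsets to the 20 blocks
des <- apply(des, 2, sample)    ## randomize the order within each block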

In practice we cannot always afford to do an unreduced BIBD, as the required number of blocks might be too large. Whether a BIBD exists for a certain desired setting of number of blocks \(b\), block size \(k\) and number of treatments \(g\) is a combinatorial problem. A necessary (but not sufficient) condition for a BIBD to exist is that \[ \lambda = \frac{r \cdot (k - 1)}{g - 1} \] is a whole number. What is the intuition behind this formula? A treatment appears in a total of \(r\) different blocks. In each of these blocks, there are \(k - 1\) other units available, leading to a total of \(r \cdot (k - 1)\) available “units”. These must be divided equally among the \(g - 1\) other treatments, each of which has to appear together with the given treatment \(\lambda\) times. Hence, it must hold that \(r \cdot (k - 1) = \lambda \cdot (g - 1)\).
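
As a quick numerical sketch of these conditions, consider the setting \(g = 6\), \(b = 10\) and \(k = 3\) (the setting we will use with bibd below):

g <- 6; b <- 10; k <- 3
N <- b * k                       ## total number of units: 30
r <- N / g                       ## replicates per treatment: 5, a whole number
lambda <- r * (k - 1) / (g - 1)  ## 2, a whole number, so a BIBD is not ruled out
c(N = N, r = r, lambda = lambda)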

In R, package ibd provides some functionality to find (B)IBDs. For example, function bibd searches for a balanced incomplete block design.

library(ibd)
des.bibd <- bibd(v = 6, b = 10, r = 5, k = 3, lambda = 2) 
## arguments of function bibd are:
## - v: number of treatments
## - b: number of blocks
## - r: number of replicates (across all blocks)
## - k: number of units per block
## - lambda: lambda

des.bibd$design ## here, blocks are given by *rows*
##          [,1] [,2] [,3]
## Block-1     1    3    4
## Block-2     1    4    5
## Block-3     1    3    6
## Block-4     4    5    6
## Block-5     1    2    6
## Block-6     3    5    6
## Block-7     2    4    6
## Block-8     2    3    5
## Block-9     1    2    5
## Block-10    2    3    4

In des.bibd$design each row corresponds to a block.

For a setting where no BIBD is possible (here, with \(g = 6\), \(b = 9\) and \(k = 3\), we would need \(r = b \cdot k / g = 4.5\) replicates per treatment, which is not a whole number), we can try to find a design that is nearly balanced using the function ibd.

des.ibd <- ibd(v = 6, b = 9, k = 3)
## - v: number of treatments
## - b: number of blocks
## - k: number of units per block
des.ibd$design
##         [,1] [,2] [,3]
## Block-1    1    5    6
## Block-2    1    4    5
## Block-3    3    5    6
## Block-4    3    4    6
## Block-5    1    3    4
## Block-6    1    2    6
## Block-7    2    4    6
## Block-8    2    4    5
## Block-9    2    3    5

We can find the so-called concurrence matrix of the generated design in des.ibd$conc.mat. It contains information about how many times any pair of treatments appears together in the same block.

des.ibd$conc.mat ## check for (un)balancedness
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    4    1    1    2    2    2
## [2,]    1    4    1    2    2    2
## [3,]    1    1    4    2    2    2
## [4,]    2    2    2    5    2    2
## [5,]    2    2    2    2    5    2
## [6,]    2    2    2    2    2    5

On the diagonal we see how often every treatment appears across all blocks. The off-diagonal elements contain the corresponding information about treatment pairs. For example, treatments 1 and 4 appear together in the same block twice, while treatments 1 and 3 appear together only once.
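
As a small sketch of where this matrix comes from (assuming des.ibd$design is the numeric matrix printed above), we can reconstruct it by multiplying the treatment-by-block incidence matrix with its transpose:

inc <- sapply(1:nrow(des.ibd$design),
              function(i) tabulate(des.ibd$design[i, ], nbins = 6)) ## 6 x 9 incidence matrix
inc %*% t(inc) ## reproduces des.ibd$conc.mat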

9.3 Analysis of (B)IBDs

The analysis of an incomplete block design is “as usual”. We use a block factor and a treatment factor, leading to \[ Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}. \] As we are faced with an unbalanced design, we typically use sums of squares for the treatment effect that are adjusted for the block effect.

We use the data set taste from package daewr as an example. Twelve different panelists rated four different recipes. We first load the data and try to get an overview.

data("taste", package = "daewr")
str(taste)
## 'data.frame':    24 obs. of  3 variables:
##  $ panelist: Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ recipe  : Factor w/ 4 levels "A","B","C","D": 1 1 1 2 2 3 1 1 1 2 ...
##  $ score   : num  5 7 5 6 6 8 6 5 4 7 ...
(tab <- xtabs(~ panelist + recipe, data = taste))
##         recipe
## panelist A B C D
##       1  1 1 0 0
##       2  1 0 1 0
##       3  1 0 0 1
##       4  0 1 1 0
##       5  0 1 0 1
##       6  0 0 1 1
##       7  1 1 0 0
##       8  1 0 1 0
##       9  1 0 0 1
##       10 0 1 1 0
##       11 0 1 0 1
##       12 0 0 1 1

The output of xtabs tells us that we have an incomplete design here. It is actually a balanced incomplete block design. We can verify this with the function isGYD in package crossdes. To do so, we first have to transform the design information into the required form.

library(crossdes)
## each row of tab is a panelist (block); extract the two recipes each panelist rated
d <- t(apply(tab, 1, function(x) (1:4)[x != 0]))
isGYD(d)
## 
## [1] The design is a balanced incomplete block design w.r.t. rows.
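
We can also verify the balance directly via the concurrence matrix of the recipes (a small sketch based on the table tab from above): all off-diagonal entries are equal, here \(\lambda = 2\).

crossprod(unclass(tab)) ## recipe-by-recipe concurrence matrix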

We use the standard model from above. As the data are unbalanced, we use drop1 such that we get the sum of squares of recipe adjusted for panelist. For the test of recipe we could also use summary here, because recipe appears last in the model formula, so its sequential sum of squares is already adjusted for panelist.

fit <- aov(score ~ panelist + recipe, data = taste)
drop1(fit, test = "F")
## Single term deletions
## 
## Model:
## score ~ panelist + recipe
##          Df Sum of Sq    RSS     AIC F value  Pr(>F)  
## <none>                 6.875 -0.0039                  
## panelist 11    7.4583 14.333 -4.3712  0.8876 0.58110  
## recipe    3    9.1250 16.000 14.2688  3.9818 0.04649 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The \(F\)-test of recipe is significant. For illustration, we have a look at all pairwise comparisons without any correction for multiple testing.

library(multcomp)
contr <- glht(fit, linfct = mcp(recipe = "Tukey"))
summary(contr, test = adjusted("none")) 
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = score ~ panelist + recipe, data = taste)
## 
## Linear Hypotheses:
##            Estimate Std. Error t value Pr(>|t|)  
## B - A == 0    0.750      0.618   1.214   0.2558  
## C - A == 0    1.375      0.618   2.225   0.0531 .
## D - A == 0   -0.625      0.618  -1.011   0.3383  
## C - B == 0    0.625      0.618   1.011   0.3383  
## D - B == 0   -1.375      0.618  -2.225   0.0531 .
## D - C == 0   -2.000      0.618  -3.236   0.0102 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- none method)

We are not interpreting the p-values here, as this is a multiple testing problem. However, we can observe that all differences share the same standard error (remember the motivation for a BIBD?).
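
This also means that all (simultaneous) confidence intervals for the pairwise differences have the same width; a minimal sketch (output not shown):

confint(contr) ## simultaneous confidence intervals; all have the same width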

Actually, the largest difference would “survive” a multiple testing correction using Tukey HSD.

summary(contr)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = score ~ panelist + recipe, data = taste)
## 
## Linear Hypotheses:
##            Estimate Std. Error t value Pr(>|t|)  
## B - A == 0    0.750      0.618   1.214   0.6341  
## C - A == 0    1.375      0.618   2.225   0.1881  
## D - A == 0   -0.625      0.618  -1.011   0.7472  
## C - B == 0    0.625      0.618   1.011   0.7473  
## D - B == 0   -1.375      0.618  -2.225   0.1880  
## D - C == 0   -2.000      0.618  -3.236   0.0421 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

We see that treatment \(C\) is significantly different from treatment \(D\).