# 8 Incomplete Block Designs

## 8.1 Introduction

In this chapter we are going to learn more about block designs. The block
designs in Chapter 5 were *complete*, meaning that every
block contained *all* treatments. In practice, this is not always possible. For
example, the actual physical size of a block might be too small. There are also
situations where it is not advisable to have too many treatments in each block.
If we do a food tasting experiment, we typically want to restrict the number of
different recipes (treatments) we want an individual rater to taste.
Otherwise, the response will get unreliable, for example, because the raters
might get bored after some time.

If we cannot “afford” to do a complete design, we get a so-called **incomplete
block design (IBD)**. We have to decide what *subset* of treatments we use in an individual
block. If we do not make a wise decision, it can be that the design is flawed
in the sense that certain quantities are not estimable anymore (e.g., some
treatment differences). This would be a worst case scenario.

Consider the design in Table 8.1 with treatments \(A\) to \(F\)
and blocks \(1\) to \(6\) (each *column* corresponds to a block). For example, on block 5
we apply the two treatments \(D\) and \(F\). Think for example of treatments as
different recipes and block as different raters.

1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|

\(A\) | \(A\) | \(B\) | \(D\) | \(D\) | \(E\) |

\(B\) | \(C\) | \(C\) | \(E\) | \(F\) | \(F\) |

First, we create a fictional data set having the same structure as in Table 8.1.

```
set.seed(25) ## for reproducibility
dat <- data.frame(block = factor(rep(1:6, times = 2)),
treat = factor(c("A", "A", "B", "D", "D", "E",
"B", "C", "C", "E", "F", "F")),
## block eff. + error term, no signal wrt. treat
y = round(rep(rnorm(6), times = 2) +
rnorm(12), 1))
str(dat)
```

```
## 'data.frame': 12 obs. of 3 variables:
## $ block: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6 ...
## $ treat: Factor w/ 6 levels "A","B","C","D",..: 1 1 2 4 4 5 ...
## $ y : num 1.6 1.7 0.9 0.7 -0.8 3.4 ...
```

We can get an overview of the used combinations with `xtabs`

.

`xtabs(~ treat + block, data = dat)`

```
## block
## treat 1 2 3 4 5 6
## A 1 1 0 0 0 0
## B 1 0 1 0 0 0
## C 0 1 1 0 0 0
## D 0 0 0 1 1 0
## E 0 0 0 1 0 1
## F 0 0 0 0 1 1
```

What happens if we fit our standard main effects model here?

```
options(contrasts = c("contr.treatment", "contr.poly"))
fit.ibd <- aov(y ~ block + treat, data = dat)
dummy.coef(fit.ibd)
```

```
## Full coefficients are
##
## (Intercept): 1.1
## block: 1 2 3 4 5 6
## 0.0 1.1 0.5 -2.9 -4.0 1.1
## treat: A B C D E F
## 0.0 -1.2 -1.8 2.3 1.0 0.0
```

At first sight, it looks like we were able to fit the specified model.
However, it is strange that *two* of the treatment coefficients are set
to zero and not only *one* as usual when using `contr.treatment`

.

What happens if we want to investigate a certain contrast, say treatment \(A\) vs. \(F\)?

We get an error message (not shown) which says that some linear functions are
not estimable. In fact, we *cannot* estimate the difference between the
treatments \(A\) and \(F\) with this data set, at least not with a fixed effects
model. The reason lies in the structure of the incomplete design. It is an
example of a so-called **disconnected design**. We
have the treatment set \(\{A, B, C\}\) in the first three blocks and \(\{D, E, F\}\)
in the last three blocks (check the output of `xtabs`

above). As the two sets
are disjoint, we have no idea about any differences involving treatments from
the two sets, because we never see a pair together in the same block. This is
also causing the strange output in `dummy.coef`

above. Because of this
dependency between treatment and block, we cannot distinguish block and
treatment effects anymore; the effects are confounded. Here, `R`

automatically
drops one of the treatment levels from the model, which might be very dangerous
if you are not carefully inspecting the output.

Hence, performing such a disconnected design was obviously not a good idea from a design of experiment point of view, as we cannot estimate any differences involving treatments from the two disjoint sets. Still, situations like these happen in practice: “Let’s do treatments \(A\), \(B\) on the first few days and then switch to \(C\), \(D\) because it is easier …”. If day is a block factor, this design will be disconnected.

Intuitively, we should have a good “mix” of treatments in each block.

## 8.2 Balanced Incomplete Block Designs

We could simply randomize subsets of treatments to different blocks. This would work well if we have enough blocks. However, if we only have a small number of blocks, there would be the risk that we end up with a disconnected design.

On the other hand, we can also try to fulfill some optimality criterion. One
candidate would be the criterion that we can estimate all treatment differences
with the **same precision**. For example, this would mean that all confidence
intervals for \(\alpha_i - \alpha_j\) have the same width (for *any* pair \(i,j\),
where \(\alpha_i\) is the \(i\)th treatment effect). The design which fulfills this
criterion is the so-called balanced incomplete block design.

A **balanced incomplete block design (BIBD)** is an *incomplete* block
design where all *pairs* of treatments occur together in the *same* block
equally often; we denote this number by \(\lambda\).

Using an example where blocks are given by raters and treatments by different chocolate chip cookie brands, we introduce the following notation, similar to Oehlert (2000):

- \(g\): number of treatments (“number of different chocolate chip cookie brands”)
- \(b\): number of blocks (“number of raters”)
- \(k\): block size; number of experimental units per block (\(k < g\)) (“number of cookie brands each rater gets to taste”)
- \(r\): number of replicates per treatment (“how often do we observe a certain cookie brand across all raters?”)
- \(N\): total number of experimental units

By definition, we have \(N = b \cdot k = g \cdot r\). How can we construct a BIBD?
For every setting \(k < g\) we can find a BIBD by taking *all* possible subsets,
where we have \(g \choose k\) (binomial coefficient, implemented in the function
`choose`

in `R`

). The corresponding design is called an **unreduced balanced
incomplete block design**. For
example, if we have \(g = 6\) treatments and \(k = 3\) experimental units per block,
we get \({6 \choose 3} = 20\) blocks. In `R`

, we can easily get this with the
function `combn`

.

`combn(x = 6, m = 3)`

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 1 1 1 1 1 1 1 1 1 1 2 2 2
## [2,] 2 2 2 2 3 3 3 4 4 5 3 3 3
## [3,] 3 4 5 6 4 5 6 5 6 6 4 5 6
## [,14] [,15] [,16] [,17] [,18] [,19] [,20]
## [1,] 2 2 2 3 3 3 4
## [2,] 4 4 5 4 4 5 5
## [3,] 5 6 6 5 6 6 6
```

In the matrix above, every column defines a block. For example, block 10 would get treatments 1, 5 and 6. Typically, we would randomize the order or the placement of the treatments within a block, as we should always do.

In practice, we cannot always afford to do an unreduced BIBD, as the required number of blocks might be too large, for example, if we do not have enough raters. Whether a BIBD exists for a certain desired setting of number of blocks \(b\), block size \(k\) and number of treatments \(g\), is a combinatorial problem. A necessary, but not sufficient, condition for a BIBD to exist is \[\begin{equation} \frac{r \cdot (k - 1)}{g - 1} = \lambda, \tag{8.1} \end{equation}\] where \(\lambda\) is the number of times two treatments occur together in the same block (hence, an integer).

What is the intuition behind Equation (8.1)? A treatment appears in a total of \(r\) different blocks. In each of these blocks, there are \(k - 1\) available other experimental units, leading to a total of \(r \cdot (k - 1)\) available experimental units. There are \(g - 1\) other available treatments that we must divide among them. Hence, it must hold that \(r \cdot (k - 1) = \lambda \cdot (g - 1)\); otherwise, it is impossible to construct a BIBD. As already mentioned above, this condition is necessary, but not sufficient. This means that even if the condition is fulfilled, it might be impossible to construct a BIBD.

In `R`

, package `ibd`

(Mandal 2019) provides some functionality to find (B)IBDs. For
example, function `bibd`

searches for a balanced incomplete block design using
sophisticated numerical optimization techniques.

```
library(ibd)
des.bibd <- bibd(v = 6, b = 10, r = 5, k = 3, lambda = 2)
## arguments of function bibd are:
## - v: number of treatments
## - b: number of blocks
## - r: number of replicates (across all blocks)
## - k: number of experimental units per block
## - lambda: lambda
des.bibd$design ## here, blocks are given by *rows*
```

```
## [,1] [,2] [,3]
## Block-1 2 3 6
## Block-2 2 5 6
## Block-3 4 5 6
## Block-4 3 4 5
## Block-5 1 4 6
## Block-6 1 3 6
## Block-7 1 3 5
## Block-8 1 2 5
## Block-9 2 3 4
## Block-10 1 2 4
```

In `des.bibd$design`

, each *row* corresponds to a block. For a setting where no
BIBD is possible, we can try to find a design that is nearly balanced using the
function `ibd`

(function `bibd`

would return `"parameters do not satisfy necessary conditions"`

in such a situation).

```
des.ibd <- ibd(v = 6, b = 9, k = 3)
## - v: number of treatments
## - b: number of blocks
## - k: number of units per block
des.ibd$design
```

```
## [,1] [,2] [,3]
## Block-1 1 5 6
## Block-2 1 4 5
## Block-3 3 5 6
## Block-4 3 4 6
## Block-5 1 3 4
## Block-6 1 2 6
## Block-7 2 4 6
## Block-8 2 4 5
## Block-9 2 3 5
```

We can find the so-called concurrence matrix of the generated design in
`des.ibd$conc.mat`

. It contains information about how many times any pair of
treatments appears together in the same block.

`des.ibd$conc.mat ## check for (un)balancedness`

```
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 4 1 1 2 2 2
## [2,] 1 4 1 2 2 2
## [3,] 1 1 4 2 2 2
## [4,] 2 2 2 5 2 2
## [5,] 2 2 2 2 5 2
## [6,] 2 2 2 2 2 5
```

On the diagonal, we can read off how often every treatment appears across all blocks. In the off-diagonal elements, we have the corresponding information about treatment pairs. For example, treatment 1 and 4 appear together twice in the same block. On the other hand, treatment 1 and 3 appear together only once.

For a given design, we would then randomize the different treatment subsets, i.e.,
the different rows in `des.ibd$design`

, to the actual blocks, e.g.,
raters. Within each block, we would randomize the corresponding treatments to
the experimental units, e.g., time-slots, and last but not least we would
globally randomize the treatment numbers to the actual treatments, e.g.,
cookie brands.

## 8.3 Analysis of Incomplete Block Designs

### 8.3.1 Example: Taste Data

The analysis of an incomplete block design is “as usual.” We use a fixed block
factor and a treatment factor leading to
\[\begin{equation}
Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij},
\tag{8.2}
\end{equation}\]
where the \(\alpha_i\)’s are the treatment effects and the \(\beta_j\)’s are the
block effects with the usual side constraints. Because we do not observe all the
block and treatment combinations equally often (some are simply missing), we are
faced with an *unbalanced* design. We typically use sum of squares for treatment
effects that are adjusted for block effects.

Let us consider the data set `taste`

from package `daewr`

(John Lawson and Krennrich 2021) as an
example. Twelve different panelists each rated two out of four different
recipes. We first load the data and try to get an overview.

```
## 'data.frame': 24 obs. of 3 variables:
## $ panelist: Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 ...
## $ recipe : Factor w/ 4 levels "A","B","C","D": 1 1 1 2 2 3 ...
## $ score : num 5 7 5 6 6 8 ...
```

`(tab <- xtabs(~ recipe + panelist, data = taste))`

```
## panelist
## recipe 1 2 3 4 5 6 7 8 9 10 11 12
## A 1 1 1 0 0 0 1 1 1 0 0 0
## B 1 0 0 1 1 0 1 0 0 1 1 0
## C 0 1 0 1 0 1 0 1 0 1 0 1
## D 0 0 1 0 1 1 0 0 1 0 1 1
```

The output of `xtabs`

tells us that we have an *incomplete* design here. It is
actually a balanced incomplete block design. We can test this with the function
`isGYD`

in package `crossdes`

(Sailer 2013). For that reason, we first have to
transform the design information to the desired form such that we have a matrix
with 12 rows (panelists) and two columns containing the two treatments. This
can be done with the following rather technical code where we extract the
corresponding treatments column-wise.

```
##
## panelist [,1] [,2]
## 1 1 2
## 2 1 3
## 3 1 4
## 4 2 3
## 5 2 4
## 6 3 4
## 7 1 2
## 8 1 3
## 9 1 4
## 10 2 3
## 11 2 4
## 12 3 4
```

`isGYD(d)`

```
##
## [1] The design is a balanced incomplete block design w.r.t. rows.
```

We use the model in Equation (8.2). As it is an unbalanced data set, we
use `drop1`

such that we get the sum of squares of `recipe`

adjusted for
`panelist`

. We could also use `summary`

here because `recipe`

appears last in
the model formula.

```
## Single term deletions
##
## Model:
## score ~ panelist + recipe
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 6.87 0.00
## panelist 11 7.46 14.33 -4.37 0.89 0.581
## recipe 3 9.13 16.00 14.27 3.98 0.046
```

The \(F\)-test of recipe is significant. To illustrate, we have a look at all pairwise comparisons without any multiple testing correction.

```
library(multcomp)
contr <- glht(fit.taste, linfct = mcp(recipe = "Tukey"))
summary(contr, test = adjusted("none"))
```

```
##
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: Tukey Contrasts
## ...
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## B - A == 0 0.750 0.618 1.21 0.256
## C - A == 0 1.375 0.618 2.22 0.053
## D - A == 0 -0.625 0.618 -1.01 0.338
## C - B == 0 0.625 0.618 1.01 0.338
## D - B == 0 -1.375 0.618 -2.22 0.053
## D - C == 0 -2.000 0.618 -3.24 0.010
## (Adjusted p values reported -- none method)
```

We are not inspecting the p-values here, as it is a multiple testing
problem. However, what we can observe is the fact that *all* differences
share the *same* standard error (remember the motivation for a BIBD?). This is
a feature of this special design!

### 8.3.2 Intra- and Inter-block Analysis

Up until now, we estimated treatment effects by adjusting for block effects. This
means that whatever is special to a block is fully allocated to the block effect and
does *not* affect the treatment effect. Basically, the estimate of the treatment
effect is based on the “leftovers.” This is also known as an
**intra-block analysis**.

On the other hand, if we treat the block factor as a random effect, the mean of
the values of a block implicitly also contain information about the treatment
effects. An analysis which is based on this information is known as an
**inter-block analysis**. This leads to another
estimate of the treatment effects. Both approaches can be combined. In practice,
this can be achieved by using `lmer`

and setting the block factor as a random
effect. Typically, there is only little benefit; see also the discussion in
section 14.1.2 of Oehlert (2000).

However, there are also situations where the effect can be dramatic. If we consider the introductory example with the disconnected design, some magic happens:

```
library(lmerTest)
fit.ibd2 <- lmer(y ~ treat + (1 | block), data = dat)
summary(glht(fit.ibd2, linfct = mcp(treat = c(1, 0, 0, 0, 0, -1))))
```

```
##
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: User-defined Contrasts
## ...
## Linear Hypotheses:
## Estimate Std. Error z value Pr(>|z|)
## 1 == 0 2.39 1.77 1.35 0.18
## (Adjusted p values reported -- single-step method)
```

Here it is: an estimate of the difference between \(A\) and \(F\,\)! However, this is standing on very thin ice and you should not rely on this trick in practice!

## 8.4 Outlook

A balanced incomplete block design is of course only a very special case of an
incomplete block design. A more general version is a so-called **partially
balanced incomplete block design (PBID)** where some treatment pairs occur
together more often than other pairs (defined through so-called associate
classes). Hence, not all treatment differences are estimated with the same
precision. This is useful if more precision is needed for comparisons with a
control treatment than for comparisons between the other treatments. See for
example Kuehl (2000) and Oehlert (2000) for more details.

With two block factors we can also create block designs which are incomplete in one or both “directions,” so-called row-column designs, see also Section 5.4. An example would be a design where the rows form a complete block design but the columns an incomplete block design. Again, more details can be found in Kuehl (2000) and Oehlert (2000).

## 8.5 Concluding Remarks

We have now learned about some of the more important experimental designs and
the corresponding statistical analyses. The last few chapters were mostly in the
form of short introductions (as it says in the book title) and more information can
be found in the references mentioned in the corresponding chapters. A good place
to start looking for more information about `R`

packages are the CRAN task
views which provide a nice overview of
packages that are suitable for special topics. There is a task view “Design of
Experiments (DoE) & Analysis of Experimental
Data” which
contains some of the packages that we already used in this book, but of course
also many more.