[R-sig-ME] Gaussian linear mixed model with non-constant variance

Tue Jan 30 05:16:41 CET 2018

Hello, Mixed Modellers.

I’m after some advice on incorporating non-constant variances into a Gaussian linear mixed model.

Here’s an outline of the experimental setup:

Eight experimental units are set up and each is randomly assigned a different treatment, namely, Control, A, B, C, D, E, F and G. Three subsamples are taken together from each experimental unit and the response variable measured for each subsample. The response variable is continuous and a model with the Normal assumption will be used. This results in 24 measurements. This entire process is repeated two more times, each with a new set of experimental units. The resulting data set has three sets of measurements at each level of the treatment, with nine observed values for each level.

The experimental design is, then, a randomised complete block design. There are no missing values.

The researchers are interested in the (fixed) effect of the treatment and, in particular, they want to compare the effect of treatments A to G relative to the control.

The data set is small but, ignoring that, I want to a allow for non-constant variance. I’m using the nlme package.

In the code below, “obs” is the measured response, “treatment” is a factorial treatment identfier, “expt” is experimental run and “unit” is experimental unit. “expt” is factorial and coded 1 to 3. “unit” is factorial and coded 1 to 24. The model syntax I’ve used, assuming constant variance, is:

Model 1: Y1.lme1 <- lme(obs ~ treatment, random = ~ 1 | expt / unit, data = Y1)

A residual vs. fitted value plot from the above model shows non-constant variance.

To allow for non-constant variance across experimental runs, I’ve fitted the model below:

Model 2: Y1.lme2 <- lme(obs ~ treatment, random = ~ 1 | expt / unit, weights = varIdent(form = ~ 1 | expt), data = Y1)

Taking an alternative tack, I averaged the observed subsampled values for each experimental unit, giving a data set with 24 observed (average) values. I then fitted similar models to the two above.

Model 3: Y1.lme3 <- lme(obs.avg ~ treatment, random = ~ 1 | expt, data = Y1.avg)

A residual vs. fitted value plot from the above model shows non-constant variance and I again allowed for variances to change across experimental runs in the model below.

Model 4: Y1.lme4 <- lme(obs.avg ~ treatment, random = ~ 1 | expt, weights = varIdent(form = ~ 1 | expt), data = Y1.avg)

The fixed effect parameter estimates and their standard errors from Models 1 and 3 are exactly the same.

However, the fixed effect parameter estimates and their standard errors from Models 2 and 4 are quite different.

I'd be very grateful to get some comments on the following three points:

(1) Given the equivalence of models 1 and 3, I've been thinking the variation at the subsample level doesn't need to be incorporated into a model with changing variances across experimental run.

(2) I think my model syntax for "weights" in model 2 is incorparating subsample-level variation when estimating variances (and if I'm right (!) in point (1) above, this isn't the way to go).

(3) What I need to know in the end is which of models 2 and 4 (or something with different syntax?) to use to correctly model changing variances across experimental run.

Thanks in advance,

Peter