[R] Interpretation of hypothesis tests for mixed models

Douglas Bates bates at stat.wisc.edu
Mon Dec 16 16:47:03 CET 2002


Olof Leimar <Olof.Leimar at zoologi.su.se> writes:

> Thanks for setting me straight about the model
> 
> > fm1 <- lme(y ~ Trt, random = list(Subj = pdCompSymm(~ Trt - 1)))
> 
> being the one that is equivalent to
> 
> > fm2 <- lme(y ~ Trt, random = ~ 1 | Subj/Trt)
> 
> It seems that denDF of a fixed effect test for treatment should also be
> the same for fm1 and fm2. Is it possible to modify the method of
> computing denDF in nlme to achieve this? 

That would require a complete redesign and rewrite of the
corresponding part of the nlme package.  As described on p. 91 of
Pinheiro and Bates (2000), the denominator degrees of freedom are
calculated from the number of observations and the number of
groups at each level of random effects.  For fm2 there are three
such levels to choose from while for fm1 there are only two.

It happens that the models are equivalent, but discovering that
equivalence within the model-fitting function would be extremely
difficult.  In the current implementation the two formulations
therefore produce different denominator degrees of freedom.
Contributions of code implementing alternative methods are welcome;
SAS PROC MIXED, for example, offers both containment and
Satterthwaite options.
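As a sketch of that containment-style calculation (my own arithmetic, using the counts from the Machines example below: 54 observations, 6 workers, 18 worker-machine combinations):

```r
## Containment-style denominator df (cf. Pinheiro & Bates 2000, p. 91):
## df at a level = (groups at that level) - (groups one level up)
##                 - (fixed-effect terms estimated at that level)
n_obs     <- 54  # Machines data: 6 workers x 3 machines x 3 replicates
n_worker  <- 6
n_cell    <- 18  # worker-machine combinations
p_machine <- 2   # contrasts for the 3-level Machine factor

## fm1 has one level of random effects, so Machine is tested at the
## observation level:
denDF_fm1 <- n_obs - n_worker - p_machine   # 46
## fm2 nests Machine within Worker, so Machine is tested at the
## Worker:Machine level:
denDF_fm2 <- n_cell - n_worker - p_machine  # 10
```

These reproduce the DF columns (46 and 10) in the tTable output below.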

> Meanwhile, my understanding is that fm2 is to be preferred over fm1. I
> did simulations for fm1/fm2 with no true (fixed) difference between
> treatments, which seemed to show that a test with the fm1 formulation
> can sometimes produce considerably more statistical significances than
> would be warranted. 

I don't understand this.  In my previous reply I showed an example
using the Machines data.  I reproduce it here with the numbering you
use (fm1 is the model with pdCompSymm and one level of random effects;
fm2 uses nested random effects):

> library(nlme)
Loading required package: nls 
Loading required package: lattice 
> data(Machines)
> fm1 <- lme(score ~ Machine, data = Machines, 
+            random = list(Worker = pdCompSymm(~ Machine - 1)))
> fm2 <- lme(score ~ Machine, data = Machines, random = ~ 1 | Worker/Machine)
> summary(fm1)$tTable
                Value Std.Error DF   t-value      p-value
(Intercept) 52.355556  2.485416 46 21.065106 2.903585e-25
MachineB     7.966667  2.177416 46  3.658770 6.505878e-04
MachineC    13.916667  2.177416 46  6.391367 7.483288e-08
> summary(fm2)$tTable
                Value Std.Error DF   t-value      p-value
(Intercept) 52.355556  2.485829 36 21.061606 7.844348e-22
MachineB     7.966667  2.176974 10  3.659514 4.392617e-03
MachineC    13.916667  2.176974 10  6.392665 7.906445e-05

In these results the estimates and standard errors are the same but
the denominator degrees of freedom in fm2 are smaller than those in
fm1.  That would always be the case so the F- and t-tests from fm2
would be more conservative than those from fm1.
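Given essentially the same estimate and standard error, the effect of the smaller denominator df can be seen by recomputing the two-sided p-values directly from the t-values in the tables above (a quick check of the MachineB row):

```r
## Two-sided p-values for MachineB under each choice of denominator df,
## using the t-values printed in the tTable output above
p_fm1 <- 2 * pt(-3.658770, df = 46)  # matches the fm1 tTable, ~6.5e-04
p_fm2 <- 2 * pt(-3.659514, df = 10)  # matches the fm2 tTable, ~4.4e-03
```

The same t-statistic referred to 10 rather than 46 degrees of freedom gives a p-value nearly an order of magnitude larger, which is the sense in which fm2 is more conservative.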

> I then have another question. How should I go about formulating a model
> corresponding to the nesting in fm2 if instead of a treatment factor I
> have a covariate? Since in my example Trt was a two-level factor, one
> could for instance let the levels be zero and one and regard the
> treatment as a covariate. If I express the treatment as a covariate x
> and fit
> 
> > fm4 <- lme(y ~ x, random = ~ 1 | Subj/x)
> 
> I get the same denDF as for fm2, but for a general covariate (with more
> than two values) denDF depends on the number of distinct values taken by
> the covariate (but it should not, should it?). 

I'm sorry but I have no idea what you are trying to do.  If x is a
covariate the only models that would make sense to me are

 lme(y ~ x, random = ~ 1 | Subj)

and 
 
 lme(y ~ x, random = ~ x | Subj)
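To illustrate those two formulations concretely (using the Orthodont data that ships with nlme as a stand-in; distance, age, and Subject play the roles of y, x, and Subj):

```r
library(nlme)

## Orthodont: 27 subjects measured at 4 ages each (108 observations)
data(Orthodont)

## Random intercept only: subjects differ in overall level
fm_int   <- lme(distance ~ age, data = Orthodont,
                random = ~ 1 | Subject)
## Random intercept and slope: subjects also differ in how they
## respond to the covariate
fm_slope <- lme(distance ~ age, data = Orthodont,
                random = ~ age | Subject)

## With a single level of random effects, the denominator df for age is
## nobs - ngroups - 1 = 108 - 27 - 1 = 80 in both fits, regardless of
## how many distinct values the covariate takes
summary(fm_int)$tTable
```

A call to anova(fm_int, fm_slope) then compares the two formulations by a likelihood ratio test.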



