[R-sig-ME] Differences in degrees of freedom between a mixed-effects model and a gls model using nlme
Ken Beath
ken.beath at mq.edu.au
Sun Feb 8 08:29:37 CET 2015
All 3 (paired t-test, mixed-effects model and gls with compound symmetry) are
fitting the same model, and so should give the same result. That is what
you see with the first example. The gls model is not getting anything wrong
except the df.
For the second example the 3 model results should again be the same. I'm not
certain why they differ, but it may be numerical. Even though the data come
from a model that isn't correct for the fitting, that should be irrelevant:
it is the data that produce the model fit, not the model that produces the
data. Possibly the estimates of the correlation are poor when there is little
correlation, and that flows through to the mixed-effects and gls results.
The relationship to the unpaired t-test is probably irrelevant. Note also
that the default for t.test() is unequal variances, whereas a mixed model
assumes equal variances.
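The difference in defaults is easy to check directly. The snippet below is
only an illustrative sketch (the data are made up, not from the thread): it
shows that t.test() gives Welch's test with fractional df, while var.equal =
TRUE reproduces the equal-variance df that a mixed or gls model assumes.

```r
# t.test() defaults to Welch's test (unequal variances, fractional df);
# var.equal = TRUE gives the classical pooled-variance test, whose
# df (n1 + n2 - 2 = 18 here) matches the equal-variance assumption.
set.seed(1)
a <- rnorm(10)
b <- rnorm(10) + 1
welch  <- t.test(a, b)                    # Welch: df is usually fractional
pooled <- t.test(a, b, var.equal = TRUE)  # pooled: df = 18 exactly
welch$parameter
pooled$parameter
```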
The df for gls is, in a sense, obviously a bug. Getting the df for a mixed
model isn't easy. Here we have a nice simple correlation structure and
an obvious correct answer, but usually there isn't one. If the model
assumed uncorrelated data then the gls df would be correct, so the
software would need to work out what is going on. Using a parametric
bootstrap to determine the distribution of the test statistic seems a
better method if accuracy is important.
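For what it's worth, here is one way such a parametric bootstrap could look.
This is only a sketch under my own assumptions (I regenerate the first
example's data, simulate paired data from the fitted null model, and refit
each replicate); it is not code from the thread, and the extraction of the
correlation via coef() on the corStruct is just one convenient route.

```r
# Sketch: parametric bootstrap of the gls t-statistic for the first example.
library(nlme)

set.seed(1)
n  <- 10
u  <- 5 + rnorm(n, sd = 1)
x1 <- u + rnorm(n)
x2 <- u + rnorm(n) + 1
df <- data.frame(stack(list(x1 = x1, x2 = x2)), id = factor(rep(1:n, 2)))

fit  <- gls(values ~ ind, correlation = corSymm(form = ~1 | id), data = df)
tobs <- summary(fit)$tTable["indx2", "t-value"]

# Null model (no group effect); pull out its correlation, sd and mean
null <- gls(values ~ 1, correlation = corSymm(form = ~1 | id), data = df)
rho  <- as.numeric(coef(null$modelStruct$corStruct, unconstrained = FALSE))
sig  <- null$sigma
mu   <- as.numeric(coef(null)[1])

B <- 200
tstar <- replicate(B, {
  # Simulate correlated pairs under the null, refit, keep the t-statistic
  z  <- rnorm(n)
  y1 <- mu + sig * z
  y2 <- mu + sig * (rho * z + sqrt(1 - rho^2) * rnorm(n))
  d  <- data.frame(stack(list(x1 = y1, x2 = y2)), id = factor(rep(1:n, 2)))
  tryCatch(summary(gls(values ~ ind, correlation = corSymm(form = ~1 | id),
                       data = d))$tTable["indx2", "t-value"],
           error = function(e) NA_real_)
})

# Bootstrap p-value: how extreme is the observed statistic under the null?
pboot <- mean(abs(tstar) >= abs(tobs), na.rm = TRUE)
pboot
```

The point is that the p-value now comes from the simulated reference
distribution, so it no longer depends on choosing 9 versus 18 df at all.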
Ken
On 6 February 2015 at 19:34, Hufthammer, Karl Ove <
karl.ove.hufthammer at helse-bergen.no> wrote:
> Dear list members,
>
> I'm having some problems understanding the difference in degrees of
> freedom between a mixed-effect model and a gls model, both fitted using the
> nlme package. I'm used to it being difficult to figure out the 'correct'
> (if such a thing exists) number of degrees of freedom in mixed-effects
> models, but in the following very simple example the *mixed-effects model*
> seems to use the correct degrees of freedom, while the gls model uses too
> many degrees of freedom.
>
> Here's the example. Basically, I want to test if a mixed-effects model is
> equivalent to a paired t-test.
>
> # Generate some correlated data
> set.seed(1)
> n = 10
> u = 5 + rnorm(n, sd=1)
> x1 = u + rnorm(n)
> x2 = u + rnorm(n) + 1
>
> # A paired t-test is, as expected, more powerful than an unpaired one
> t.test(x1, x2) # p = .1407
> t.test(x1, x2, paired=TRUE) # p = .0597
>
> The relevant line of output for the paired test is:
> t = -2.1532, df = 9, p-value = 0.05972
>
> # Put the data in long format and fit a linear mixed-effects model
> library(nlme)
> df = data.frame(stack(list(x1=x1, x2=x2)), id=factor(rep(1:n, 2)))
> l = lme(values~ind, random=~1|id, data=df)
> summary(l)
>
> The relevant lines of output are:
>
> Value Std.Error DF t-value p-value
> indx2 0.617482 0.2867705 9 2.153226 0.0597
>
> So we get the same t-value (to five decimal places - it differs in the
> last decimal), degrees of freedom and p-value as in the paired t-test.
> Let's now try to fit a gls model for correlated data. This should in theory
> be equivalent to doing a paired t-test. We get:
>
> # Fit a correlated gls model
> library(nlme)
> l2 = gls(values~ind, correlation=corSymm(form=~1|id), data=df)
> summary(l2)
>
> Value Std.Error t-value p-value
> indx2 0.617482 0.2867703 2.153228 0.0451
>
> The t-value is the same as in the t-test (here to *six* decimal places),
> but the p-value is too small. The reason is that the test uses twice the
> degrees of freedom that it should. It uses 18 degrees of freedom:
>
> 2*pt(2.153228, 18, lower.tail = FALSE) # 0.04510817
>
> So the mixed-effects model gets it right, while the gls model gets it
> wrong. But it's actually the *gls* model that should be equivalent to the
> paired t-test, not the mixed-effects model. We can easily see this by
> changing the correlation to be negative:
>
> set.seed(2)
> n = 10
> u = 5 + rnorm(n, sd=1)
> x1 = u + rnorm(n)
> x2 = 15 - 2*u + rnorm(n) + 1
> cor(x1, x2) # -0.50
>
> Unpaired t-test:
> t = -0.5052, df = 15.753, p-value = 0.6204
>
> Paired t-test (note that the p-value is *greater* than for the unpaired
> case):
> t = -0.418, df = 9, p-value = 0.6857
>
> Mixed-effects model:
>
> Value Std.Error DF t-value p-value
> indx2 0.481868 0.9537336 9 0.505244 0.6255
>
> Of course, the mixed-effects model doesn't really fit the data. The random
> effect variance is estimated to be ~0, so basically this corresponds to an
> *unpaired* t-test (but one that assumes equal variance). Note that the
> t-statistic is equal to the t-statistic of the unpaired t-test (though the
> latter uses 18 degrees of freedom, while this model uses 9 degrees of
> freedom).
>
> Now, if I've understood everything correctly, the gls model should
> *exactly* correspond to a paired t-test; it's the same underlying model.
> And indeed we get the same t-statistic (0.418):
>
> Value Std.Error t-value p-value
> indx2 0.481868 1.1526947 0.418036 0.6809
>
> But the p-value differs somewhat, since the gls model assumes 18 degrees
> of freedom, while the t-test (correctly) assumes 9 degrees of freedom.
>
> At least for positive correlations, a mixed-effects random intercept model
> and a gls model with compound symmetry should be *almost* equivalent, so I
> see no reason that the gls model should assume that the test statistic has
> twice as many degrees of freedom as in the mixed-effects model. So why the
> discrepancy? Shouldn't I trust the degrees of freedom calculations from
> gls()?
>
> --
> Karl Ove Hufthammer
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
--
*Ken Beath*
Lecturer
Statistics Department
MACQUARIE UNIVERSITY NSW 2109, Australia
Phone: +61 (0)2 9850 8516
Building E4A, room 526
http://stat.mq.edu.au/our_staff/staff_-_alphabetical/staff/beath,_ken/
CRICOS Provider No 00002J