[R] simple test of lme, questions on DF corrections

John Maindonald john.maindonald at anu.edu.au
Tue Apr 1 03:21:30 CEST 2003

Dear Greg -
My understanding is that method="ML" changes only the
method used to fit the model.  The parameter estimates
are not the ML parameter estimates.  The fitted values
are the fitted values for ML.

The easiest way to get the anova sum of squares is to specify:
 > fm1.aov <- aov(travel~1+Error(factor(Rail)),data=Rail)
 > summary(fm1.aov)

Error: factor(Rail)
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5 9310.5  1862.1

Error: Within
           Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 12 194.000  16.167

This compares with:
 > fm1.lme <- lme(travel~1, random=~1|Rail,data=Rail,method="ML")
 > sum(fm1.lme$residuals[,2]^2)
[1] 195.0106
 > sqrt(195.0106/12)
[1] 4.031238

Consistent with this, the predicted values that are obtained with
 > predict(fm1.lme,level=1)
(the Best Linear Unbiased Predictors, or BLUPs)
are not the group means that you can get from:
 > predict(aov(travel~Rail,data=Rail))

Thus a residual mean square of 194/12 has become, in the ML fit,
195.0106/12.  This change in the predicted values (BLUPs), in the
residuals, and in the sum of squares of residuals, arises because
the likelihood is now being maximised for a model where there
are two variance parameters that must be estimated.  Notice that
the BLUPs are pulled back towards the overall mean, relative to
the group means.

NB also, specify level=1 to incorporate the random group (Rail)
effects into the predicted values.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

Date: Sat, 29 Mar 2003 20:19:33 -0500
From: Greg Hammett <hammett at princeton.edu>
Subject: [R] simple test of lme, questions on DF corrections
To: r-help at stat.math.ethz.ch
Message-ID: <200303300119.h2U1JXg21977 at hammett-xt.pppl.gov>
Content-Type: text/plain; charset=US-ASCII

I'm a physicist working on fusion energy and dabble in statistics
only occasionally, so please excuse gaps in my statistical
knowledge.  I'd appreciate any help that a real statistics expert
could provide.  Most people in my field do only very simple
statistics, and I am trying to extend some work on multivariate
linear regression to account for significant between-group
correlated errors.  It looks to me like the lme routine in the
nlme package does most of what I want. (My congrats to the
developers of R and nlme, they are extremely useful!).

To summarize my questions: I've tested lme on a very simple test
case (with just 1 parameter, the mean, and 1 random effect), but
the results are somewhat different than I get from some simple
maximum-likelhood formulas.  It appears that some quantities
calculated by lme are corrected for the reduction in the degrees
of freedom due to the number of fit parameters.  But these
corrections are slightly different (0.3%-3%) than what I would
have expected, and I'd like to understand these differences
better.  .....

More information about the R-help mailing list