[R] simple test of lme, questions on DF corrections

Sun Mar 30 03:19:33 CEST 2003

I'm a physicist working on fusion energy and dabble in statistics
only occasionally, so please excuse gaps in my statistical
knowledge.  I'd appreciate any help that a real statistics expert
could provide.  Most people in my field do only very simple
statistics, and I am trying to extend some work on multivariate
linear regression to account for significant between-group
correlated errors.  It looks to me like the lme routine in the
nlme package does most of what I want. (My congrats to the
developers of R and nlme, they are extremely useful!).

To summarize my questions: I've tested lme on a very simple test
case (with just 1 parameter, the mean, and 1 random effect), but
the results are somewhat different than I get from some simple
maximum-likelhood formulas.  It appears that some quantities
calculated by lme are corrected for the reduction in the degrees
of freedom due to the number of fit parameters.  But these
corrections are slightly different (0.3%-3%) than what I would
have expected, and I'd like to understand these differences
better.  The transcript of an R session documenting these issues
is at the end of this message.

--------------------------------------------

Details:

I've read relevant parts of Pinheiro and Bates's book and various
articles.  To test my understanding of various definitions, I
considered the simplest limit of fitting a 1-parameter model
(i.e., just the mean or intercept) with 1 level of grouping:

y_ij = beta + b_i + e_ij

where beta is the population mean, b_i is the random effect for
the i'th group, and e_ij is the random error of the j'th
observation from the i'th group.  I'm using the simple "Rail"
dataset of N=18 observations, consisting of 3 measurements of
ultrasound wave travel time for each of N_groups=6 rails, as used
in the examples on p. 5-11 of Pinheiro & Bates.  I am using the
method="ML" option in lme to use the Maximum Likelihood method.

In this simple limit, the expressions for the maximum likelihood
formulas (such as Eq. 2.6 of P&B) greatly simplify, as the
relevant variance-covariance matrices simplify to scalar
numbers.  One can then analytically calculate the
maximum-likelihood results for various quantities, getting the
standard-looking results that the ML estimate of the within-group
random error is

sigma^2 = sum(e_ij^2)/N       Eq. g.1

and the ML estimate of the between-group random error is:

sigma_b^2 = sum(b.hat_i^2)/N_groups       Eq. g.2

(where b.hat_i are the fitted estimates of b_i, and e_ij are the
residuals after fitting).

Note that these formulas are slightly different than the standard
unbiased least-squares results, which divide by the number of
degrees of freedom corrected for the number of fit parameters.
As usual, maximum likelihood ML estimates are slightly downward
biased because of this.  [As I understand it, REML is designed to
correct for this, and does so very well in balanced designs.  But
for the eventual fusion problem I'm interested in, the dataset is
very unbalanced and lme with method="REML" appears to give
excessively pessimistic error estimates, so I'm using ML instead,
which gives results consistent with a couple of other methods
I've used: cross-validation and a weighted jackknife.]

However, from the test I do with the Rail dataset below, it
appears that lme actually doesn't use the above formula for sigma
but instead includes some kind of correction for the reduced
degrees of freedom.  lme's result is very close to:

sigma^2 = sum(e_ij^2)/(N-N_groups)

lme's sigma is only 0.3% low compared to this, so I don't worry
about this much, but I'd still like to understand how lme is
calculating sigma and why it differs slightly even from this
result.

lme's calculation of sigma_b is very close to Eq. g.2 above (and
so doesn't appear to include a DF correction?), but is 0.5% high,
just enough different to be a little puzzling.

Does anyone know of a simple explanation of these small
differences?  I could "just read the source code" (I have glanced
at it), but I assume it is all cast in terms of the general
matrix formulation and the degrees-of-freedom (DF) correction is
buried somewhere hard to understand.  I haven't noticed a mention
in any of the documentation that some of the results from
lme(method="ML") attempt to correct for the O(p/n) bias of
standard maximum-likelihood estimates.  Does anyone know of any
documentation or references regarding this?

Finally, if the within-group rms error sigma and the
between-group rms error sigma_b have been estimated, then a
standard calculation seems to show that the uncertainty
sigma_beta in the estimated population mean beta is:

sigma_beta^2 = sigma^2/N + sigma_b^2/N_groups

The value of sigma_beta reported by lme is 3% higher than this
formula.  I could imagine that a DF correction to a ML estimate
gives small differences from this formula for some reason, but
I'd like to understand it better.   In the Rail data, the
uncertainty sigma_beta is dominated by the between-group error
sigma_b, so in the 95% confidence interval I might have thought
one would use a t-statistic with ~N_groups degrees of freedom,
but it appears that the 95% confidence interval caclulated by
interval() is using DF~=N-2.  Any comments?

Below is a transcript of an R session documenting the various
issues mentioned about.  Perhaps I'm just ignorant of some
standard corrections that experts assume every one knows, and I'd
appreciate any feedback.

Thanks,

Greg

Greg Hammett
Lecturer with rank of Professor, 
   Program in Plasma Physics, Princeton University
Principal Research Physicist, 
   Princeton Plasma Physics Laboratory

###################################################################

R : Copyright 2003, The R Development Core Team
Version 1.6.2  (2003-01-10)
> #
> # Test lme using Rail dataset from p.9-10 of Pinheiro and Bates Book.
> #
> 
> library(nlme)
Loading required package: nls 
Loading required package: lattice 
> data(Rail)
> # simple set of 18 data points:
> Rail
Grouped Data: travel ~ 1 | Rail
   Rail travel
1     1     55
2     1     53
3     1     54
4     2     26
5     2     37
6     2     32
7     3     78
8     3     91
9     3     85
10    4     92
11    4    100
12    4     96
13    5     49
14    5     51
15    5     50
16    6     80
17    6     85
18    6     83
> 
> fm1 <- lme(travel~1, data=Rail, random=~1|Rail, method="ML")
> summary(fm1)
Linear mixed-effects model fit by maximum likelihood
 Data: Rail 
       AIC      BIC    logLik
  134.5600 137.2312 -64.28002

Random effects:
 Formula: ~1 | Rail
        (Intercept) Residual
StdDev:    22.62435 4.020779

Fixed effects: travel ~ 1 
            Value Std.Error DF  t-value p-value
(Intercept)  66.5  9.554026 12 6.960416  <.0001

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-1.61098123 -0.28887045  0.03454167  0.21372781  1.62222280 

Number of Observations: 18
Number of Groups: 6 
> 
> # Agrees with book.  
> # lme's est of "within-group" sigma is labelled "Residual" 4.0208
> # lme's est of "between-group" sigma_b is StdDev of Intercept 22.624
> 
> 
> # This should be the Maximum Likelihood Estimate (which ignores the
> # reduction of the DF by the # of fit parameters) of the std. dev. of the
> # within-group residual: 
> (sum(fm1$residuals[,2]**2)/length(fm1$residuals[,2]) )**0.5
[1] 3.291492
> # but this gives only 3.29, compared to lme's sigma=4.0208.
> 
> # Apparently lme is including a correction for the number of Degrees of
> # Freedom, reduced from the # of observations by 6 (the # of groups):
> 
> (sum(fm1$residuals[,2]**2)/(length(fm1$residuals[,2])-6) )**0.5
[1] 4.031238
> # (although this is still 0.3% low...)
> 
> # This should be the ML est. of the std. dev. of the random effect b_i
> # coefficients:
> 
> (sum(ranef(fm1)**2)/length(ranef(fm1)[,]) )**0.5
[1] 22.50619
> # but it is 0.5% low from lme's sigma_b=22.62435.
> 
> # Random-effects Variance matrix is a single number in this case:
> VarCorr(fm1)
Rail = pdLogChol(1) 
            Variance  StdDev   
(Intercept) 511.86112 22.624348
Residual     16.16667  4.020779
> 
> # Taking lme's estimate of the within-group sigma and between-group
> # sigma_b, the uncertainty in the mean of the data should be:
> 
> (fm1$sigma^2/18+as.real(VarCorr(fm1)[1,1])/6)^0.5
[1] 9.284844
> # though this is 3% lower than lme's reported Std.Error of the intercept
> # of 9.554026:
> (fm1$sigma^2/18+as.real(VarCorr(fm1)[1,1])/6)^0.5/9.554026
[1] 0.9718253
> 
> intervals(fm1)
Approximate 95% confidence intervals

 Fixed effects:
               lower est.    upper
(Intercept) 46.27006 66.5 86.72994

 Random Effects:
  Level: Rail 
                   lower     est.    upper
sd((Intercept)) 12.77211 22.62435 40.07648

 Within-group standard error:
   lower     est.    upper 
2.695008 4.020779 5.998746 
> 
> # The 95% confidence interval for beta (the intercept), is very close to
> # what one expects with sigma_beta=9.554026, with a t value of 2.110 for
> # DF=N-1=17, though it is actually closer to the t value of 2.120 for
> # DF=16:
> (66.5-46.27006)/9.554026
[1] 2.117426
> qt(0.975,df=16)  # t-value for 95% confidence interval and df=16
[1] 2.119905
> qt(0.975,df=6)   # t-value for 95% confidence interval and df=6
[1] 2.446912
> 
> # However, since the uncertainty in beta is dominated by the
> # between-group errors, shouldn't the DF used for the t value
> # be close to the number of groups, 6, not the number of 
> # observations, 18?
>