[R] Question to NLME, ML vs. REML

Sun Sep 5 18:29:15 CEST 2004

    Have you read Pinheiro and Bates (2000) Mixed-Effects Models in S 
and S-Plus (Springer)?  I learned a lot from this book and recommend it 
very highly. 

      Probably the most obvious difference is that ML estimates of 
variance components are biased.  The bias is equivalent to estimating 
the variance of a population by dividing the sum of squared deviations 
from the sample mean of N observations by N rather than N-1. 

      To check for other differences, I just did library(mle4) in R 
1.9.1 and ran the example in help("lme") then ran the same lme call with 
method="ML".  The results (see below) show that the parameter estimates 
for the fixed effects are the same for ML as for REML, but the estimates 
of standard errors of those parameters seems to perpetuate the bias 
corrected by REML. 

      If stepAIC works with ML but not with REML, I would use stepAIC 
with ML then reestimate the final model with REML.  However, I would 
also do some simulations to estimate the false positive (Type I) error 
rate.  AIC includes a penalty for the complexity of the model but NOT 
for the number of alternatives considered, and if you present enough 
random alternatives to the procedure, it will find some that are 
statistically significant.  The nlme package contains a function 
"simulate.lme" to facilitate this. 

      hope this helps.  spencer graves

 > library(lme4)
 >      data(bdf)
 >      fm <- lme(langPOST ~ IQ.ver.cen + avg.IQ.ver.cen, data = bdf,
+                random = ~ IQ.ver.cen | schoolNR)
 >      summary(fm)
Linear mixed-effects model fit by REML
Fixed: langPOST ~ IQ.ver.cen + avg.IQ.ver.cen
 Data: bdf
      AIC      BIC    logLik
 15231.87 15272.02 -7608.937

Random effects:
 Groups   Name        Variance Std.Dev. Corr  
 schoolNR (Intercept)  8.07563 2.84177        
          IQ.ver.cen   0.20801 0.45608  -0.642
 Residual             41.34968 6.43037        
# of obs: 2287, groups: schoolNR, 131

Fixed effects:
                 Estimate Std. Error   DF t value  Pr(>|t|)   
(Intercept)    4.0750e+01 2.8808e-01 2284 141.452 < 2.2e-16 ***
IQ.ver.cen     2.4598e+00 8.3638e-02 2284  29.410 < 2.2e-16 ***
avg.IQ.ver.cen 1.4089e+00 3.2374e-01 2284   4.352 1.409e-05 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Correlation of Fixed Effects:
            (Intr) IQ.vr.
IQ.ver.cen  -0.274      
avg.IQ.vr.c  0.029 -0.213
 >      fm.ml <- lme(langPOST ~ IQ.ver.cen + avg.IQ.ver.cen, data = bdf,
+                random = ~ IQ.ver.cen | schoolNR, method="ML")
 >      summary(fm.ml)
Linear mixed-effects model fit by maximum likelihood
Fixed: langPOST ~ IQ.ver.cen + avg.IQ.ver.cen
 Data: bdf
      AIC      BIC    logLik
 15227.53 15267.68 -7606.767

Random effects:
 Groups   Name        Variance Std.Dev. Corr  
 schoolNR (Intercept)  7.91904 2.81408        
          IQ.ver.cen   0.20006 0.44728  -0.652
 Residual             41.35055 6.43044        
# of obs: 2287, groups: schoolNR, 131

Fixed effects:
                 Estimate Std. Error   DF  t value  Pr(>|t|)   
(Intercept)    4.0750e+01 2.8610e-01 2284 142.4342 < 2.2e-16 ***
IQ.ver.cen     2.4589e+00 8.3237e-02 2284  29.5406 < 2.2e-16 ***
avg.IQ.ver.cen 1.4052e+00 3.2168e-01 2284   4.3683 1.308e-05 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Correlation of Fixed Effects:
            (Intr) IQ.vr.
IQ.ver.cen  -0.274      
avg.IQ.vr.c  0.028 -0.214
#######################
Klaus Thul wrote:

>Dear all,
>
>I am planning to use nlme library for analysis of experiments in semiconductor 
>industry. Currently I am using "lm" but plan to move to "lme" to handle 
>within wafer / wafer-to-wafer and lot-to-lot variation correctly. 
>
>So far everything is working well, but I have a fundamentel question: 
>
>NLME offers "maximum likelihood" and "restricted maximum likelihood" for 
>parameter estimation. ML has the advantage, that likelihood ratios can be 
>computed even with changes in model structure. In addition, ML works with the 
>stepAIC function from MASS-library which I am currently using for 
>model-building.
>
>I am wondering, why REML is the default setting in NLME and therefore somehow 
>preferred by the authors. What is the main reason to use REML? 
>
>Maybe I am lacking here statistical knowledge. Any hint, including reference 
>to literature would be very helpful.
>
>Best regards,
>Klaus Thul
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>  
>

-- 
Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567