[R] standardized residuals (random effects) using nlme and ranef
Douglas Bates
bates at stat.wisc.edu
Mon Jul 31 18:53:13 CEST 2006
I have a cut-and-paste error in this message that I sent a few minutes
ago. I show the evaluation of rr as
> rr <- ranef(model.4b, postVar = TRUE)
when it was
> rr <- ranef(fm1, postVar = TRUE)
I also omitted the evaluation of rr1, which is
rr1 <- rr$class
On 7/31/06, Douglas Bates <bates at stat.wisc.edu> wrote:
> Thank you for providing the reproducible example and the explanation
> of what you are seeking to calculate.
>
> On 7/31/06, Dirk Enzmann <dirk.enzmann at uni-hamburg.de> wrote:
> > As suggested I try another post.
> >
> > First I give a reproducible example. The example data set has been
> > provided by I. Plewis (1997), Statistics in Education. London: Arnold (I
> > include the residuals obtained by MLWin):
> >
> > # ---------------------------------------------
> >
> > library(nlme)
> >
> > # Example data from Plewis
> >
> > BaseURL="http://www.ottersbek.de/"
> > ds32 = read.fwf(paste(BaseURL,'ds32.dat',sep=''),
> > widths=c(9,10,10,10,10,10),
> > col.names=c('class','pupil','zcurric',
> > 'curric','zmath1','math2'),
> > colClasses=c('factor','factor','numeric',
> > 'numeric','numeric','numeric'))
> >
> > # Random slopes model (p. 44):
> > model.4b = lme(math2~zcurric,random=~zcurric|class,method='ML',data=ds32)
> >
> > # Random effects (intercept and slope residuals) of level 1 (class):
> > RandEff = ranef(model.4b,level=1)
> >
> > # Results of MLWin (c300 = intercept residuals of level "class",
> > # c301 = slope residuals of level "class",
> > # c304 = standardized intercept residuals,
> > # c305 = standardized slope residuals):
> > MLWinRes = read.fwf(paste(BaseURL,'resid_ex32.txt',sep=''),
> > widths=c(15,15,15,15),
> > col.names=c('c300','c301','c304','c305'),
> > colClasses=c('numeric','numeric','numeric','numeric'))
> >
> > # They are identical to the results of MLWin (c300, c301):
> > round(cbind(RandEff,MLWinRes[,1:2]),5)
> >
> > # However, using option "standard" the results differ considerably:
> > round(cbind(ranef(model.4b,level=1,standard=T),MLWinRes[,3:4]),5)
>
> Your summation below of the cause of this inconsistency is correct.
> As Harold explains the "standardized" random effects returned by ranef
> in the nlme package are the "estimates" (actually the "predictors") of
> the random effects divided by the estimated standard deviation of
> those random effects, not by the estimated standard error as stated in
> the documentation. I will correct the documentation.
>
> That is, "standardized random effects" in the nlme package are
> produced by dividing all the intercept random effects by the same
> number (2.9723) and all the slope random effects by 2.0014.
> > rr1 <- ranef(model.4b)
> > rr2 <- ranef(model.4b, standard = TRUE)
> > rr1/rr2
> (Intercept) zcurric
> 8 2.972373 2.001491
> 12 2.972373 2.001491
> 17 2.972373 2.001491
> 22 2.972373 2.001491
> 23 2.972373 2.001491
> 28 2.972373 2.001491
> 29 2.972373 2.001491
> 30 2.972373 2.001491
> 31 2.972373 2.001491
> 32 2.972373 2.001491
> 33 2.972373 2.001491
> 34 2.972373 2.001491
> 35 2.972373 2.001491
> 36 2.972373 2.001491
> 37 2.972373 2.001491
> 38 2.972373 2.001491
> 39 2.972373 2.001491
> 41 2.972373 2.001491
> 42 2.972373 2.001491
> 43 2.972373 2.001491
> 44 2.972373 2.001491
> 45 2.972373 2.001491
> 46 2.972373 2.001491
> 47 2.972373 2.001491
>
> Another way of defining a standardization would be to use what Harold
> Doran and others call the "posterior variance-covariance" of the
> random effects. On reflection I think I would prefer the term
> "conditional variance" but I use "posterior" below.
>
> Strictly speaking the random effects are not parameters in the model -
> they are unobserved random variables. Conditional on the values of
> the parameters in the model and on the observed data, the random
> effects are normally distributed with a mean and variance-covariance
> that can be calculated.
>
> Influenced by Harold I added an optional argument "postVar" to the
> ranef extractor function for lmer models (but apparently I failed to
> document it - I will correct that). If you fit your model with lmer,
> as you do below, you can standardize the random effects according to
> the conditional variances by
>
> > fm1 <- lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML')
> > rr <- ranef(model.4b, postVar = TRUE)
> > str(rr) # shows the structure of the object rr
> Formal class 'ranef.lmer' [package "Matrix"] with 1 slots
> ..@ .Data:List of 1
> .. ..$ :`data.frame': 24 obs. of 2 variables:
> .. .. ..$ (Intercept): num [1:24] 0.860 -1.962 -1.105 4.631 -0.781 ...
> .. .. ..$ zcurric : num [1:24] 0.2748 -1.3194 0.0841 0.5958 0.8080 ...
> .. .. ..- attr(*, "postVar")= num [1:2, 1:2, 1:24] 2.562 -0.585
> -0.585 1.837 3.027 ...
>
> The two sets of conditional standard deviations are
>
> > sqrt(attr(rr1, "postVar")[1,1,])
> [1] 1.600589 1.739768 1.466261 1.542172 1.942629 1.506112 1.892815 1.665414
> [9] 1.377917 1.574244 1.755679 2.039292 1.887651 1.451916 1.732030 1.659786
> [17] 1.987397 1.795273 1.542049 1.784441 1.629663 1.753223 1.585520 1.470161
> > sqrt(attr(rr1, "postVar")[2,2,])
> [1] 1.355537 1.630075 1.506560 1.378282 1.790906 1.310406 1.341185 1.512708
> [9] 1.397492 1.808535 0.859909 1.867273 1.626857 1.669984 1.520560 1.489380
> [17] 1.989123 1.326610 1.613399 1.074365 1.764128 1.455202 1.489311 1.942117
>
> However, as you have seen, standardizing the conditional means by
> these conditional standard deviations still does not reproduce the
> results from MLWin.
>
> If you can determine what is being calculated by MLWin or if you can
> determine that there is a bug in the lmer calculations I would be
> delighted to hear about it.
>
> Thanks again for a very thorough report and for the reproducible example.
>
> > # ---------------------------------------------
> >
> > From the RSiteSearch("MLWin") there are two hits that seem to be
> > relevant: One of Douglas Bates
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/53097.html
> >
> > where he explains why he regards getting standard errors of the
> > estimates of variance components as not being important. I think (but
> > I'm not sure) that this implies that I should not use standardized
> > residuals, as well.
> >
> > Secondly a post of Roel de Jong
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/62280.html
> >
> > However, I can't figure out how I could make use of his suggestion to
> > obtain the standardized residuals I am looking for.
> >
> > A part of the answer has been given in an answer to my previous post by
> > Harold Doran. He showed that the so called "standardized residuals"
> > obtained by ranef() using the option "standard=T" are the residuals
> > devided by the standard deviation of the random effects, not divided by
> > the "corresponding standard error" as stated in the ranef() help - if I
> > understood him correctly.
> >
> > In the multilevel list he also showed how to create a caterpillar plot
> > using lmer() and the errbar() function of Hmisc (I modified his example
> > to adapt it to my data):
> >
> > # ------------------------------------
> >
> > detach(package:nlme)
> > library(Matrix)
> > library(Hmisc)
> >
> > # Fit a sample model:
> > fm1 = lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML',
> > control=list(gradient=FALSE, niterEM=0))
> >
> > .Call("mer_gradComp", fm1, PACKAGE = "Matrix")
> >
> > # extract random effects:
> > fm1.blup = ranef(fm1)$class
> >
> > #extract variances and multiple by scale factor:
> > fm1.post.var = fm1 at bVar$class * attr(VarCorr(fm1),"sc")**2
> >
> > # posterior variance of slope on fm1:
> > fm1.post.slo = sqrt(fm1.post.var[2,2,])
> >
> > # Slopes:
> > x = fm1.blup[,2]+outer(fm1.post.slo, c(-1.96,0,1.96))
> >
> > # This code creates a catepillar plot using the errbar function:
> > x <- x[order(x[,2]) , ]
> > print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Slopes', xlab='Classes'))
> > abline(h=0)
> >
> > # ------------------------------------
> >
> > Is it correct that I can obtain a respecive plot for the intercepts in a
> > similar way?
> >
> > # ------------------------------------
> >
> > # posterior variance of intercept on fm1.post.int:
> > fm1.post.int = sqrt(fm1.post.var[1,1,])
> >
> > # Intercepts:
> > x = fm1.blup[,1]+outer(fm1.post.int, c(-1.96,0,1.96))
> >
> > # This code creates a catepillar plot using the errbar function
> > x <- x[order(x[,2]) , ]
> > print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Intercepts',
> > xlab='Classes'))
> > abline(h=0)
> >
> > # ------------------------------------
> >
> > To sum up, I can't figure out how MLWin calculates the standardized
> > residuals. But I understand that this is not a question for the R list.
> > Nevertheless, it would help if someone could point me to some arguments
> > why not to use them and stick to the results obtainable by ranef().
> >
> > Spencer Graves wrote:
> >
> > > Have you tried RSiteSearch("MLWin")? I just got 29 hits. I
> > > wonder if any one of these might relate to your question?
> > >
> > > If you would like more help on this issue from this listserve,
> > > please submit another post, preferably illustrating your question with
> > > the simplest possible self-contained example that illustrates your
> > > question, perhaps like the following:
> > >
> > > fm1.16 <- lme(distance~age, data=Orthodont[1:16,],
> > > random=~age|Subject)
> > >
> > > Hope this helps.
> > > Spencer Graves
> > > p.s. PLEASE do read the posting guide
> > > "www.R-project.org/posting-guide.html" and provide commented, minimal,
> > > self-contained, reproducible code.
> > >
> > > Dirk Enzmann wrote:
> > >
> > >> Using ranef() (package nlme, version 3.1-75) with an 'lme' object I
> > >> can obtain random effects for intercept and slope of a certain level
> > >> (say: 1) - this corresponds to (say level 1) "residuals" in MLWin.
> > >> Maybe I'm mistaken here, but the results are identical.
> > >>
> > >> However, if I try to get the standardized random effects adding the
> > >> paramter "standard=T" to the specification of ranef(), the results
> > >> differ considerably from the results of MLWin (although MLWin defines
> > >> "standardized" in the same way as "divided by its estimated
> > >> (diagnostic) standard error").
> > >>
> > >> Why do the results differ although the estimates (random effects and
> > >> thus their variances) are almost identical? I noticed that lme() does
> > >> not compute the standard errors of the variances of the random effects
> > >> - for several reasons, but if this is true, how does ranef() calculate
> > >> the standardized random effects (the help says: '"standardized" (i.e.
> > >> divided by the corresponding estimated standard error)').
> > >>
> > >> Is there a way to obtain similar results as in MLWin (or: should I
> > >> prefer the results of ranef() for certain reasons)?
> > >>
> > >> Dirk
> > >>
> > >> -----------------------------
> > >> R version: 2.3.1 Patched (2006-06-21 r38367)
> >
> > --
> > *************************************************
> > Dr. Dirk Enzmann
> > Institute of Criminal Sciences
> > Dept. of Criminology
> > Edmund-Siemers-Allee 1
> > D-20146 Hamburg
> > Germany
> >
> > phone: +49-(0)40-42838.7498 (office)
> > +49-(0)40-42838.4591 (Billon)
> > fax: +49-(0)40-42838.2344
> > email: dirk.enzmann at uni-hamburg.de
> > www:
> > http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
More information about the R-help
mailing list