[R] standardized residuals (random effects) using nlme and ranef

Mon Jul 31 18:53:13 CEST 2006

I have a cut-and-paste error in this message that I sent a few minutes
ago. I show the evaluation of rr as
> rr <- ranef(model.4b, postVar = TRUE)
when it was
> rr <- ranef(fm1, postVar = TRUE)

I also omitted the evaluation of rr1, which is

rr1 <- rr$class


On 7/31/06, Douglas Bates <bates at stat.wisc.edu> wrote:
> Thank you for providing the reproducible example and the explanation
> of what you are seeking to calculate.
>
> On 7/31/06, Dirk Enzmann <dirk.enzmann at uni-hamburg.de> wrote:
> > As suggested I try another post.
> >
> > First I give a reproducible example. The example data set has been
> > provided by I. Plewis (1997), Statistics in Education. London: Arnold (I
> > include the residuals obtained by MLWin):
> >
> > # ---------------------------------------------
> >
> > library(nlme)
> >
> > # Example data from Plewis
> >
> > BaseURL="http://www.ottersbek.de/"
> > ds32 = read.fwf(paste(BaseURL,'ds32.dat',sep=''),
> >                        widths=c(9,10,10,10,10,10),
> >                        col.names=c('class','pupil','zcurric',
> >                                    'curric','zmath1','math2'),
> >                        colClasses=c('factor','factor','numeric',
> >                                     'numeric','numeric','numeric'))
> >
> > # Random slopes model (p. 44):
> > model.4b = lme(math2~zcurric,random=~zcurric|class,method='ML',data=ds32)
> >
> > # Random effects (intercept and slope residuals) of level 1 (class):
> > RandEff = ranef(model.4b,level=1)
> >
> > # Results of MLWin (c300 = intercept residuals of level "class",
> > #                   c301 = slope residuals of level "class",
> > #                   c304 = standardized intercept residuals,
> > #                   c305 = standardized slope residuals):
> > MLWinRes = read.fwf(paste(BaseURL,'resid_ex32.txt',sep=''),
> >                      widths=c(15,15,15,15),
> >                      col.names=c('c300','c301','c304','c305'),
> >                      colClasses=c('numeric','numeric','numeric','numeric'))
> >
> > # They are identical to the results of MLWin (c300, c301):
> > round(cbind(RandEff,MLWinRes[,1:2]),5)
> >
> > # However, using option "standard" the results differ considerably:
> > round(cbind(ranef(model.4b,level=1,standard=T),MLWinRes[,3:4]),5)
>
> Your summation below of the cause of this inconsistency is correct.
> As Harold explains the "standardized" random effects returned by ranef
> in the nlme package are the "estimates" (actually the "predictors") of
> the random effects divided by the estimated standard deviation of
> those random effects, not by the estimated standard error as stated in
> the documentation.  I will correct the documentation.
>
> That is, "standardized random effects" in the nlme package are
> produced by dividing all the intercept random effects by the same
> number (2.9723) and all the slope random effects by 2.0014.
> > rr1 <- ranef(model.4b)
> > rr2 <- ranef(model.4b, standard = TRUE)
> > rr1/rr2
>           (Intercept)  zcurric
>         8    2.972373 2.001491
>        12    2.972373 2.001491
>        17    2.972373 2.001491
>        22    2.972373 2.001491
>        23    2.972373 2.001491
>        28    2.972373 2.001491
>        29    2.972373 2.001491
>        30    2.972373 2.001491
>        31    2.972373 2.001491
>        32    2.972373 2.001491
>        33    2.972373 2.001491
>        34    2.972373 2.001491
>        35    2.972373 2.001491
>        36    2.972373 2.001491
>        37    2.972373 2.001491
>        38    2.972373 2.001491
>        39    2.972373 2.001491
>        41    2.972373 2.001491
>        42    2.972373 2.001491
>        43    2.972373 2.001491
>        44    2.972373 2.001491
>        45    2.972373 2.001491
>        46    2.972373 2.001491
>        47    2.972373 2.001491
>
> Another way of defining a standardization would be to use what Harold
> Doran and others call the "posterior variance-covariance" of the
> random effects.  On reflection I think I would prefer the term
> "conditional variance" but I use "posterior" below.
>
> Strictly speaking the random effects are not parameters in the model -
> they are unobserved random variables.  Conditional on the values of
> the parameters in the model and on the observed data, the random
> effects are normally distributed with a mean and variance-covariance
> that can be calculated.
>
> Influenced by Harold I added an optional argument "postVar" to the
> ranef extractor function for lmer models (but apparently I failed to
> document it - I will correct that).  If you fit your model with lmer,
> as you do below, you can standardize the  random effects according to
> the conditional variances by
>
> > fm1 <- lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML')
> > rr <- ranef(model.4b, postVar = TRUE)
> > str(rr)  # shows the structure of the object rr
> Formal class 'ranef.lmer' [package "Matrix"] with 1 slots
>   ..@ .Data:List of 1
>   .. ..$ :`data.frame': 24 obs. of  2 variables:
>   .. .. ..$ (Intercept): num [1:24]  0.860 -1.962 -1.105  4.631 -0.781 ...
>   .. .. ..$ zcurric    : num [1:24]  0.2748 -1.3194  0.0841  0.5958  0.8080 ...
>   .. .. ..- attr(*, "postVar")= num [1:2, 1:2, 1:24]  2.562 -0.585
> -0.585  1.837  3.027 ...
>
> The two sets of conditional standard deviations are
>
> > sqrt(attr(rr1, "postVar")[1,1,])
>  [1] 1.600589 1.739768 1.466261 1.542172 1.942629 1.506112 1.892815 1.665414
>  [9] 1.377917 1.574244 1.755679 2.039292 1.887651 1.451916 1.732030 1.659786
> [17] 1.987397 1.795273 1.542049 1.784441 1.629663 1.753223 1.585520 1.470161
> > sqrt(attr(rr1, "postVar")[2,2,])
>  [1] 1.355537 1.630075 1.506560 1.378282 1.790906 1.310406 1.341185 1.512708
>  [9] 1.397492 1.808535 0.859909 1.867273 1.626857 1.669984 1.520560 1.489380
> [17] 1.989123 1.326610 1.613399 1.074365 1.764128 1.455202 1.489311 1.942117
>
> However, as you have seen, standardizing the conditional means by
> these conditional standard deviations still does not reproduce the
> results from MLWin.
>
> If you can determine what is being calculated by MLWin or if you can
> determine that there is a bug in the lmer calculations I would be
> delighted to hear about it.
>
> Thanks again for a very thorough report and for the reproducible example.
>
> > # ---------------------------------------------
> >
> >  From the RSiteSearch("MLWin") there are two hits that seem to be
> > relevant: One of Douglas Bates
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/53097.html
> >
> > where he explains why he regards getting standard errors of the
> > estimates of variance components as not being important. I think (but
> > I'm not sure) that this implies that I should not use standardized
> > residuals, as well.
> >
> > Secondly a post of Roel de Jong
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/62280.html
> >
> > However, I can't figure out how I could make use of his suggestion to
> > obtain the standardized residuals I am looking for.
> >
> > A part of the answer has been given in an answer to my previous post by
> > Harold Doran. He showed that the so called "standardized residuals"
> > obtained by ranef() using the option "standard=T" are the residuals
> > devided by the standard deviation of the random effects, not divided by
> > the "corresponding standard error" as stated in the ranef() help - if I
> > understood him correctly.
> >
> > In the multilevel list he also showed how to create a caterpillar plot
> > using lmer() and the errbar() function of Hmisc (I modified his example
> > to adapt it to my data):
> >
> > # ------------------------------------
> >
> > detach(package:nlme)
> > library(Matrix)
> > library(Hmisc)
> >
> > # Fit a sample model:
> > fm1 = lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML',
> > control=list(gradient=FALSE, niterEM=0))
> >
> > .Call("mer_gradComp", fm1, PACKAGE = "Matrix")
> >
> > # extract random effects:
> > fm1.blup = ranef(fm1)$class
> >
> > #extract variances and multiple by scale factor:
> > fm1.post.var = fm1 at bVar$class * attr(VarCorr(fm1),"sc")**2
> >
> > # posterior variance of slope on fm1:
> > fm1.post.slo = sqrt(fm1.post.var[2,2,])
> >
> > # Slopes:
> > x = fm1.blup[,2]+outer(fm1.post.slo, c(-1.96,0,1.96))
> >
> > # This code creates a catepillar plot using the errbar function:
> > x <- x[order(x[,2]) , ]
> > print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Slopes', xlab='Classes'))
> > abline(h=0)
> >
> > # ------------------------------------
> >
> > Is it correct that I can obtain a respecive plot for the intercepts in a
> > similar way?
> >
> > # ------------------------------------
> >
> > # posterior variance of intercept on fm1.post.int:
> > fm1.post.int = sqrt(fm1.post.var[1,1,])
> >
> > # Intercepts:
> > x = fm1.blup[,1]+outer(fm1.post.int, c(-1.96,0,1.96))
> >
> > # This code creates a catepillar plot using the errbar function
> > x <- x[order(x[,2]) , ]
> > print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Intercepts',
> > xlab='Classes'))
> > abline(h=0)
> >
> > # ------------------------------------
> >
> > To sum up, I can't figure out how MLWin calculates the standardized
> > residuals. But I understand that this is not a question for the R list.
> > Nevertheless, it would help if someone could point me to some arguments
> > why not to use them and stick to the results obtainable by ranef().
> >
> > Spencer Graves wrote:
> >
> > >       Have you tried RSiteSearch("MLWin")?  I just got 29 hits.  I
> > > wonder if any one of these might relate to your question?
> > >
> > >       If you would like more help on this issue from this listserve,
> > > please submit another post, preferably illustrating your question with
> > > the simplest possible self-contained example that illustrates your
> > > question, perhaps like the following:
> > >
> > >       fm1.16 <- lme(distance~age, data=Orthodont[1:16,],
> > >            random=~age|Subject)
> > >
> > >       Hope this helps.
> > >       Spencer Graves
> > > p.s.  PLEASE do read the posting guide
> > > "www.R-project.org/posting-guide.html" and provide commented, minimal,
> > > self-contained, reproducible code.
> > >
> > > Dirk Enzmann wrote:
> > >
> > >> Using ranef() (package nlme, version 3.1-75) with an 'lme' object I
> > >> can obtain random effects for intercept and slope of a certain level
> > >> (say: 1) - this corresponds to (say level 1) "residuals" in MLWin.
> > >> Maybe I'm mistaken here, but the results are identical.
> > >>
> > >> However, if I try to get the standardized random effects adding the
> > >> paramter "standard=T" to the specification of ranef(), the results
> > >> differ considerably from the results of MLWin (although MLWin defines
> > >> "standardized" in the same way as "divided by its estimated
> > >> (diagnostic) standard error").
> > >>
> > >> Why do the results differ although the estimates (random effects and
> > >> thus their variances) are almost identical? I noticed that lme() does
> > >> not compute the standard errors of the variances of the random effects
> > >> - for several reasons, but if this is true, how does ranef() calculate
> > >> the standardized random effects (the help says: '"standardized" (i.e.
> > >> divided by the corresponding estimated standard error)').
> > >>
> > >> Is there a way to obtain similar results as in MLWin (or: should I
> > >> prefer the results of ranef() for certain reasons)?
> > >>
> > >> Dirk
> > >>
> > >> -----------------------------
> > >> R version: 2.3.1 Patched (2006-06-21 r38367)
> >
> > --
> > *************************************************
> > Dr. Dirk Enzmann
> > Institute of Criminal Sciences
> > Dept. of Criminology
> > Edmund-Siemers-Allee 1
> > D-20146 Hamburg
> > Germany
> >
> > phone: +49-(0)40-42838.7498 (office)
> >         +49-(0)40-42838.4591 (Billon)
> > fax:   +49-(0)40-42838.2344
> > email: dirk.enzmann at uni-hamburg.de
> > www:
> > http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>