[R-sig-ME] correlation of random intercepts and slopes

Mon Jun 20 21:27:09 CEST 2011

The seemingly high correlation from lmer is between the fixed effects of slope and intercept rather than a correlation between the random effects associated with slope and intercept.

You'll see this in a simple regression with the correlation between slope and intercept being

           - xbar * sqrt(n/sum(x_i^2))

even when one isn't fitting a random coefficients model (Draper and Smith, Applied Regression Analysis, 3rd ed., page 129)

Jim

Jim Baldwin
Station Statistician
Pacific Southwest Research Station
USDA Forest Service
Albany, CA  94710

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Titus von der Malsburg
Sent: Monday, June 20, 2011 11:56 AM
To: R-sig-mixed-models at r-project.org
Subject: [R-sig-ME] correlation of random intercepts and slopes

Dear list!

I have a situation where lmer reports much higher correlations of random slopes and intercepts than I would expect.  I generated fake data for a fictitious experiment using the function new.df shown below.  The experiment has a within-factor 'cond' with levels 1 and 2, a set of items and subjects which are both measured repeatedly.  There are random intercepts and slopes for items and subjects.  On top there's sampling error.  The dependent measure 'rt' is some kind of reaction time.  Example:

  > d <- new.df(cond1.rt=600, effect.size=10, sdev=10,
                nsubj=49, sdev.int.subj=10, sdev.slp.subj=10,
                nitems=16, sdev.int.items=10,
                sdev.slp.items=10)

cond1.rt is the average reaction time in condition 1.
cond1.rt+effect.size is the rt in condition 2, sdev the sampling error, the rest concerns number of subjects and items, and the sdevs of their random intercepts and slopes.
This is the head of the data frame:

  > head(d)
    subj item cond       rt re.int.subj re.slp.subj re.int.item re.slp.item
  1    1    1    1 628.4764    12.15373    4.138659   -0.932447    6.619795
  2    1    1    2 603.1140    12.15373   -4.138659   -0.932447   -6.619795
  3    1    2    1 617.8467    12.15373    4.138659   -2.980553    2.471397
  4    1    2    2 606.5777    12.15373   -4.138659   -2.980553   -2.471397
  5    1    3    1 622.8397    12.15373    4.138659   -2.133049    5.101761
  6    1    3    2 616.4011    12.15373   -4.138659   -2.133049   -5.101761

The random intercepts and slopes have small correlations:

  > with(subset(d, cond==1), cor(re.int.subj, re.slp.subj))
  [1] 0.2650591
  > with(subset(d, cond==1), cor(re.int.item, re.slp.item))
  [1] -0.3144838

But when I roll an lmer model, I see much higher correlations of slopes and intercepts:

  > lmer(rt~cond+(cond|item)+(cond|subj), d)
  Linear mixed model fit by REML
  Formula: rt ~ cond + (cond | item) + (cond | subj)
     Data: d
     AIC   BIC logLik deviance REMLdev
   12050 12098  -6016    12040   12032
  Random effects:
   Groups   Name        Variance Std.Dev. Corr
   subj     (Intercept) 349.735  18.7012
            cond         82.905   9.1052  -0.839
   item     (Intercept) 201.250  14.1863
            cond         81.835   9.0463  -0.732
   Residual              98.758   9.9377
  Number of obs: 1568, groups: subj, 49; item, 16

  Fixed effects:
              Estimate Std. Error t value
  (Intercept)  594.053      4.511  131.70
  cond           8.120      2.657    3.06

  Correlation of Fixed Effects:
       (Intr)
  cond -0.765

When I xyplot the data, I can't see any correlation of slopes and intercepts.  Could anybody please enlighten me as to why lmer reports these high correlations?

Many thanks!

  Titus

  new.df <- function(cond1.rt=600, effect.size=10, sdev=10,
                     sdev.int.subj=10, sdev.slp.subj=10, nsubj=10,
                     sdev.int.items=10, sdev.slp.items=10, nitems=10) {

    ncond <- 2

    subj <- rep(1:nsubj, each=nitems*ncond)
    item <- rep(1:nitems, nsubj, each=ncond)
    cond <- rep(0:1, nsubj*nitems)
    err  <- rnorm(nsubj*nitems*ncond, 0, sdev)
    d    <- data.frame(subj=subj, item=item, cond=cond+1, err=err)

    # Adding random intercepts and slopes for subjects:
    re.int.subj   <- rnorm(nsubj, 0, sdev.int.subj)
    d$re.int.subj <- rep(re.int.subj, each=nitems*ncond)
    re.slp.subj   <- rnorm(nsubj, 0, sdev.slp.subj)
    d$re.slp.subj <- rep(re.slp.subj, each=nitems*ncond) * (cond - 0.5)

    # Adding random intercepts and slopes for items:
    re.int.item   <- rnorm(nitems, 0, sdev.int.items)
    d$re.int.item <- rep(re.int.item, nsubj, each=ncond)
    re.slp.item   <- rnorm(nitems, 0, sdev.int.items)
    d$re.slp.item <- rep(re.slp.item, nsubj, each=ncond) * (cond - 0.5)

    d$rt <- (cond1.rt + cond*effect.size
             + d$re.int.subj + d$re.slp.subj
             + d$re.int.item + d$re.slp.item
             + d$err)

    d
  }

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models