[R-sig-ME] Random effects correlations and factor reference levels
Christopher Knight
Chris.Knight at manchester.ac.uk
Tue Oct 13 15:57:32 CEST 2015
I am fitting mixed effects models where there is a random effect which acts differently on each level of a factor, much as in the Machines example from the nlme package:
> library(nlme)
> M1 <- lme(fixed = score ~ Machine, random = ~ Machine | Worker, data = Machines)
> VarCorr(M1)
Worker = pdLogChol(Machine)
            Variance   StdDev    Corr
(Intercept) 16.6405306 4.0792806 (Intr) MachnB
MachineB    34.5466908 5.8776433  0.484
MachineC    13.6150244 3.6898543 -0.365  0.297
Residual     0.9246296 0.9615766
I was interpreting these correlations to mean that the workers' estimated effects on Machine B (or more precisely their BLUPs, interpreted as something like their deviations from the average score on Machine B) are weakly positively correlated with their effects on Machine C (r = 0.297). Because Machine A is the reference level of the factor 'Machine' (with treatment contrasts), I was also interpreting the '(Intercept)' to represent Machine A, i.e. that the workers' effects on Machine A are somewhat positively correlated with their effects on Machine B (r = 0.484), but negatively correlated with their effects on Machine C (r = -0.365).
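For concreteness, one can correlate the worker-level predictions directly (a rough check only, since BLUPs are shrunken and their sample correlations need not equal the estimated parameter correlations); the implied per-machine effects below should not depend on the choice of reference level:

```r
library(nlme)

M1 <- lme(score ~ Machine, random = ~ Machine | Worker, data = Machines)

## BLUPs for each worker; with treatment contrasts the columns are
## (Intercept), MachineB and MachineC
re <- ranef(M1)

## implied worker-level effects on each machine
effA <- re[["(Intercept)"]]
effB <- effA + re[["MachineB"]]
effC <- effA + re[["MachineC"]]

cor(cbind(A = effA, B = effB, C = effC))
```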
If this were true, I would expect these conclusions to be robust to exactly which machine is the reference level in the contrasts for the Machine factor, but apparently they are not:
> Machines$Machine <- relevel(Machines$Machine, ref = "B")
> M2 <- update(M1)
> VarCorr(M2)
Worker = pdLogChol(Machine)
            Variance   StdDev    Corr
(Intercept) 74.3956398 8.6252907 (Intr) MachnA
MachineA    34.5466902 5.8776433 -0.910
MachineC    35.2950235 5.9409615 -0.882  0.805
Residual     0.9246296 0.9615766
What is going on here/what am I misinterpreting?
As expected, the two models appear to be equivalent by anova(M1, M2). Yet the correlation among workers between machines A and B appears strongly negative in the second model (r = -0.910), where it seemed moderately positive in the first (r = 0.484), and intervals() on the two models suggests no overlap of the (admittedly pretty broad) confidence intervals in this case. Add to that the differences in the estimated variance components/standard deviations: each model has an identical residual variance, but very different individual values (that for Machine C going from 13.6 to 35.3) and totals. At the same time, the estimate of the variance among workers on Machine B in model M1 (34.5466908) is suspiciously similar (albeit not quite identical) to the estimate of the variance among workers on Machine A in model M2 (34.5466902).
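If the '(Intercept)' row really does represent the worker effect on the reference machine, then a worker's effect on Machine B in M1 is (Intercept) + MachineB, and its variance can be reconstructed from the printed M1 output (a back-of-the-envelope check using the rounded values above):

```r
## values from VarCorr(M1), where Machine A is the reference level
vI  <- 16.6405306           # Var(Intercept)
vB  <- 34.5466908           # Var(MachineB)
rIB <- 0.484                # Corr(Intercept, MachineB)

## Var(effect on Machine B) = Var(I) + Var(B) + 2 * Cov(I, B)
vI + vB + 2 * rIB * sqrt(vI) * sqrt(vB)
## ~74.40
```

That is essentially the (Intercept) variance reported for M2 (74.3956398), which makes me wonder whether the two outputs are just different parameterisations of the same fitted covariance, and hence what the printed correlations actually refer to.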
The values are pretty much the same fitting with lmer (though in practice I need the correlation and weights arguments available with lme), which makes it feel like a problem with my interpretation. Or is it really just a case of too little data to resolve such subtle parameters clearly?
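For reference, the equivalent lme4 fit (a sketch only; lme4's random-effects syntax differs from lme's) is:

```r
library(lme4)
data(Machines, package = "nlme")

## same model: by-worker random intercept and Machine effects
M1_lmer <- lmer(score ~ Machine + (Machine | Worker), data = Machines)
VarCorr(M1_lmer)
```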
Dropping the fixed effect doesn’t make things much better and, on the models I am running in anger, there is more data and the differences are even more pronounced. It looks a bit like this issue: http://stats.stackexchange.com/questions/82102/changing-reference-level-for-contrasts-changes-results-in-r-3-0-2-lme4-1-1-2-vs
which was framed in terms of versions but, given that I get very similar issues using nlme and lme4, it doesn’t seem likely to be a version issue here (albeit I don’t have old versions immediately to hand to test this on).
Any insights much appreciated,
Chris
running nlme_3.1-122 in R 3.2.2 on OS X 10.10.5
------------------------------------------------------------------------
Dr Christopher Knight
Lecturer, Faculty of Life Sciences
The University of Manchester
Michael Smith Building, room B.2012, Oxford Road, Manchester M13 9PT, UK
Tel: +44 (0)161 2755378
tinyurl.com/knightFLS/
· . ,,><(((°>
------------------------------------------------------------------------