[R-sig-ME] very basic HLM question
bates at stat.wisc.edu
Mon Feb 7 19:33:15 CET 2011
2011/2/7 Sebastián Daza <sebastian.daza at gmail.com>:
> Thank you Douglas.
> Do you know if there is a way to get the between-component variance (in my
> example equal to 59.004) through random effects Anova using R?
Why do you think the between-component variance is 59.004? You are
not, as far as I can see, estimating the between-component variance.
It's not the same as the mean square, even for balanced data.
> I just want
> to get the ICC using the simple formula: ICC = variance between macro units
> / total variance.
> On 2/7/2011 11:44 AM, Douglas Bates wrote:
>> 2011/2/7 Sebastián Daza<sebastian.daza at gmail.com>:
>>> Hi everyone,
>>> I need to get a between-component variance (e.g. random effects Anova) in
>>> order to compute by hand the intraclass correlation. However, using lmer
>>> don't get the same results (variance component) than using random effects
>>> Anova (with SPSS). I am using a database of students, clustered on
>>> (there is not the same number of students by school).
>> I think the issue is that the estimates of the variance components are
>> different, which would then lead to different estimates of the ICC.
>> You mentioned in your original posting that the data are unbalanced
>> (i.e. there are different numbers of students in different schools).
>> The estimates returned from lmer are the REML estimates in this case.
>> Do you know how the other estimates are being calculated? I'm not
>> sure that the naive calculation of within- and between- sums of
>> squares and equating expected mean squares with observed mean squares
>> works with unbalanced data.
>>> First, I get the intraclass using ICC1 in R:
>> As Paul indicated, it is more informative to say, "using ICC1 from the
>> multilevel package for R". ICC1 is not part of the base R nor the
>> required packages.
>>> summary(anova1<- aov(math ~ as.factor(schoolid), data=nels88))
>>>  0.4414491
>> That is using the raw mean squares, not the estimates of the variance
>> components. I have, thankfully, forgotten most of what I know about
>> expected and observed mean squares but I am pretty sure that those
>> don't correspond to the REML estimates, even in the balanced case.
>> Consider the results from the Dyestuff data, which is part of the lme4
>> package and is balanced
>>> print(fm1<- lmer(Yield ~ 1|Batch, Dyestuff))
>> Linear mixed model fit by REML ['merMod']
>> Formula: Yield ~ 1 | Batch
>> Data: Dyestuff
>> REML criterion at convergence: 319.6543
>> Random effects:
>> Groups Name Variance Std.Dev.
>> Batch (Intercept) 1764 42.00
>> Residual 2451 49.51
>> Number of obs: 30, groups: Batch, 6
>> Fixed effects:
>> Estimate Std. Error t value
>> (Intercept) 1527.50 19.38 78.8
>> (Intercept) 1764.05
>> (Intercept) 1
>>  49.5101
>>> vc<- VarCorr(fm1)
>>> var1hat<- as.vector(vc$Batch)
>>  1764.05
>>> varhat<- attr(vc, "sc")^2
>>  2451.25
>>> var1hat/(var1hat + varhat)
>>  0.4184874
>>> anova1<- aov(Yield ~ 1|Batch, Dyestuff)
>> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>> 0 (non-NA) cases
>> In addition: Warning message:
>> In Ops.factor(1, Batch) : | not meaningful for factors
>>> anova1<- aov(Yield ~ Batch, Dyestuff)
>> Df Sum Sq Mean Sq F value Pr(>F)
>> Batch 5 56358 11271.5 4.5983 0.004398
>> Residuals 24 58830 2451.2
>> You can see that the estimates of the residual variance are the same
>> for the two model fits but the mean square for batch is not the
>> estimate of the between-batch variance.
>>> ICC1 comes from the multilevel package. Using lmer I get:
>>> mod1<- lmer(math ~ 1 + (1|schoolid), data=nels88)
>>> Linear mixed model fit by REML
>>> Formula: math ~ 1 + (1 | schoolid)
>>> Data: nels88
>>> AIC BIC logLik deviance REMLdev
>>> 1878 1888 -935.8 1875 1872
>>> Random effects:
>>> Groups Name Variance Std.Dev.
>>> schoolid (Intercept) 34.011 5.8319
>>> Residual 72.256 8.5003
>>> Number of obs: 260, groups: schoolid, 10
>>> Fixed effects:
>>> Estimate Std. Error t value
>>> (Intercept) 48.861 1.927 25.36
>>> The random intercept effect is 34.011. If I compute the ICC manually
>>> (according to some articles I have read) I get:
>>>  0.3200523
>>> According to my Anova analysis, the between-component variance should be
>>> 59.004. If I use 59.004 the formula works relatively well:
>>>  0.449520037
>>> Not equal, but at least similar in comparison with ICC from ICC1 command.
>>> Does anyone know why I have got those differences? How can I get the
>>> figure using R? I should get practically an identical result than using
>>> but I don't. I have used the database of that article and I have got an
>>> identical result between ICCs. But with the dataset I am using (nels88)
>>> doesn't work.
>>> This is the example of the article mentioned above:
>>> summary(lmer(WBEING ~ 1 + (1|GRP), data=bh1996))
>>> summary(test<- aov(WBEING~as.factor(GRP),data=bh1996))
>>>  0.04336905
>>> Any suggestions?
>>> Sebastián Daza
>>> sebastian.daza at gmail.com
>>> R-sig-mixed-models at r-project.org mailing list
> Sebastián Daza
> sebastian.daza at gmail.com
More information about the R-sig-mixed-models