[R-sig-ME] very basic HLM question

Sebastián Daza sebastian.daza at gmail.com
Mon Feb 7 19:04:37 CET 2011


Thank you Douglas.
Do you know if there is a way to get the between-component variance (in 
my example equal to 59.004) through random effects Anova using R? I just 
want to get the ICC using the simple formula: ICC = variance between 
macro units / total variance.

Regards.

On 2/7/2011 11:44 AM, Douglas Bates wrote:
> 2011/2/7 Sebastián Daza<sebastian.daza at gmail.com>:
>> Hi everyone,
>> I need to get a between-component variance (e.g. random effects Anova) in
>> order to compute by hand the intraclass correlation. However, using lmer I
>> don't get the same results (variance component) than using random effects
>> Anova (with SPSS). I am using a database of students, clustered on schools
>> (there is not the same number of students by school).
>
> I think the issue is that the estimates of the variance components are
> different, which would then lead to different estimates of the ICC.
>
> You mentioned in your original posting that the data are unbalanced
> (i.e. there are different numbers of students in different schools).
> The estimates returned from lmer are the REML estimates in this case.
> Do you know how the other estimates are being calculated?  I'm not
> sure that the naive calculation of within- and between- sums of
> squares and equating expected mean squares with observed mean squares
> works with unbalanced data.
>
>> First, I get the intraclass using ICC1 in R:
>
> As Paul indicated, it is more informative to say, "using ICC1 from the
> multilevel package for R".  ICC1 is not part of the base R nor the
> required packages.
>
>> summary(anova1<- aov(math ~ as.factor(schoolid), data=nels88))
>> ICC1(anova1)
>>
>> [1] 0.4414491
>
> That is using the raw mean squares, not the estimates of the variance
> components.  I have, thankfully, forgotten most of what I know about
> expected and observed mean squares but I am pretty sure that those
> don't correspond to the REML estimates, even in the balanced case.
>
> Consider the results from the Dyestuff data, which is part of the lme4
> package and is balanced
>
>> print(fm1<- lmer(Yield ~ 1|Batch, Dyestuff))
> Linear mixed model fit by REML ['merMod']
> Formula: Yield ~ 1 | Batch
>     Data: Dyestuff
> REML criterion at convergence: 319.6543
>
> Random effects:
>   Groups   Name        Variance Std.Dev.
>   Batch    (Intercept) 1764     42.00
>   Residual             2451     49.51
> Number of obs: 30, groups: Batch, 6
>
> Fixed effects:
>              Estimate Std. Error t value
> (Intercept)  1527.50      19.38    78.8
>> VarCorr(fm1)
> $Batch
>              (Intercept)
> (Intercept)     1764.05
> attr(,"stddev")
> (Intercept)
>      42.0006
> attr(,"correlation")
>              (Intercept)
> (Intercept)           1
>
> attr(,"sc")
> [1] 49.5101
>> vc<- VarCorr(fm1)
>> var1hat<- as.vector(vc$Batch)
>> var1hat
> [1] 1764.05
>> varhat<- attr(vc, "sc")^2
>> varhat
> [1] 2451.25
>> var1hat/(var1hat + varhat)
> [1] 0.4184874
>> anova1<- aov(Yield ~ 1|Batch, Dyestuff)
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>    0 (non-NA) cases
> In addition: Warning message:
> In Ops.factor(1, Batch) : | not meaningful for factors
>> anova1<- aov(Yield ~ Batch, Dyestuff)
>> summary(anova1)
>              Df Sum Sq Mean Sq F value   Pr(>F)
> Batch        5  56358 11271.5  4.5983 0.004398
> Residuals   24  58830  2451.2
>
> You can see that the estimates of the residual variance are the same
> for the two model fits but the mean square for batch is not the
> estimate of the between-batch variance.
>
>
>>
>> ICC1 comes from the multilevel package. Using lmer I get:
>>
>> mod1<- lmer(math ~ 1 + (1|schoolid), data=nels88)
>>
>> Linear mixed model fit by REML
>> Formula: math ~ 1 + (1 | schoolid)
>>    Data: nels88
>>   AIC  BIC logLik deviance REMLdev
>>   1878 1888 -935.8     1875    1872
>> Random effects:
>>   Groups   Name        Variance Std.Dev.
>>   schoolid (Intercept) 34.011   5.8319
>>   Residual                72.256   8.5003
>> Number of obs: 260, groups: schoolid, 10
>>
>> Fixed effects:
>>             Estimate Std. Error t value
>> (Intercept)   48.861      1.927   25.36
>>
>> The random intercept effect is 34.011. If I compute the ICC manually
>> (according to some articles I have read) I get:
>>
>>> 34.011/(34.011+72.256)
>> [1] 0.3200523
>>
>> According to my Anova analysis, the between-component variance should be
>> 59.004. If I use 59.004 the formula works relatively well:
>>
>>   59.004/(59.004+72.256)
>> [1] 0.449520037
>>
>> Not equal, but at least similar in comparison with ICC from ICC1 command.
>> .
>> Does anyone know why I have got those differences? How can I get the 59.004
>> figure using R? I should get practically an identical result than using ICC1
>> but I don't. I have used the database of that article and I have got an
>> identical result between ICCs. But with the dataset I am using (nels88)
>> doesn't work.
>>
>> This is the example of the article mentioned above:
>>
>> library(multilevel)
>> base(bh1996)
>>
>> summary(lmer(WBEING ~ 1 + (1|GRP), data=bh1996))
>>
>> 0.035801/(0.035801+0.789497)
>> 0.043379482
>>
>> summary(test<- aov(WBEING~as.factor(GRP),data=bh1996))
>> ICC1(test)
>> [1] 0.04336905
>>
>> Any suggestions?
>>
>> --
>> Sebastián Daza
>> sebastian.daza at gmail.com
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>

-- 
Sebastián Daza
sebastian.daza at gmail.com




More information about the R-sig-mixed-models mailing list