[R-sig-ME] lmer: problem in crossed random effect model with verydifferent variances
Douglas Bates
bates at stat.wisc.edu
Wed Jun 17 23:05:35 CEST 2009
On Wed, Jun 17, 2009 at 3:54 PM, Michael Li<wuolong at gmail.com> wrote:
> On Wed, Jun 17, 2009 at 4:02 PM, Douglas Bates<bates at stat.wisc.edu> wrote:
>> On Wed, Jun 17, 2009 at 2:48 PM, Doran, Harold<HDoran at air.org> wrote:
>>> Michael
>>
>>> I'll take a quick stab at this, but there is really no way to know what
>>> the issue is given that there is no real description of your data.
>>> First, SAS and lmer use different algorithms for generating parameter
>>> estimates, so it's no surprise that the world does not line up exactly
>>> between the two. However, the estimates should be similar.
>>
>>> Lmer used REML by default. What did you use in SAS, ML or REML
>
> The default is in SAS is also REML. So REML was used for both lmer and
> PROC MIXED.
>
>> What are the values of the REML criterion (or the deviance, if you
>> used ML estimates) at convergence? If they are very close then it is
>> just a matter of the convergence criterion.
>
> The REML criteria in fact were also close at convergence (SAS gives
> 448.45 while lmer says 448.5).
>
>> Consider that the
>> relative variance for the analyst in the SAS fit is less than 0.01.
>> When you estimate a variance relative to the residual variance as 0.01
>> it means "essentially zero"
>>
>> You should read the result from lmer as an estimate of zero. For some
>> reason the optimization software doesn't like to converge on the
>> boundary and often ends up at very small but non-zero values.
>
> Certainly in practice, the difference between 1500 and 10^-12 does not
> really matter
> when the error variance is 200,000. Still, 1500 seems to be more 'palatable'.
> Anyway, it was just surprising to see the difference even though I
> understand that lmer uses
> difference algorithm and did not expect the answers to be exactly the same.
>
> By the way, at log scale, SAS just gives 0 for analyst variance while
> lmer says 10^-15.
>
> I am attaching the code and data at the end of the email in case
> anyone wants to try it out.
>
> Thanks for the clarification.
>
> Michael
>
> R:
> lmer (y ~ (1 | day) + (1 | analyst), data = tmp)
>
> SAS:
> proc mixed data = tmp;
> class day analyst;
> model y = / s;
> random day analyst;
> run;
>
> tmp:
>
> day analyst y
> 1 1 1 5482
> 2 1 1 3285
> 3 1 1 4266
> 4 1 1 3818
> 5 1 1 4159
> 6 2 1 3007
> 7 2 1 3349
> 8 2 1 3178
> 9 2 1 3093
> 10 2 1 3242
> 11 3 1 2495
> 12 3 1 2687
> 13 3 1 2858
> 14 3 1 2090
> 15 3 1 2218
> 16 1 2 3882
> 17 1 2 4522
> 18 1 2 3647
> 19 1 2 3690
> 20 1 2 3754
> 21 2 2 3050
> 22 2 2 2858
> 23 2 2 3178
> 24 2 2 2901
> 25 2 2 2410
> 26 3 2 1855
> 27 3 2 3157
> 28 3 2 2111
> 29 3 2 2431
> 30 3 2 3114
Thanks for sending the data. That clears things up a bit. You only
have two analysts. It is unrealistic to expect to estimate a variance
from two groups. Just as a sanity check you could fit a model with
fixed effects for day and analyst
> str(dat)
'data.frame': 30 obs. of 3 variables:
$ day : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
$ analyst: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ y : int 5482 3285 4266 3818 4159 3007 3349 3178 3093 3242 ...
> summary(aov(y ~ day * analyst, dat))
Df Sum Sq Mean Sq F value Pr(>F)
day 2 12410291 6205146 27.8892 5.494e-07
analyst 1 237096 237096 1.0656 0.3122
day:analyst 2 219345 109672 0.4929 0.6169
Residuals 24 5339828 222493
> summary(aov(y ~ day + analyst, dat))
Df Sum Sq Mean Sq F value Pr(>F)
day 2 12410291 6205146 29.0212 2.378e-07
analyst 1 237096 237096 1.1089 0.302
Residuals 26 5559173 213814
You can see that the effect of analyst is not significant, either in
the model that allows for interaction or in the additive model.
More information about the R-sig-mixed-models
mailing list