[R] what does it mean when my main effect 'disappears' when using lme4?

Thu Aug 19 05:12:06 CEST 2010

On 2010-08-18 18:41, Johan Jackson wrote:
> Hi all,
>
> I figured out why this was happening. It is because my actual code was:
>
> lmer(Y~X + (1|as.factor(labs)),data=DATA)
>
> In this case, the as.factor function looks for object 'labs' not object
> 'DATA$labs.'
>
> Scope is something you hear about don't worry about until it bites you on
> your ass I guess.
>
> JJ

Now I agree with you and I don't think that lmer() should do that.
Confirmed using the sleepstudy data:

  library(lme4)  # lme4_0.999375-34   Matrix_0.999375-42
  sleepstudy$subj <- rep(1:18, each=10)
  fm <- lmer(Reaction ~ Days + (1|as.factor(subj)), data=sleepstudy)
  # Error in inherits(x, "factor") : object 'subj' not found

and, of course, if you have a variable 'subj' in your workspace,
then that will be used. It appears that as.factor() takes precedence
over 'data=', as you surmise.

I haven't had time to look into the lmer code to see what gives and it
may well be a design decision that I'm not aware of. I can't see
anything in the help page that refers to this effect.

   -Peter Ehlers

>
> On Wed, Aug 18, 2010 at 5:52 PM, David Winsemius<dwinsemius at comcast.net>wrote:
>
>>
>> On Aug 18, 2010, at 6:45 PM, Peter Ehlers wrote:
>>
>>   On 2010-08-18 11:49, Johan Jackson wrote:
>>>
>>>> No, apologies (good catch David!), I merely copied the script
>>>> incorrectly.
>>>> It was
>>>>
>>>> lmer(Y~X + (1|labs),data=DATA)
>>>>
>>>> in my original script. So my question still stands: is it expected
>>>> behavior
>>>> for lmer to access the object 'labs' rather than the object 'DATA$labs'
>>>> when
>>>> using the data= argument?
>>>>
>>>> JJ
>>>>
>>>>
>>> I don't think that's expected behaviour, nor do I think that it occurs.
>>> There must be something else going on. Can you produce this with a
>>> small reproducible example?
>>>
>>
>> This makes me wonder if there couldn't be a Wiki page where questioners
>> could be referred that would illustrate the quick and easy construction of
>> examples that could test such theories? I would imagine that in (this
>> instance) the page would start with the data.frame that were on the help
>> page for lmer() (for example) and then put in the workspace a mangled copy
>> of a vector that migh exhibit the pathological structure that might exist in
>> the OP's version of "labs" and then run lmer() to see if such an "unexpected
>> behavior" might be exhibited.
>>
>> Just an idea. (I've never managed to get any R-Wiki contributions accepted
>> through the gauntlet that it puts up.)
>>
>> --
>> David.
>>
>>
>>>   -Peter Ehlers
>>>
>>>
>>>>
>>>>
>>>> On Wed, Aug 18, 2010 at 11:29 AM, David Winsemius<dwinsemius at comcast.net
>>>>> wrote:
>>>>
>>>>
>>>>> On Aug 18, 2010, at 1:19 PM, Johan Jackson wrote:
>>>>>
>>>>>   Hi all,
>>>>>
>>>>>>
>>>>>> Thanks for the replies (including off list).  I have since resolved the
>>>>>> discrepant results. I believe it has to do with R's scoping rules - I
>>>>>> had
>>>>>> an
>>>>>> object called 'labs' and a variable in the dataset (DATA) called
>>>>>> 'labs',
>>>>>> and
>>>>>> apparently (to my surprise), when I called this:
>>>>>>
>>>>>> lmer(Y~X + (1|labs),dataset=DATA)
>>>>>>
>>>>>> lmer was using the object 'labs' rather than the object 'DATA$labs'. Is
>>>>>> this
>>>>>> expected behavior??
>>>>>>
>>>>>>
>>>>> help(lmer, package=lme4)
>>>>>
>>>>> It would be if you use the wrong data argument for lmer(). I doubt that
>>>>> the
>>>>> argument "dataset" would result in lmer processing "DATA".  My guess is
>>>>> that
>>>>> the function also accessed objects "Y" and "X" from the calling
>>>>> environment
>>>>> rather than from within "DATA".
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>   This would have been fine, except I had reordered DATA in the meantime!
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> JJ
>>>>>>
>>>>>> On Tue, Aug 17, 2010 at 7:17 PM, Mitchell Maltenfort<mmalten at gmail.com
>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>
>>>>>>   One difference is that the random effect in lmer is assumed --
>>>>>>
>>>>>>> implicitly constrained, as I understand it -- to
>>>>>>> be a bell curve.  The fixed effect model does not have that
>>>>>>> constraint.
>>>>>>>
>>>>>>> How are the values of "labs" effects distributed in your lm model?
>>>>>>>
>>>>>>> On Tue, Aug 17, 2010 at 8:50 PM, Johan Jackson
>>>>>>> <johan.h.jackson at gmail.com>   wrote:
>>>>>>>
>>>>>>>   Hello,
>>>>>>>>
>>>>>>>> Setup: I have data with ~10K observations. Observations come from 16
>>>>>>>> different laboratories (labs). I am interested in how a continuous
>>>>>>>>
>>>>>>>>   factor,
>>>>>>>
>>>>>>>   X, affects my dependent variable, Y, but there are big differences in
>>>>>>>> the
>>>>>>>> variance and mean across labs.
>>>>>>>>
>>>>>>>> I run this model, which controls for mean but not variance
>>>>>>>> differences
>>>>>>>> between the labs:
>>>>>>>> lm(Y ~ X + as.factor(labs)).
>>>>>>>> The effect of X is highly significant (p<   .00001)
>>>>>>>>
>>>>>>>> I then run this model using lme4:
>>>>>>>> lmer(Y~ X + (1|labs)) #controls for mean diffs bw labs
>>>>>>>> lmer(Y~X + (X|labs)) #and possible slope heterogeneity bw labs.
>>>>>>>>
>>>>>>>> For both of these latter models, the effect of X is non-significant
>>>>>>>> (|t|
>>>>>>>>
>>>>>>>>   <
>>>>>>>
>>>>>>>   1.5).
>>>>>>>>
>>>>>>>> What might this be telling me about my data? I guess the second
>>>>>>>> (X|labs)
>>>>>>>>
>>>>>>>>   may
>>>>>>>
>>>>>>>   tell me that there are big differences in the slope across labs, and
>>>>>>>> that
>>>>>>>> the slope isn't significant against the backdrop of 16 slopes that
>>>>>>>> differ
>>>>>>>> quite a bit between each other. Is that right? (Still, the enormous
>>>>>>>> drop
>>>>>>>>
>>>>>>>>   in
>>>>>>>
>>>>>>>   p-value is surprising!). I'm not clear on why the first (1|labs),
>>>>>>>>
>>>>>>>>   however,
>>>>>>>
>>>>>>>   is so discrepant from just controlling for the mean effects of labs.
>>>>>>>>
>>>>>>>> Any help in interpreting these data would be appreciated. When I
>>>>>>>> first
>>>>>>>>
>>>>>>>>   saw
>>>>>>>
>>>>>>>   the data, I jumped for joy, but now I'm muddled and uncertain if I'm
>>>>>>>> overlooking something. Is there still room for optimism (with respect
>>>>>>>> to
>>>>>>>>
>>>>>>>>   X
>>>>>>>
>>>>>>>   affecting Y)?
>>>>>>>>
>>>>>>>> JJ
>>>>>>>>
>>>>>>>>      [[alternative HTML version deleted]]
>>>>>>>>
>>>>>>>> ______________________________________________
>>>>>>>> R-help at r-project.org mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>> PLEASE do read the posting guide
>>>>>>>>
>>>>>>>>   http://www.R-project.org/posting-guide.html
>>>>>>>
>>>>>>>   and provide commented, minimal, self-contained, reproducible code.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>         [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>>
>>>>> David Winsemius, MD
>>>>> West Hartford, CT
>>>>>
>>>>>
>>>>>
>>
>

-- 
Peter Ehlers
University of Calgary