[R-sig-ME] Chisq = 0 and p = 1 with nested lmer models
Geoff Brookshire
creepingbrain at gmail.com
Thu Sep 29 00:19:51 CEST 2011
Oops! Change
fm.F1.2a <- lmer(outcome ~ F2*F3 - F1:F2:F3 + (1 | subject) + (1 | item),
data = samp.dat, REML = FALSE)
to
fm.F1.2a <- lmer(outcome ~ F2*F3 + (1 | subject) + (1 | item),
data = samp.dat, REML = FALSE)
Sorry for the confusion!
Geoff
On Wed, Sep 28, 2011 at 6:13 PM, Geoff Brookshire
<broog731 at newschool.edu> wrote:
> Hi again,
>
> After an education-filled delay, this makes perfect sense. I. White
> pointed out to me that:
> "it will often make sense to compare a quadratic
>
> b0 + b1*x + b2*x^2
>
> with b0 + b1*x, but when would it make sense to compare it with
>
> b0 + b2*x^2 ?
>
> It all has to do with marginality: 'interactions should not be included
> without main effects, nor higher-degree polynomial terms without their
> lower-degree relatives' (McCullagh and Nelder, Generalized Linear Models
> 2nd ed p89). See also Nelder (1977) J Roy Stat Soc A 140 48-77,
> especially p49-50."
>
>
> This leads to a further question -- what's the recommended way to get
> at those lower-order effects? One easy way is to just use pvals.fnc
> from languageR on the full model.
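> For instance (sketching from memory, so the call may need adjusting),
> with the full model fm.1 from my original message:
>
> library(languageR)
> p <- pvals.fnc(fm.1, nsim = 10000)  # MCMC sampling of the posterior
> p$fixed                             # per-term pMCMC and t-based p-values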
>
> Alternatively, we could compare a fitted model containing only the main
> effect of interest (plus the random effects) against a model containing
> only an intercept, like so:
> fm.F1.1 <- lmer(outcome ~ F1 + (1 | subject) + (1 | item),
> data = samp.dat, REML = FALSE)
> fm.F1.2 <- lmer(outcome ~ 1 + (1 | subject) + (1 | item),
> data = samp.dat, REML = FALSE)
> anova(fm.F1.1, fm.F1.2)
>
> ... or compare two models that both also contain everything of
> interest that doesn't involve that main effect, like so:
> fm.F1.1a <- lmer(outcome ~ F1 + F2*F3 + (1 | subject) + (1 | item),
> data = samp.dat, REML = FALSE)
> fm.F1.2a <- lmer(outcome ~ F2*F3 - F1:F2:F3 + (1 | subject) + (1 | item),
> data = samp.dat, REML = FALSE)
> anova(fm.F1.1a, fm.F1.2a)
>
> The second of these model comparisons gives a p-value much closer to
> the pMCMC value output by pvals.fnc, though both are somewhat more
> significant than it. Does anyone have a recommendation about which of
> these approaches is more trustworthy / statistically valid?
>
> Thanks,
> Geoff
>
> On Fri, Sep 16, 2011 at 6:18 PM, Ben Bolker <bbolker at gmail.com> wrote:
>> Geoff Brookshire <creepingbrain at ...> writes:
>>
>>>
>>> Dear all,
>>>
>>> I'm using lme4 to analyze some reaction time data I've collected, and
>>> am running into an odd problem. I'm not certain if it's a bug or a
>>> problem with my thinking, but hopefully (and probably) it's the latter.
>>>
>>> When trying to compare two models, one of which is the same as the
>>> other but with one term removed, the result is that there is no
>>> difference between the models: Chisq = 0, Chisq Df = 0, and p = 1.
>>>
>>> A little about the design:
>>> This is a reaction-time experiment with a 2x2x2 design over factors
>>> F1, F2, and F3. F1 is varied within-subjects and within-items, and F2
>>> & F3 are varied within-subjects and between-items.
>>>
>>> So far, I've been testing the models like this:
>>>
>>> fm.1 <- lmer(outcome ~ F1*F2*F3 + (1 | subject) + (1 | item),
>>> data = samp.dat, REML = FALSE)
>>> fm.2 <- lmer(outcome ~ F1*F2*F3 - F1:F2:F3 + (1 | subject) + (1 | item),
>>> data = samp.dat, REML = FALSE)
>>> anova(fm.1, fm.2)
>>>
>>> This works, and gives what we expect. When I try to test the 2-term
>>> interactions, though, the error comes up:
>>>
>>> fm.3 <- lmer(outcome ~ F1*F2*F3 - F1:F2 + (1 | subject) + (1 | item),
>>> data = samp.dat, REML = FALSE)
>>> anova(fm.1, fm.3)
>>>
>>> This gives the following output:
>>>
>>> fm.1: outcome ~ F1 * F2 * F3 + (1 | subject) + (1 | item)
>>> fm.3: outcome ~ F1 * F2 * F3 - F1:F2 + (1 | subject) + (1 | item)
>>> Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
>>> fm.1 11 -677.34 -617.72 349.67
>>> fm.3 11 -677.34 -617.72 349.67 0 0 1
>>>
>>> The same thing happens when looking at F2:F3 and F1:F3. Any ideas?
>>> Just entering anova(fm.1) gives sensible results and F-values for the
>>> 2-way interactions.
>>
>> When you subtract F1:F2 from F1*F2*F3 you still get a fully
>> specified model; it is just reparameterized (i.e. you won't reduce
>> the number of parameters this way). This is because of the way that
>> R interprets model formulae (i.e. the behaviour comes from R itself,
>> not from lme4).
>>
>> Here's an experiment confirming this:
>>
>>> d <- expand.grid(F1=LETTERS[1:2],F2=letters[1:2],F3=factor(1:2))
>>> d
>> F1 F2 F3
>> 1 A a 1
>> 2 B a 1
>> 3 A b 1
>> 4 B b 1
>> 5 A a 2
>> 6 B a 2
>> 7 A b 2
>> 8 B b 2
>>> model.matrix(~F1*F2*F3,d)
>> (Intercept) F1B F2b F32 F1B:F2b F1B:F32 F2b:F32 F1B:F2b:F32
>> 1 1 0 0 0 0 0 0 0
>> 2 1 1 0 0 0 0 0 0
>> 3 1 0 1 0 0 0 0 0
>> 4 1 1 1 0 1 0 0 0
>> 5 1 0 0 1 0 0 0 0
>> 6 1 1 0 1 0 1 0 0
>> 7 1 0 1 1 0 0 1 0
>> 8 1 1 1 1 1 1 1 1
>>
>>
>>> model.matrix(~F1*F2*F3-F1:F2,d)
>> (Intercept) F1B F2b F32 F1B:F32 F2b:F32 F1B:F2b:F31 F1B:F2b:F32
>> 1 1 0 0 0 0 0 0 0
>> 2 1 1 0 0 0 0 0 0
>> 3 1 0 1 0 0 0 0 0
>> 4 1 1 1 0 0 0 1 0
>> 5 1 0 0 1 0 0 0 0
>> 6 1 1 0 1 1 0 0 0
>> 7 1 0 1 1 0 1 0 0
>> 8 1 1 1 1 1 1 0 1
>>
>> If you have really thought this through and know that you
>> have a sensible hypothesis corresponding to setting parameters
>> 5 through 7 of the original parameterization to zero, you can do
>> it ...
>>
>> d <- transform(d,F3way=as.numeric((F1:F2:F3=="B:b:2")))
>> model.matrix(~F1+F2+F3+F3way,data=d)
>>
>> A more tedious but general way to do this is to take the
>> original model matrix, pull out the bits you want, rename
>> them if necessary (colons in names will not play nicely
>> with formulae) and use the numeric dummy variables as
>> predictors in a new model.
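>>
>> Roughly along these lines (untested, and the column names below are
>> just the default treatment-contrast names, so check them against your
>> own model matrix):
>>
>> X <- model.matrix(~ F1*F2*F3, data = samp.dat)
>> keep <- c("F1B", "F2b", "F32", "F1B:F2b:F32")  # the columns you want
>> X2 <- X[, keep, drop = FALSE]
>> colnames(X2) <- gsub(":", "_", colnames(X2))   # colons break formulae
>> samp.dat2 <- cbind(samp.dat, X2)
>> fm.red <- lmer(outcome ~ F1B + F2b + F32 + F1B_F2b_F32 +
>> (1 | subject) + (1 | item),
>> data = samp.dat2, REML = FALSE)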
>>
>> Hope that helps.
>>
>> Ben Bolker
>>
>