[R-sig-ME] Calculating main effects

Ben Bolker bbolker at gmail.com
Thu Oct 20 21:11:31 CEST 2011


Iker Vaquero Alba <karraspito at ...> writes:

> 
> 
>   
> Dear Geoff,
> 
>    Thank you so much for your kind reply. Actually, the problem I'm
> finding is more like this:
> 
>       fm.1 <- lmer(Y ~ A*B + C)
> 
>       fm.2 <- lmer(Y ~ A*B)
> 
>       anova(fm.1, fm.2)
> 
>    
> 
>    But what if the full model includes C*B? Then I have to remove
> C*B from the model (as C is involved in it) before I can test the
> effect of C, or that's what I understood.
>
> 

  The issue of testing main effects (such as C) in the presence of
interactions (such as C:B -- note that C*B is equivalent to
C + B + C:B) is long-standing and contentious.  There are several points
of view:

  * Bates, Venables, and others (google for "Venables Exegeses Linear
Models") seem to advocate stepwise removal of unnecessary (non-significant?
weak?) interaction terms so that main effects can be tested without
ambiguity. This is essentially the point you made above ("I have to
remove C:B before I can test C").
  * **If** continuous predictors are centered or categorical predictors
are coded with sum-to-zero contrasts, then it *may* make sense to
violate marginality and use 'marginal' tests -- what SAS would call
'type III' tests -- of main effects in the presence of interactions
(see Venables' "exegeses" for a hint on how to do this); Schielzeth
2010 also seems to recommend this approach, although he focuses on
continuous predictors.
  * However, Wald tests of parameters as provided by summary(), **or**
individual quantiles or highest posterior density intervals on
individual parameters, are marginal, so use them with the appropriate
caution (i.e. the estimates of the main effects depend on the
contrasts used, and their comparisons to zero may or may not represent
tests of sensible hypotheses).
  * Be aware that in any case there is no simple way (that I know of)
to get R to construct a model matrix that includes interaction terms
but zeros out the main effects; compare the following:

d <- expand.grid(C=factor(1:2), B=factor(1:2))  # a 2x2 design
model.matrix(~C*B, data=d)      # intercept, C, B, and C:B columns
model.matrix(~C:B, data=d)      # re-coded: main effects absorbed, not zeroed
model.matrix(~C*B-B-C, data=d)  # likewise, spans the same column space
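
As a concrete sketch of the sum-to-zero / 'type III' route in the
second bullet above (the data here are simulated purely for
illustration, and lm() stands in for lmer() to keep the example
self-contained):

```r
## simulated 2x2 data; the effect of C is invented for illustration
set.seed(1)
d <- expand.grid(C = factor(1:2), B = factor(1:2), rep = 1:10)
d$Y <- rnorm(nrow(d)) + as.numeric(d$C)

## sum-to-zero contrasts: each main effect is then an average over the
## levels of the other factor, so marginal tests are interpretable
options(contrasts = c("contr.sum", "contr.poly"))
fit <- lm(Y ~ C * B, data = d)

## drop1() with scope = . ~ . deliberately violates marginality and
## tests every term -- including the main effects C and B -- in the
## presence of the C:B interaction (the 'type III' tests)
drop1(fit, scope = . ~ ., test = "F")
```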

 

>    Anyway, let's see if someone else replies.
> 
>    Thank you very much for your help again.
> 
> --- On Thu, 20 Oct 2011, Geoff Brookshire <broog731 at ...> wrote:
> 
> From: Geoff Brookshire <broog731 at ...>
> Subject: Re: Calculating main effects
> To: "Iker Vaquero Alba" <karraspito at ...>
> Date: Thursday, 20 October 2011, 14:45
> 
> Hi Iker,
> 
> Actually, that post of mine was meant to be more a request for
> information than giving it. No one responded, so I'm not sure what
> people's opinions on that are. I've done some reading in the meantime
> that's moved me towards one of those options, though. I recommend
> reading Baayen, Davidson, & Bates (2008) in the Journal of Memory and
> Language to get a better picture of what these models are doing and
> some recommendations for how to test significance values. That issue
> of JML has a couple other articles about mixed-effects models, too.
> 
> My impression, to get to your question, is that the best way of
> looking at significance for specific terms is to leave everything you
> can in the model except the term of interest. So if you have the
> factors A, B, and C, and want to know the influence of C, I think
> I'd do:
> 
> fm.1 <- lmer(Y ~ A*B + C)
> fm.2 <- lmer(Y ~ A*B)
> anova(fm.1, fm.2)
> 
> That way you leave in everything to do with A and B. I choose this
> option not because I know whether or not it gives the most accurate
> p-values (I have no idea if this is the case, or if p-values can be
> described as 'accurate'), but because it seems to be the closest
> conceptually to what's going on in more basic tests. The pvals.fnc
> option is a little different. It would just be to run something like
> this:
> 
> fm.full <- lmer(Y ~ A*B*C)  # your full model, with every term of potential interest
> library(languageR)
> pvals.fnc(fm.full)
> 
> This function doesn't give you Wald chi-squared statistics, but
> Bayesian confidence intervals (and their associated p-values) using
> MCMC sampling and a non-informative prior. I'm avoiding that option because
> I don't understand everything involved in Bayesian analysis, so I'm
> afraid I'll misuse it. I know this function wasn't working for a
> while, but it works for me now. If you want to take this route, update
> your packages and it should be fine, at least for normally distributed
> data (I think).
> 
> It's possible that some people would recommend something else.
> 
> Good luck!
> 
> geoff
>
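
For completeness, Geoff's model-comparison recipe above can be made
runnable as below. The grouping factor g, the random intercept
(1 | g), and the simulated data are assumptions added for
illustration, since lmer() requires at least one random-effects term;
note that anova() refits REML fits with ML before the comparison.

```r
library(lme4)  # assumed to be installed

## simulated data; grouping factor g and effect sizes are invented
set.seed(101)
d <- expand.grid(A = factor(1:2), B = factor(1:2), C = factor(1:2),
                 g = factor(1:10), rep = 1:5)
d$Y <- rnorm(nrow(d)) + as.numeric(d$C)

## compare models with and without C, keeping A*B in both;
## anova() refits both with ML and performs a likelihood-ratio test
fm.1 <- lmer(Y ~ A*B + C + (1 | g), data = d)
fm.2 <- lmer(Y ~ A*B + (1 | g), data = d)
anova(fm.1, fm.2)
```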




More information about the R-sig-mixed-models mailing list