[R-sig-ME] lme vs. lmer and contrasts

Wed Sep 30 17:39:17 CEST 2009

Hi Robert, (and Raldo)
1) No, but when you have correlated random effects, mcmc is still not
working.
2) set up contrasts as in lm I believe the multcomp package works with
lme4, look at glht
3) About as much as the variability in your data, and whether or not
  it is a designed experiment, and whether you think the assumptions of
  the model hold,
  and if the effect size is meaningful and , ......

Which was the point of the whole discussion, and many others on this
list. 
Inference is hard (paraphrasing Barbie).
 <\begin Rant> P-values tend to sweep many important
factors about a model and interpreting them under the umbrella of
significance.
While unfortunately many journals still perpetuate the p-value, it is
possible to 
publish, write reports, and go about doing science without them. <\end
Rant>

for Raldo (from lme4 documentation, see cake)
str(cake)
print(fm1 <- lmer(angle ~ recipe * temperature + (1|recipe:replicate),
cake,
                  REML = FALSE), corr = FALSE)
print(fm2 <- lmer(angle ~ recipe + temperature + (1|recipe:replicate),
cake,
                  REML = FALSE), corr = FALSE)
print(fm3 <- lmer(angle ~ recipe + temp + (1|recipe:replicate), cake,
                  REML = FALSE), corr = FALSE)
anova(fm3, fm2, fm1)

Gives a p-value

Nicholas

> 
> ------------------------------
> 
> Message: 4
> Date: Wed, 30 Sep 2009 10:09:50 +0200
> From: espesser <robert.espesser at lpl-aix.fr>
> Subject: [R-sig-ME]  lme vs. lmer and contrasts
> To: "R-SIG-Mixed-Models at r-project.org"
> 	<R-SIG-Mixed-Models at r-project.org>
> Message-ID: <4AC3124E.2030802 at lpl-aix.fr>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> In the last discussion about "lme vs. lmer"  (see below) , D. Bates 
> adviced to
> use anova between 2 mixed models to test a fixed effect.
> 
> 1) Does it mean that mcmc pvalues have to be disregarded finally ?
> 
> 2) When the fixed effect is a factor, anova returns the whole 
> significance of the factor,
> but no information about  the levels of the factor. How is it possible 
> to get it  ?
> (I am thinking about  contrast analysis for a factor with more than 2 
> levels)
> 
> 3) As a side question:
> The mixed models are said to be robust to unbalanced data.
> Let suppose a 4 levels factor, with a  number of measures which is 
> different for
> each level (or at least for one level). In what extend can I trust 
> outputs of a mixed model with
> typical contrasts (treatment, sdif, ordered) applied to such a factor ?
> 
> Thank you very much for your help
> 
> R. Espesser
> Laboratoire Parole et Langage,
> Universit? de Provence/CNRS,
> 5 av. Pasteur, Aix en provence (France)
> 
> 
> Douglas Bates a ?crit :
> >
> > [...]
> >
> > My general advice to those who are required to produce a p-value for a
> > particular fixed-effects term in a mixed-effects model is to use a
> > likelihood ratio test.  Fit the model including that term using
> > maximum likelihood (i.e. REML = FALSE), fit it again without the term
> > and compare the results using anova.
> >
> > The likelihood ratio statistic will be compared to a chi-squared
> > distribution to get a p-value and this process is somewhat suspect
> > when the degrees of freedom would be small.  However, so many other
> > things could be going wrong when you are fitting complex models to few
> > observations that this may be the least of your worries.
> >
> >
> 
> 
> 
> ------------------------------
> 
> Message: 5
> Date: Wed, 30 Sep 2009 11:45:00 +0200
> From: Raldo Kruger <raldo.kruger at gmail.com>
> Subject: Re: [R-sig-ME] lme vs. lmer
> To: Douglas Bates <bates at stat.wisc.edu>,        Peter Dalgaard
> 	<p.dalgaard at biostat.ku.dk>,	"R-SIG-Mixed-Models at r-project.org"
> 	<R-SIG-Mixed-Models at r-project.org>
> Message-ID:
> 	<30406dd0909300245j78380cfesa28d4e056d99cd82 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> Hi all, I've been following this thread since it's of interest to the
> analyses i'm currenty doing.
> My question is, how does one do model simplification with lmer (or
> glmer in my case) if there are no p-values (since p-values are used to
> determine which terms to drop, and the drop1 function does not work
> for glmer)?
> 
> And Douglas, could you provide a working example of getting the
> p-values as you described, preferably with glmer (glmer does not have
> the REML=FALSE option)? I understand the 1st part of fitting two
> models, one with and one without the term of interest... So does it
> mean one has to do this for every term in order to get all the
> p-values?
> 
> Many thanks, Raldo
> 
> On 9/29/09, Douglas Bates <bates at stat.wisc.edu> wrote:
> > On Tue, Sep 29, 2009 at 3:58 PM, Peter Dalgaard
> > <p.dalgaard at biostat.ku.dk> wrote:
> >> Ben Bolker wrote:
> >>>
> >>> Douglas Bates wrote:
> >>>>
> >>>> On Tue, Sep 29, 2009 at 1:02 PM, Ben Bolker <bolker at ufl.edu> wrote:
> >>>>
> >>>>> Christopher David Desjardins wrote:
> >>>>>>
> >>>>>> I've started working through Pinheiro & Bates, 2000 and noticed the
> >>>>>> use
> >>>>>> of lme from the nlme package. I am curious if lmer from lme4 has
> >>>>>> superseded lme or if lme still holds its own? The reason I ask is that
> >>>>>> I
> >>>>>> have taken a few classes where we've solely used lmer and just read
> >>>>>> about lme today. If both functions are on equal footing, can the
> >>>>>> p-values from lme be trusted?
> >>>>>> Thanks!
> >>>>>> Chris
> >>>>>
> >>>>> ?You should read the extended discussion of p-values, degrees of
> >>>>> freedom, etc. that is on the R wiki (I think) and referenced from the R
> >>>>> FAQ. ?At least in my opinion, (n)lme is still fine (and indeed
> >>>>> necessary
> >>>>> at this stage for fitting heteroscedastic and correlated models). ?The
> >>>>> df/p-value estimates, however, are "use at your own risk" -- you'll
> >>>>> have
> >>>>> to read the literature and decide for yourself.
> >>>>> ?I still think there's room for someone to implement (at least)
> >>>>> Satterthwaite and (possibly) Kenward-Roger corrections, at least for
> >>>>> the
> >>>>> sake of comparison, but I'm not volunteering.
> >>>>
> >>>> You may need to define them first. ?Many of the formulas in the mixed
> >>>> models literature assume a hierarchical structure in the random
> >>>> effects - certainly we used such a formula for calculating the
> >>>> denominator degrees of freedom in the nlme package. But lme4 allows
> >>>> for fully or partially crossed random effects so you can't think in
> >>>> terms of "levels" of random effects.
> >>>>
> >>>> Referring to the "Satterthwaite and Kenward-Roger corrections" gives
> >>>> the impression that these are well-known formulas and implementing
> >>>> them would be a simple matter of writing a few lines of code. ?I don't
> >>>> think it is. ?I would be very pleased to incorporate such code if it
> >>>> could be written but, as I said, I don't even know if such things are
> >>>> defined in the general case, let alone easy to calculate.
> >>>>
> >>>> I am not trying to be argumentative (although of late I seem to have
> >>>> succeeded in being that). ?I'm just saying that I don't think this is
> >>>> trivial. (It I wanted to be argumentative I would say that it is
> >>>> difficult and, for the most part, irrelevant. :-)
> >>>
> >>> ?Fair enough. Actually, I'm not sure I meant implementing them in lmer
> >>> -- even implementing them in nlme would be useful (and perhaps more
> >>> straightforward, if not trivial). I also wouldn't impose the requirement
> >>> that they have to be feasible for huge data sets -- I'm just curious if
> >>> they can be implemented within lme in a relatively straightforward/
> >>> boneheaded way.
> >>> ? But again, this is very far down my to-do list (and at the edge
> >>> of my abilities) and completely off yours, so unless someone else bites
> >>> it won't happen.
> >>
> >> I don't think (n)lme is all that easy either; you still need to sort out
> >> the
> >> connection between its multilevel formulation and the projection matrices
> >> in
> >> the K-R paper. In both nlme and lme4, an implementation is almost
> >> certainly
> >> possible, although probably complicated and perhaps at the expense of all
> >> computational efficiency.
> >>
> >> One main problem is that even when they can be calculated, the corrections
> >> rely on a normal distribution assumption which is more than likely wrong
> >> in
> >> practice. This isn't any different from ordinary t-tests: once you get
> >> into
> >> the single-digit df regime, you really don't know what you are doing, and
> >> if
> >> there are more than 30 df, the normal approximation works well enough
> >> without the correction.
> >>
> >> Accordingly, I tend to see low df more as a warning flag than as something
> >> that gives accurate p values, and I sometimes wonder whether there is a
> >> way
> >> to raise such a flag more expediently.
> >
> > I agree, wholeheartedly.
> >
> > My general advice to those who are required to produce a p-value for a
> > particular fixed-effects term in a mixed-effects model is to use a
> > likelihood ratio test.  Fit the model including that term using
> > maximum likelihood (i.e. REML = FALSE), fit it again without the term
> > and compare the results using anova.
> >
> > The likelihood ratio statistic will be compared to a chi-squared
> > distribution to get a p-value and this process is somewhat suspect
> > when the degrees of freedom would be small.  However, so many other
> > things could be going wrong when you are fitting complex models to few
> > observations that this may be the least of your worries.
> >
> > I appreciate that for Ben and others in fields like ecology the need
> > to incorporate many different possible terms in models for
> > comparatively small data sets may be inevitable.  But it is also
> > inevitable that the precision of the information one can extract from
> > such small data sets is low.  Reducing such analysis to a set of
> > p-values for various terms and treating these p-values as if they were
> > precisely determined is an oversimplification, even when journal
> > editors insist on such an oversimplification.
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> 
> -- 
> Sent from my mobile device
> 
> Raldo
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> R-sig-mixed-models mailing list
> R-sig-mixed-models at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> End of R-sig-mixed-models Digest, Vol 33, Issue 41
> **************************************************