[R-sig-ME] autocorrelation

Mon Mar 31 16:35:50 CEST 2008

[Sorry Mike - not really a reply to your post, but I was unsure where to
jump in with my contribution - hope things are well in CEH-land?]

There was an interesting thread on the main R-Help list, which also
discussed this topic GLMM with some form of correlation in the
residuals.

You can read the thread here:

http://thread.gmane.org/gmane.comp.lang.r.general/109461

At the time I wanted to pipe up and contribute the suggestion that one
could also look at gamm() in package mgcv. I suspect that this would
fail Doug's "egregious kludges" test, but it does allow one to fit
GAMM/GLMM's with the correlation structures and variance functions
familiar to nlme users. It even allows you to use it as a gls()-like
function because of the way the GAM bit is represented in lme - at least
as far as I understand it. You can also specify parametric terms, not
just smooth terms in the model, hence getting GLMM, but I might be
mis-remembering if you need at least 1 smooth term in gamm() here.

At the time I was reading that thread, I wanted to ask about the
approach of gamm() and whether comments made in that thread by Doug
applied more generally (i.e. to gamm() and other "kludges") or
specifically to the lme4 approach to fitting GLMM. But then i got busy
and so left that for a rainy day.

I would very much appreciate any comments either in regard to using
gamm() in the case Alan presents or in regard to the thread I link to
above. As an ecologist this type of thing crops up all the time with our
data, and as such, I found the R-Help thread above very informative
indeed.

All the best,

G

On Mon, 2008-03-31 at 15:06 +0100, Mike Dunbar wrote:
> Am I missing something here? If you need to estimate an
> autocorrelation parameter at say lag 1, why not make yourself a new
> column with your response variable lagged by 1 time unit. Then include
> that as a fixed effect. Clearly there are issues with missing data,
> but I'm not aware that nlme does anything more than you can do
> manually. In fact I wonder if this approach is slightly more flexible,
> as including a random slope for that lagged variable allows it to vary
> between groups, and I'm not aware that this is allowed using the
> in-built structures in nlme. Anything more complex, as I'm continually
> told, there's always Winbugs...
> 
> regards
> 
> Mike D
> 
> 
> 
> >>> "Douglas Bates" <bates at stat.wisc.edu> 30/03/2008 22:43:32 >>>
> On Sat, Mar 29, 2008 at 2:52 PM, Alan Cobo-Lewis <alanc at umit.maine.edu> wrote:
> > Doug Bates writes on r-sig-mixed-models at r-project.org on Saturday, March 29, 2008 at 7:00 AM -0500 wrote about his planned book on multilevel modelling in R:
> >
> >  >I emphasize graphical displays of the data and aspects
> >  >of the fitted models and inferences based on MCMC samples from the
> >  >posterior distribution of the model parameters.
> >
> >  (n)lme handled correlated error terms, but lme4 does not.
> 
> So if you want a model with correlated error terms (in addition to the
> correlation induced by the random effects) then you should use the
> nlme package.
> 
> > Leaving aside the superior algorithms in lme4, this appears to be the major impediment to considering lme4 capabilities as a superset of (n)lme capabilities.
> 
> I don't recall any statements to the effect that the lme4 capabilities
> would be a superset of the nlme capabilities.  It seems that whoever
> made that decision should have informed me of it.
> 
> The development of the lme4 package has been generously funded by
> several grants, the most important of which was an STTR contract that
> we had for 3 years.  The purpose of that contract was to develop a
> package that could fit generalized linear mixed models using the
> Laplace approximation and allowing for crossed or partially crossed
> grouping factors for the random effects.  The development is currently
> funded by another grant specifically to provide for fitting models
> with crossed and partially crossed random effects and with carryover
> of random effects from one time period to another.
> 
> Neither generalized linear mixed models nor models with crossed or
> partially crossed random effects can be fit (well without resorting to
> egregious kludges) with the nlme package.  Even nonlinear mixed models
> as fit by nlme are sub-optimal compared to the methods in lme4.  (lme4
> uses direct optimization of the Laplace approximation to the
> log-likelihood whereas nlme uses an alternating algorithm that Mary
> Lindstrom and I proposed.)
> 
> My priorities are to fulfill the tasks that I proposed for these
> grants and to build the best software that I can.  The beauty of open
> source software is that if your priorities are different, you have
> full access to the sources and you can modify them to fulfill your
> objectives.  So I suggest that you
>  - Continue to use the nlme package if you wish to incorporate
> (additional) correlation structures in models
>  - Design, code, test and document extensions to the lme4 package to
> do so and then submit these changes as patches
>  - Develop your own package so you can have things done the way you
> want them to be done.  You already have access to the lme4 sources so
> a lot of the heavy lifting has been done for you.
> 
> This developing the software is not as easy as it may seem.  There are
> many trade-offs and, at least for me, it takes a lot of effort to
> determine even if it is possible to incorporate various extensions
> harmoniously.  It is possible to model the mean and variance of the
> conditional distribution of the response separately when that
> distribution is multivariate normal.  It is not as easy to do so when
> that distribution is binomial or Poisson or some other distribution
> for a generalized linear mixed model.  Because the first purpose of
> the lme4 package was to allow for generalized linear mixed models I
> did not incorporate (additional) correlation structures and variance
> functions.  I'm not even sure it could be done consistently for GLMMs
> but you are welcome to show us how.
> 
> >  But what do I do if I've got, for example, autocorrelated error terms? Is there a way to "trick" lme4 into handling that (perhaps something analogous to the "random effect variance per treatment group in lmer" thread that David Afshartous and I
> >  participated in)? Is there instead a good argument for ignoring it? It seems like something that would arise in practice in a non-negligible amount of problems in real data. Will the upcoming book give some advice on how to address this?
> >
> >  I can produce self-contained reproducible code if necessary, but I don't think it is.
> >
> >  thanks
> >  alan
> >
> >  _______________________________________________
> >  R-sig-mixed-models at r-project.org mailing list
> >  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models 
> >
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%