[R-sig-ME] glmm with correlated residuals

Sat Dec 22 11:03:06 CET 2012

Hi,

At least for asreml it is possible to fit the AR(1) process in the  
G-structure and hence fit the model described, but I agree that it is  
not clear what a glmmPQL model is fitting. asreml seems to  
underestimate sigma when rho is small, which is to be anticipated, but  
rho seems to be well estimated.

rAR1<-function(n,rho, sigma){

  # function for simulating from an AR1 process (think this works!)

L<-Diagonal(n)-bandSparse(n,n,-1,list(rep(rho,n-1)))
y<-rnorm(n, 0, sqrt(1-rho^2))

solve(L,y)@x*sigma

}

eta<-rAR1(1000,rho=0.5,sigma=1)
y<-rpois(1000, exp(eta))

m1<-asreml(y~1, random=~ar1v(units), family=asreml.poisson())

summary(m1)$varcomp
                     gamma component  std.error  z.ratio    constraint
units!units.cor 0.4854854 0.4854854 0.04923992  9.85959 Unconstrained
units!units.var 0.8077873 0.8077873 0.07291496 11.07849      Positive
R!variance      1.0000000 1.0000000         NA       NA         Fixed

Cheers,

Jarrod

Quoting Ben Bolker <bbolker at gmail.com> on Fri, 21 Dec 2012 15:05:58 -0500:

>   The model I had in mind was hierarchical:
>
>   Y_i ~ Poisson(lambda_i)
>   lambda_i = exp(eta_i)
>   eta ~ MVN(mu=X beta, Sigma=sigma^2*C(rho))
>
> where C(rho) is [for example] the correlation matrix corresponding to an
> AR1 model with correlation rho. So the variance-covariance structure of
> Y is *not* AR1 (but presumably there's some reasonable one-to-one
> relationship between the variance-covariance structures of eta and of Y
> -- I'm guessing that the correlation is just diluted by the additional
> Poisson variance; this might be reasonable to approach with a GEE for
> the marginal variance-covariance structure).  It also captures (I think)
> the *biological* goal, which is to characterize count data where there
> is evidently serial autocorrelation [which may be of primary interest,
> or a statistical nuisance to be accounted for].
>
>   I think that's perfectly well-defined and sensible, and
> straightforward to estimate with Bayesian or frequentist hierarchical
> modeling tools, although I'm happy to be corrected ...
>
>   I completely agree that it's not clear what glmmPQL and friends are
> actually doing when one specifies a model like
>
> glmmPQL(y~x,random=...,correlation=corAR1())  ;
>
> I wish someone (not me!) would sit down and figure it out. Dormann et al
> 2007 suggest on the basis of simulation studies that one can use glmmPQL
> in this way, but I don't recall that they offer much of a theoretical
> justification of what model is *actually* being fitted ...
>
> Dormann, Carsten F. et al. 2007. “Methods to Account for Spatial
> Autocorrelation in the Analysis of Species Distributional Data: a
> Review.” Ecography 30 (5): 609–628.
> doi:10.1111/j.2007.0906-7590.05171.x.
> http://dx.doi.org/10.1111/j.2007.0906-7590.05171.x.
>
>  Ben B.
>
> On 12-12-21 02:31 PM, Douglas Bates wrote:
>> <rant>
>> I realize I am coming in late to the discussion and I may end up
>> sounding like a grumpy old man spoiling everyone's fun, but I don't
>> think there is a model with the properties being discussed.  For a model
>> in which the conditional distribution of the response given the random
>> effects is Gaussian (i.e. a linear mixed-effects model or certain types
>> of nonlinear mixed-effects models as described in Pinheiro and Bates,
>> 2000) it is possible to modify the variance-covariance structure
>> separately from the conditional mean.  Doing so will also carry over to
>> the marginal distribution of the response.  But this property is not
>> general - it is quite specifically a property of the Gaussian (or
>> "normal") distribution.
>>
>> So, to me, there is no point in discussing whether the parameters in
>> this model should be estimated by quasi-likelihood (i.e. something that
>> sort-of, kind-of looks like a likelihood) or by defining a series of
>> conditional distributions for an MCMC iterative scheme or ...  This may
>> sound priggish or the argument of an abstracted theoretician with no
>> connection to actual data analysis but, if you want to define maximum
>> likelihood estimators or whatever criterion you want to use, you must
>> first have a likelihood and to get a likelihood you need a probability
>> model.  And, to the best of my knowledge, there is not probability model
>> with these properties.
>>
>> When you don't have a probability model you have moved from statistical
>> analysis to "data mining" algorithms, which is okay but you should be
>> explicit about that and not couch is as if it were a generalized linear
>> mixed model.  All you have is an algorithm - you put numbers in and you
>> get numbers out and if you like the numbers you get out then more power
>> to you
>>
>> To me this is one of the big pitfalls of generalized linear mixed models
>> - there is an assumption that they generalize all the properties of
>> linear mixed models but they don't.  There are properties of the
>> Gaussian distribution that do not generalize and for years people have
>> concentrated on the details of how to fit or approximately fit models
>> that don't exist.  Concentrating on the low-level details of the
>> mechanism without considering whether such a model could even exist is
>> bound to lead you into trouble.
>> </rant>
>>
>>
>>
>>
>> On Thu, Dec 20, 2012 at 6:59 PM, Adam Smith <raptorbio at hotmail.com
>> <mailto:raptorbio at hotmail.com>> wrote:
>>
>>
>>     I believe the gamm function in mgcv can also fit this model.  IIRC,
>>     the function also permits use of the negative binomial distribution
>>     (but you have to provide theta)...
>>     It also uses PQL, so the same concerns apply.
>>     Regards,
>>     Adam
>>
>>     > Date: Thu, 20 Dec 2012 22:02:23 +0000
>>     > From: j.hadfield at ed.ac.uk <mailto:j.hadfield at ed.ac.uk>
>>     > To: bbolker at gmail.com <mailto:bbolker at gmail.com>
>>     > CC: r-sig-mixed-models at r-project.org
>>     <mailto:r-sig-mixed-models at r-project.org>
>>     > Subject: Re: [R-sig-ME] glmm with correlated residuals
>>     >
>>     > Hi Emilio,
>>     >
>>     > asreml will fit such a model but the PQL method it uses for
>>     > approximating the likelihood can give dodgy results in some cases. For
>>     > Poisson data, particularly if it has high mean, it might be OK though.
>>     > It is fast enough that you can check the degree of bias using
>>     > simulation.
>>     >
>>     > You can fit spatial simultaneous autoregressive lag models in MCMCglmm
>>     > but only for Gaussian data. It would be very difficult for me to
>>     > implement these for  non-Gaussian data.
>>     >
>>     > AR(1) models with nugget effect could be fitted, but only if the
>>     > correlation parameter is known. The inverse of the correlation matrix
>>     > could then be passed to ginverse and the scale estimated, but this is
>>     > not that much use I guess.
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Jarrod
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > Quoting Ben Bolker <bbolker at gmail.com <mailto:bbolker at gmail.com>>
>>     on Thu, 20 Dec 2012 20:53:50
>>     > +0000 (UTC):
>>     >
>>     > > Joshua Wiley <jwiley.psych at ...> writes:
>>     > >
>>     > >>
>>     > >> Hi Emilio,
>>     > >>
>>     > >> I would suggest doing it in MCMCglmm.  It can handle residual
>>     > >> structures too.  This is a nice starting guide:
>>     > >>
>>      
>> http://cran.r-project.org/web/packages/MCMCglmm/vignettes/CourseNotes.pdf
>>     > >>
>>     > >> Cheers,
>>     > >>
>>     > >> Josh
>>     > >
>>     > >   Josh, are you sure?  I don't find "autoreg" anywhere in the
>>     course notes.
>>     > > I don't personally know of an easy way to do this other than
>>     using INLA
>>     > > (which I haven't tried much); glmmPQL (easy but perhaps dicey
>>     depending
>>     > > on the circumstances, and sometimes hard to figure out whether it's
>>     > > fitting a well-defined model); or hand-coding in
>>     WinBUGS/JAGS/Stan or
>>     > > AD Model Builder ...  You could also give up on Poisson-ness
>>     (you said
>>     > > there was heterogeneity of variance -- not quite sure what that
>>     means
>>     > > in this context?) and fit a GLS model with appropriate variance
>>     > > structure (using gls() with weights= and correlation= arguments
>>     set).
>>     > >
>>     > >   I'd love to hear other answers.
>>     > >
>>     > >   Ben
>>     > >
>>     > >
>>     > >>
>>     > >> On Thu, Dec 20, 2012 at 8:37 AM, Emilio A. Laca <ealaca at ...> wrote:
>>     > >> > Fellow R users,
>>     > >
>>     > >> > what package would you recommend for fitting a poisson glmm with
>>     > >     one random effect and residuals that are correlated (most likely
>>     > >     AR(1)) and have heterogeneity of variance?  I have successfully
>>     > >     fitted the model without addressing the structured residuals in
>>     > >     MCMCglmm.  I would appreciate it very much if you could point me
>>     > >     in the direction of an example.
>>     > >
>>     > >> > Emilio A. Laca, Professor
>>     > >
>>     > >  [snip]
>>     > >
>>     > > _______________________________________________
>>     > > R-sig-mixed-models at r-project.org
>>     <mailto:R-sig-mixed-models at r-project.org> mailing list
>>     > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>     > >
>>     > >
>>     >
>>     > --
>>     > The University of Edinburgh is a charitable body, registered in
>>     > Scotland, with registration number SC005336.
>>     >
>>     > _______________________________________________
>>     > R-sig-mixed-models at r-project.org
>>     <mailto:R-sig-mixed-models at r-project.org> mailing list
>>     > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>             [[alternative HTML version deleted]]
>>
>>     _______________________________________________
>>     R-sig-mixed-models at r-project.org
>>     <mailto:R-sig-mixed-models at r-project.org> mailing list
>>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>
>
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.