[R-sig-ME] glmm with correlated residuals
Jarrod Hadfield
j.hadfield at ed.ac.uk
Sat Dec 22 11:03:06 CET 2012
Hi,
At least for asreml it is possible to fit the AR(1) process in the
G-structure and hence fit the model described, but I agree that it is
not clear what a glmmPQL model is fitting. asreml seems to
underestimate sigma when rho is small, which is to be anticipated, but
rho seems to be well estimated.
rAR1<-function(n,rho, sigma){
# function for simulating from an AR1 process (think this works!)
L<-Diagonal(n)-bandSparse(n,n,-1,list(rep(rho,n-1)))
y<-rnorm(n, 0, sqrt(1-rho^2))
solve(L,y)@x*sigma
}
eta<-rAR1(1000,rho=0.5,sigma=1)
y<-rpois(1000, exp(eta))
m1<-asreml(y~1, random=~ar1v(units), family=asreml.poisson())
summary(m1)$varcomp
gamma component std.error z.ratio constraint
units!units.cor 0.4854854 0.4854854 0.04923992 9.85959 Unconstrained
units!units.var 0.8077873 0.8077873 0.07291496 11.07849 Positive
R!variance 1.0000000 1.0000000 NA NA Fixed
Cheers,
Jarrod
Quoting Ben Bolker <bbolker at gmail.com> on Fri, 21 Dec 2012 15:05:58 -0500:
> The model I had in mind was hierarchical:
>
> Y_i ~ Poisson(lambda_i)
> lambda_i = exp(eta_i)
> eta ~ MVN(mu=X beta, Sigma=sigma^2*C(rho))
>
> where C(rho) is [for example] the correlation matrix corresponding to an
> AR1 model with correlation rho. So the variance-covariance structure of
> Y is *not* AR1 (but presumably there's some reasonable one-to-one
> relationship between the variance-covariance structures of eta and of Y
> -- I'm guessing that the correlation is just diluted by the additional
> Poisson variance; this might be reasonable to approach with a GEE for
> the marginal variance-covariance structure). It also captures (I think)
> the *biological* goal, which is to characterize count data where there
> is evidently serial autocorrelation [which may be of primary interest,
> or a statistical nuisance to be accounted for].
>
> I think that's perfectly well-defined and sensible, and
> straightforward to estimate with Bayesian or frequentist hierarchical
> modeling tools, although I'm happy to be corrected ...
>
> I completely agree that it's not clear what glmmPQL and friends are
> actually doing when one specifies a model like
>
> glmmPQL(y~x,random=...,correlation=corAR1()) ;
>
> I wish someone (not me!) would sit down and figure it out. Dormann et al
> 2007 suggest on the basis of simulation studies that one can use glmmPQL
> in this way, but I don't recall that they offer much of a theoretical
> justification of what model is *actually* being fitted ...
>
> Dormann, Carsten F. et al. 2007. “Methods to Account for Spatial
> Autocorrelation in the Analysis of Species Distributional Data: a
> Review.” Ecography 30 (5): 609–628.
> doi:10.1111/j.2007.0906-7590.05171.x.
> http://dx.doi.org/10.1111/j.2007.0906-7590.05171.x.
>
> Ben B.
>
> On 12-12-21 02:31 PM, Douglas Bates wrote:
>> <rant>
>> I realize I am coming in late to the discussion and I may end up
>> sounding like a grumpy old man spoiling everyone's fun, but I don't
>> think there is a model with the properties being discussed. For a model
>> in which the conditional distribution of the response given the random
>> effects is Gaussian (i.e. a linear mixed-effects model or certain types
>> of nonlinear mixed-effects models as described in Pinheiro and Bates,
>> 2000) it is possible to modify the variance-covariance structure
>> separately from the conditional mean. Doing so will also carry over to
>> the marginal distribution of the response. But this property is not
>> general - it is quite specifically a property of the Gaussian (or
>> "normal") distribution.
>>
>> So, to me, there is no point in discussing whether the parameters in
>> this model should be estimated by quasi-likelihood (i.e. something that
>> sort-of, kind-of looks like a likelihood) or by defining a series of
>> conditional distributions for an MCMC iterative scheme or ... This may
>> sound priggish or the argument of an abstracted theoretician with no
>> connection to actual data analysis but, if you want to define maximum
>> likelihood estimators or whatever criterion you want to use, you must
>> first have a likelihood and to get a likelihood you need a probability
>> model. And, to the best of my knowledge, there is not probability model
>> with these properties.
>>
>> When you don't have a probability model you have moved from statistical
>> analysis to "data mining" algorithms, which is okay but you should be
>> explicit about that and not couch is as if it were a generalized linear
>> mixed model. All you have is an algorithm - you put numbers in and you
>> get numbers out and if you like the numbers you get out then more power
>> to you
>>
>> To me this is one of the big pitfalls of generalized linear mixed models
>> - there is an assumption that they generalize all the properties of
>> linear mixed models but they don't. There are properties of the
>> Gaussian distribution that do not generalize and for years people have
>> concentrated on the details of how to fit or approximately fit models
>> that don't exist. Concentrating on the low-level details of the
>> mechanism without considering whether such a model could even exist is
>> bound to lead you into trouble.
>> </rant>
>>
>>
>>
>>
>> On Thu, Dec 20, 2012 at 6:59 PM, Adam Smith <raptorbio at hotmail.com
>> <mailto:raptorbio at hotmail.com>> wrote:
>>
>>
>> I believe the gamm function in mgcv can also fit this model. IIRC,
>> the function also permits use of the negative binomial distribution
>> (but you have to provide theta)...
>> It also uses PQL, so the same concerns apply.
>> Regards,
>> Adam
>>
>> > Date: Thu, 20 Dec 2012 22:02:23 +0000
>> > From: j.hadfield at ed.ac.uk <mailto:j.hadfield at ed.ac.uk>
>> > To: bbolker at gmail.com <mailto:bbolker at gmail.com>
>> > CC: r-sig-mixed-models at r-project.org
>> <mailto:r-sig-mixed-models at r-project.org>
>> > Subject: Re: [R-sig-ME] glmm with correlated residuals
>> >
>> > Hi Emilio,
>> >
>> > asreml will fit such a model but the PQL method it uses for
>> > approximating the likelihood can give dodgy results in some cases. For
>> > Poisson data, particularly if it has high mean, it might be OK though.
>> > It is fast enough that you can check the degree of bias using
>> > simulation.
>> >
>> > You can fit spatial simultaneous autoregressive lag models in MCMCglmm
>> > but only for Gaussian data. It would be very difficult for me to
>> > implement these for non-Gaussian data.
>> >
>> > AR(1) models with nugget effect could be fitted, but only if the
>> > correlation parameter is known. The inverse of the correlation matrix
>> > could then be passed to ginverse and the scale estimated, but this is
>> > not that much use I guess.
>> >
>> >
>> > Cheers,
>> >
>> > Jarrod
>> >
>> >
>> >
>> >
>> >
>> > Quoting Ben Bolker <bbolker at gmail.com <mailto:bbolker at gmail.com>>
>> on Thu, 20 Dec 2012 20:53:50
>> > +0000 (UTC):
>> >
>> > > Joshua Wiley <jwiley.psych at ...> writes:
>> > >
>> > >>
>> > >> Hi Emilio,
>> > >>
>> > >> I would suggest doing it in MCMCglmm. It can handle residual
>> > >> structures too. This is a nice starting guide:
>> > >>
>>
>> http://cran.r-project.org/web/packages/MCMCglmm/vignettes/CourseNotes.pdf
>> > >>
>> > >> Cheers,
>> > >>
>> > >> Josh
>> > >
>> > > Josh, are you sure? I don't find "autoreg" anywhere in the
>> course notes.
>> > > I don't personally know of an easy way to do this other than
>> using INLA
>> > > (which I haven't tried much); glmmPQL (easy but perhaps dicey
>> depending
>> > > on the circumstances, and sometimes hard to figure out whether it's
>> > > fitting a well-defined model); or hand-coding in
>> WinBUGS/JAGS/Stan or
>> > > AD Model Builder ... You could also give up on Poisson-ness
>> (you said
>> > > there was heterogeneity of variance -- not quite sure what that
>> means
>> > > in this context?) and fit a GLS model with appropriate variance
>> > > structure (using gls() with weights= and correlation= arguments
>> set).
>> > >
>> > > I'd love to hear other answers.
>> > >
>> > > Ben
>> > >
>> > >
>> > >>
>> > >> On Thu, Dec 20, 2012 at 8:37 AM, Emilio A. Laca <ealaca at ...> wrote:
>> > >> > Fellow R users,
>> > >
>> > >> > what package would you recommend for fitting a poisson glmm with
>> > > one random effect and residuals that are correlated (most likely
>> > > AR(1)) and have heterogeneity of variance? I have successfully
>> > > fitted the model without addressing the structured residuals in
>> > > MCMCglmm. I would appreciate it very much if you could point me
>> > > in the direction of an example.
>> > >
>> > >> > Emilio A. Laca, Professor
>> > >
>> > > [snip]
>> > >
>> > > _______________________________________________
>> > > R-sig-mixed-models at r-project.org
>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> > >
>> > >
>> >
>> > --
>> > The University of Edinburgh is a charitable body, registered in
>> > Scotland, with registration number SC005336.
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org
>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org
>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>
>
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list