[R-sig-ME] Model selection in GAMM

Joshua Wiley jwiley.psych at gmail.com
Tue Mar 19 00:47:00 CET 2013


 Sorry, I missed the "temporal structures" bit from the OP.  Apologies
for the noise.

On Mon, Mar 18, 2013 at 4:46 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> A bit off from the other topics, but back to the original question:
> if you use gamm4, the mixed model is fitted with lme4 (lmer/glmer).
> One implication is that you can get true likelihoods via the Laplace
> approximation or numerical integration.  So at least for comparing
> models with different parametric effects, it would seem you could use
> that approach.  I am uncertain about the overall model, because you
> get back both a gam and a mer object.
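>
> A rough, untested sketch of what I mean (the data, variable names,
> and the Poisson family here are all invented for illustration):
>
>   library(gamm4)
>   ## two candidates differing only in their parametric terms
>   m1 <- gamm4(y ~ x1 + s(time), random = ~ (1 | id),
>               family = poisson, data = dat)
>   m2 <- gamm4(y ~ x1 + x2 + s(time), random = ~ (1 | id),
>               family = poisson, data = dat)
>   ## $mer is an ordinary lme4 fit, so this is a true
>   ## (Laplace-approximated) log-likelihood, not a PQL one
>   logLik(m1$mer)
>   AIC(m1$mer, m2$mer)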
>
> On Mon, Mar 18, 2013 at 6:29 AM, Ben Bolker <bbolker at gmail.com> wrote:
>> Andrew Robinson <A.Robinson at ...> writes:
>>
>>>
>>> On Sun, Mar 17, 2013 at 07:44:28PM +0000, Ben Bolker wrote:
>>> > Mari Laine <mslain at ...> writes:
>>> >
>>> >
>>> > > I'm using GAMM (mgcv::gamm) to fit non-Gaussian models with several
>>> > > fixed factors and also a temporal correlation structure.
>>> > > Additionally, I'm fitting non-linearity with smoothers.  I have
>>> > > several correlated main-effect candidates, and I would like to
>>> > > compare their GAMMs to see which of the models best fits the data.
>>> > > I would also need to tune the models by dropping unnecessary
>>> > > smoothers (ones that turn out to be linear) and non-significant
>>> > > fixed variables.
>>> >
>>> > > After going through some statistics books, I'm under the impression
>>> > > that one should not use AIC comparison of the $lme components for
>>> > > GAMM models -- is this correct?  Could someone give guidance on
>>> > > model selection in GAMM, or refer me to a book or some other
>>> > > source of information on this matter?
>>> >
>>> >   To my knowledge there are two issues here:
>>> >
>>> > (1) GAMM (mgcv::gamm with a non-Gaussian family) uses penalized
>>> > quasi-likelihood (PQL).  According to some statisticians --
>>> > including Brian Ripley, who wrote the original PQL code in
>>> > MASS::glmmPQL, which may be what gamm relies on (I don't remember;
>>> > it may incorporate its own PQL code) -- one shouldn't use
>>> > likelihood-based approaches (including AIC) with PQL algorithms,
>>> > because they don't estimate a true likelihood.  Others say it's OK
>>> > as long as you make sure to scale the likelihood to get a
>>> > quasi-likelihood before combining it with the penalty term to get
>>> > a QIC.
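>>> >
>>> > (As a small illustration -- this uses the 'bacteria' example from
>>> > ?glmmPQL -- glmmPQL itself declines to report a likelihood:
>>> >
>>> >   library(MASS)
>>> >   fit <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
>>> >                  family = binomial, data = bacteria)
>>> >   logLik(fit)  ## NA (or an error, depending on the MASS version):
>>> >                ## PQL has no true likelihood to report
>>> >
>>> > so any AIC you compute from a PQL fit has to come from somewhere
>>> > else.)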
>>>
>>> Hi Ben,
>>>
>>> I know that you're reporting third-party opinions, and you're not
>>> necessarily advocating this position yourself, but I wonder if you can
>>> provide some more information - even a link or a citation to them?
>>
>>   Hmm.  I may be in over my head here, so *please* treat anything
>> I say below as false until proven true ...
>>
>>>
>>> I'm a bit confused about how scaling the maximized penalized
>>> likelihood (MPL) can deliver something that can be treated as though
>>> it were a maximized likelihood.
>>>
>>> To my way of thinking, a point of maximizing the penalized likelihood
>>> is that you get an estimate with better statistical properties than
>>> the MLE.
>>
>>   It's been a long time since I looked at this, and I don't understand
>> it well enough, but it was my understanding that the quantity maximized
>> by PQL is an *approximation* to the marginal likelihood used in all
>> other GLMM applications (i.e. the conditional likelihood of the data
>> given fixed values of the random effects, weighted by the probability
>> of those RE values given the distribution of the REs, integrated over
>> all possible values of the REs) -- one that is less accurate than
>> Laplace approximation and Gauss-Hermite quadrature, but an approximation
>> to the same quantity nevertheless.
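>>
>>   (In symbols -- my notation, not from any particular source -- the
>> quantity being approximated is
>>
>>     L(\beta, \theta) = \int f(y \mid b, \beta) \, g(b \mid \theta) \, db
>>
>> with f the conditional likelihood of the data y given the random
>> effects b, and g the density of the REs; PQL, Laplace, and
>> Gauss-Hermite quadrature are just increasingly accurate ways of
>> approximating this integral.)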
>>
>>   So I thought the idea was not (in this case) to get an estimate
>> with better properties than the MLE, but to get an estimate of the
>> marginal likelihood at all ...
>>
>>> It is overwhelmingly likely that this MPL estimate will differ from
>>> the MLE.  It's hard for me to imagine a way that the PL function can
>>> be scaled so that the MPLE can be treated as though it were an MLE.
>>> If the MPLE isn't the same as the MLE, then the likelihood can't be
>>> at its maximum at the MPLE.  Further, the PL function will (I intuit)
>>> have a different shape at its optimum than the likelihood would at
>>> its own optimum.
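>>>
>>> (Concretely -- writing the penalized log-likelihood in the usual
>>> smoothing form, which I assume is what's being maximized here --
>>>
>>>   \ell_p(\beta) = \ell(\beta) - \tfrac{1}{2} \sum_j \lambda_j \beta^T S_j \beta,
>>>
>>> the maximizer of \ell_p coincides with the MLE only when all the
>>> \lambda_j are zero.)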
>>
>>     For one quick practical counterexample, glmmPQL generally gives
>> _similar_ answers to glmer/glmmADMB etc. (i.e. methods based on
>> Laplace or better approximations) for large data sets and those
>> where the counts per individual sample are large ...
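>>
>> An untested sketch of that kind of check, again using the 'bacteria'
>> data from MASS:
>>
>>   library(MASS)   ## glmmPQL
>>   library(lme4)   ## glmer (Laplace approximation by default)
>>   pql <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
>>                  family = binomial, data = bacteria)
>>   lap <- glmer(y ~ trt + I(week > 2) + (1 | ID),
>>                family = binomial, data = bacteria)
>>   ## the fixed-effect estimates should be broadly similar
>>   cbind(PQL = fixef(pql), Laplace = fixef(lap))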
>>>
>>> So, is there theory to suggest that the properties of the MLE that are
>>> relied upon by the various measures of information are retained by the
>>> MPLE?  And if so, even then, wouldn't those properties depend on the
>>> nature of the penalization?
>>
>>    I would love to know the answer.  I hope someone more knowledgeable,
>> or with much more time to spend figuring this out, comes forward.
>>
>>   Ben
>>
>>
>>>
>>> > (2) As I recall, it's a little tricky to figure out which
>>> > components of a GAMM fit contain which kinds of information.
>>> > In particular, it's not clear whether the likelihood/AIC reported
>>> > in the lme component of the fit really reflects an appropriate
>>> > (quasi)likelihood/IC for the full model; I believe there's a lot
>>> > of detail on this in the ?gamm help page, in particular:
>>> >
>>> > > For example,
>>> > > unlike ‘glmmPQL’ from ‘MASS’ it will return the complete ‘lme’
>>> > > object from the working model at convergence of the PQL iteration,
>>> > > including the `log likelihood', even though this is not the
>>> > > likelihood of the fitted GAMM.
>>> >
>>> > which suggests you shouldn't use that log-likelihood ...
>>> >
>>> >   I would look for further information in Simon Wood's excellent book on
>>> > generalized additive models.
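>>> >
>>> > (A quick untested sketch -- data and variable names invented -- of
>>> > where the trap is:
>>> >
>>> >   library(mgcv)
>>> >   fit <- gamm(y ~ x1 + s(time), random = list(id = ~ 1),
>>> >               family = poisson, data = dat)
>>> >   logLik(fit$lme)  ## log-likelihood of the *working* mixed model
>>> >                    ## at the last PQL step, NOT of the fitted GAMM;
>>> >                    ## don't build an AIC on it
>>> >
>>> > per the help-page warning quoted above.)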
>>> >
>>>
>>



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com


