[R-sig-ME] compare fit of GLMM with different link/family

Ben Bolker bbolker at gmail.com
Wed Jan 26 03:09:09 CET 2022


   I mostly agree.

   I would say that in general it's OK to compare models with different 
links, families, etc. via AIC *as long as you don't explicitly transform 
the response variable* -- i.e. you have to be careful comparing

   lm(log(y) ~ ...)

with

   lm(y ~ ...)

(you need a Jacobian term in the AIC expression to account for the 
change in scaling of the density), but comparing basically

   glm(y ~ ... , family = <anything>)

should be OK. That said, there is a strong minority view (Phillip may 
belong to this group) that says that using AIC to compare non-nested 
models is *not* OK: e.g. see 
https://stats.stackexchange.com/questions/116935/comparing-non-nested-models-with-aic/116951#116951
https://mathoverflow.net/questions/249448/use-of-akaike-information-criterion-with-nonnested-models

  (Unfortunately, really understanding why this should or should not 
work depends, I think, on understanding the rates of convergence of 
certain asymptotic expressions ...)
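
   To make the log-transform comparison above concrete, here is a minimal 
sketch of the Jacobian adjustment (the simulated data are purely 
illustrative; the point is the 2*sum(log(y)) offset):

   ## if Z = log(Y), then f_Y(y) = f_Z(log y)/y, so the log-likelihood of
   ## the log-scale model on the original scale is logLik - sum(log(y)),
   ## i.e. its AIC picks up an extra 2*sum(log(y))
   set.seed(101)
   x <- runif(100)
   y <- exp(1 + 2*x + rnorm(100, sd = 0.3))  ## log-normal response
   m_raw <- lm(y ~ x)
   m_log <- lm(log(y) ~ x)
   AIC(m_raw)                     ## already on the scale of y
   AIC(m_log) + 2*sum(log(y))     ## Jacobian-adjusted, now comparable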

   I completely agree with Phillip on the rest, though, which is to say 
that you should think about **why** you want to test all these different 
cases. It's unlikely you're going to be able to frame *scientific* 
hypotheses in terms of these different models ("is it better to measure 
consumption in gallons per mile or miles per gallon?"). If you're purely 
interested in prediction, then I think AIC will often be an adequate 
approximation to something based on cross-validation (but it would be 
good to check with CV). On the other hand, if prediction really is all 
you care about, you might want to move in the direction of nonparametric 
models such as GAMs, which should make many of the distinctions between 
links irrelevant ...
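
   For the probit vs. logit example in the question below, such a CV check 
might look roughly like this (just a sketch: cv_brier is an ad hoc helper, 
not a package function; the fold assignment, the Brier score, and the 
assumption that Y is coded 0/1 are illustrative choices, and for grouped 
data you might prefer to hold out whole subjects rather than single rows):

   library(lme4)
   set.seed(101)
   k <- 5
   fold <- sample(rep(1:k, length.out = nrow(data)))
   form <- Y ~ A + B + C*D + (A | subjects)
   cv_brier <- function(fam) {
       err <- numeric(k)
       for (i in 1:k) {
           fit <- glmer(form, data = data[fold != i, ], family = fam)
           p <- predict(fit, newdata = data[fold == i, ],
                        type = "response", allow.new.levels = TRUE)
           err[i] <- mean((data$Y[fold == i] - p)^2)  ## Brier score
       }
       mean(err)
   }
   cv_brier(binomial(link = "probit"))
   cv_brier(binomial(link = "logit"))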




On 1/25/22 12:56 PM, Phillip Alday wrote:
> 
> On 25/1/22 11:04 am, Dries Debeer via R-sig-mixed-models wrote:
>> Dear,
>>
>>
>> I have a question about comparing the fit of GLMMs with different link functions/families.
>>
>> For instance, can the deviance or the AIC be used to compare the fit of probit and logit with the same parametrization?
>>
>> probit_model <- glmer(Y ~ A + B + C*D + (A | subjects), data = data, family = binomial(link = "probit"))
>> logit_model <- glmer(Y ~ A + B + C*D + (A | subjects), data = data, family = binomial(link = "logit"))
> 
> This is a surprisingly tough question, in my opinion. Neither the AIC
> nor the deviance depends on the link itself, so in theory, you could
> compare them ... but these models are not nested, and comparing
> non-nested models is generally a tricky problem.  That said, probit and
> logit models will tend to give very similar results in terms of
> predictions/fit to the data. The bigger difference is how you interpret
> coefficients, so I would choose between probit and logit based on desired
> interpretation.
> 
> For other families/links, the comparison can get even more difficult.
> For example, if you compare an inverse link with an identity link, then
> you are comparing two very different albeit related quantities -- like
> comparing a model of "speed" vs "time".
> 
>>
>>
>> And is this also possible when the distributional assumptions are different? For instance:
>>
>> gamma_model <- glmer(X ~ A + B + C*D + (A | subjects), data = data, family = Gamma(link = "inverse"))
>> inverse_gauss <- glmer(X ~ A + B + C*D + (A | subjects), data = data, family = inverse.gaussian(link = "1/mu^2"))
> 
> Not really, no. Both the deviance and the AIC are functions of the log
> likelihood and the choice of family corresponds to a choice of
> likelihood, so you're comparing different things.
> 
> Depending on what you're going for, looking at predictive power of the
> models directly -- such as looking at mean squared or mean absolute
> error computed with cross validation -- might work.
> 
> That said, the choice of family is a statement about your assumptions
> and prior beliefs about the data. In a Bayesian context, McElreath has
> described this as a "prior about the data" in Statistical Rethinking.
> Gelman et al. have also noted that the prior can only be understood in
> the context of the likelihood -- all hinting at the core idea here,
> namely that the family is an assumption about the conditional
> distribution of your data (or equivalently, about the distribution
> of the error/noise in your data).
> 
> My previous point about the choice of link changing interpretation also
> holds for changes in link accompanying changes in family -- the
> statements you can make about your data based on an inverse link vs an
> inverse square link are different.
> 
> I would be happy to hear other opinions here.
> 
> Hope that helps,
> Phillip
> 
>>
>> Thank you!
>> Dries Debeer
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics


