[R-sig-ME] calculation of AIC

Ben Bolker bolker at ufl.edu
Sun Feb 1 05:30:16 CET 2009



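  I believe the issue is in how glm output is printed, not in the
calculation: print.glm rounds the deviances and the AIC to a limited
number of significant digits (4 by default).  Take a look at print.glm
and consider the following: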

> format(signif(83245.1,4))
[1] "83250"
> format(signif(83249.4,4))
[1] "83250"
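
To make the rounding step explicit, here is a minimal sketch that recomputes both AIC values from the log-likelihoods quoted below and formats them the way a print method would. The variable names are illustrative only, and the default-digits expression is my reading of print.glm's default argument, so check it against your R version.

```r
# Log-likelihoods and parameter counts quoted in the original post
loglik_con    <- -41623.70   # Sex ~ 1
loglik_conage <- -41620.55   # Sex ~ 1 + Gest_Age
k_con    <- 1                # intercept only
k_conage <- 2                # intercept + Gest_Age

# AIC = -2 * logLik + 2 * (number of estimated parameters)
aic_con    <- -2 * loglik_con    + 2 * k_con
aic_conage <- -2 * loglik_conage + 2 * k_conage

# Assumed default number of significant digits used when printing
# (4 when options(digits = 7), the usual session default)
default_digits <- max(3L, getOption("digits") - 3L)

# At the default precision both values print as the same string
rounded_default <- format(signif(c(aic_con, aic_conage), default_digits))

# With 7 significant digits the difference is visible again
rounded_full <- format(signif(c(aic_con, aic_conage), 7))

print(rounded_default)
print(rounded_full)
```

On the fitted models themselves, AIC(conageGLM) returns the unrounded value, and print(conageGLM, digits = 7) should show more digits in the summary line.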


orzack wrote:
> I am puzzled by the output of AIC values for glm (yes, this is not 
> strictly a mixed model question, except as a special case) but I ask 
> it here anyway. My apologies in advance if this has been raised 
> before and resolved.
> 
> I fit a binomial GLM with a constant:
> 
>>  conGLM
> 
> Call:  glm(formula = Sex ~ 1, family = binomial, data = CVS_GG.df, 
> subset = CVS_GG_GA.NE.NA_sel)
> 
> Coefficients:
> (Intercept) 
>      0.05324 
> 
> Degrees of Freedom: 60080 Total (i.e. Null);  60080 Residual
> Null Deviance:	    83250
> Residual Deviance: 83250	AIC: 83250
> 
> Note the AIC value.
> 
> I next fit a binomial GLM with a constant and a covariate:
> 
>>  conageGLM
> 
> Call:  glm(formula = Sex ~ 1 + Gest_Age, family = binomial, data = 
> CVS_GG.df,      subset = CVS_GG_GA.NE.NA_sel)
> 
> Coefficients:
> (Intercept)     Gest_Age 
>     -0.21787      0.02314 
> 
> Degrees of Freedom: 60080 Total (i.e. Null);  60079 Residual
> Null Deviance:	    83250
> Residual Deviance: 83240	AIC: 83250
> 
> Note the AIC value. The two models produce the same AIC value.
> 
> 
> When I calculate the AIC values "by hand" I get
> 
> con AIC = -2loglikelihood + 2n = -2*-41623.70 + 2 = 83249.4
> conage AIC = -2*-41620.55 + 4 = 83245.1
> 
> It appears that the AIC value produced by glm for conGLM differs from 
> the hand value due only to rounding. So far, so good. BUT, for 
> conageGLM the glm and hand values of AIC differ (83250 vs. 83245) by 
> more than 2 units. This cannot (should not!) be rounding error.
> 
> If I ask for the AIC values directly, I get the hand values, save for 
> trivial differences:
> 
>>  AIC(conGLM)
> [1] 83249.39
> 
>>  AIC(conageGLM)
> [1] 83245.1
> 
> Finally, stepAIC produces the hand values (remembering to read the 
> output correctly, i.e., <none> denotes the unchanged model, not the 
> simplest model):
> 
>>  stepAIC(conageGLM)
> Start:  AIC=83245.1
> Sex ~ 1 + Gest_Age
> 
>            Df Deviance   AIC
> <none>           83241 83245
> - Gest_Age  1    83247 83249
> 
> Call:  glm(formula = Sex ~ 1 + Gest_Age, family = binomial, data = 
> CVS_GG.df,      subset = CVS_GG_GA.NE.NA_sel)
> 
> Coefficients:
> (Intercept)     Gest_Age 
>     -0.21787      0.02314 
> 
> Degrees of Freedom: 60080 Total (i.e. Null);  60079 Residual
> Null Deviance:	    83250
> Residual Deviance: 83240	AIC: 83250
> 
> Note the last value of AIC, which is equal to the value produced by 
> the direct glm call.
> 
> SO, given this, for conageGLM why are the AIC value produced by glm 
> and the AIC value produced by hand (or by AIC or by stepAIC) not 
> equal?
> 
> The simplest explanation is that glm always returns the AIC of the 
> constant model (e.g., conGLM) and not the AIC associated with the 
> model being fitted if it is different from the constant model (e.g., 
> conageGLM). Is this explanation correct? If so, this just so happens 
> to be something I have never seen noted and in fact, it would 
> contradict my previous usage of glm, where it appears that a call to 
> glm returns the AIC for the model being fitted........
> 
> So, what is the explanation?
> 
> 
> many thanks,
> 
> S.


-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc






