[R-sig-ME] calculation of AIC
Ben Bolker
bolker at ufl.edu
Sun Feb 1 05:30:16 CET 2009
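I believe the issue lies in how the glm output is printed, not in the AIC
calculation itself: print.glm rounds the displayed values to 4 significant
digits by default. Take a look at print.glm and consider the following: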
> format(signif(83245.1,4))
[1] "83250"
> format(signif(83249.4,4))
[1] "83250"
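
The same point can be sketched a little more fully. This is only an illustration: the two AIC values are the hand-computed ones from the quoted message, and `digits <- 4L` assumes the default R option `digits = 7`, for which print.glm uses 4 significant digits. The variable names are made up for the example.

```r
# Hand-computed AIC values taken from the quoted message
aic_con    <- 83249.4   # Sex ~ 1
aic_conage <- 83245.1   # Sex ~ 1 + Gest_Age

# Assumption: default options, where print.glm uses
# max(3, getOption("digits") - 3) = 4 significant digits
digits <- 4L

# What the printed summary shows for each model
shown_con    <- format(signif(aic_con, digits))
shown_conage <- format(signif(aic_conage, digits))

# The underlying values still differ by about 4.3 units
delta <- aic_con - aic_conage

print(c(con = shown_con, conage = shown_conage))
print(delta)
```

Both values display as "83250" even though they differ by more than 4 units, so the printed summary cannot be used to compare models this close together.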
orzack wrote:
> I am puzzled by the output of AIC values for glm (yes, this is not
> strictly a mixed model question, except as a special case) but I ask
> it here anyway. My apologies in advance if this has been raised
> before and resolved.
>
> I fit a binomial GLM with a constant:
>
>> conGLM
>
> Call: glm(formula = Sex ~ 1, family = binomial, data = CVS_GG.df,
> subset = CVS_GG_GA.NE.NA_sel)
>
> Coefficients:
> (Intercept)
> 0.05324
>
> Degrees of Freedom: 60080 Total (i.e. Null); 60080 Residual
> Null Deviance: 83250
> Residual Deviance: 83250 AIC: 83250
>
> Note the AIC value.
>
> I next fit a binomial GLM with a constant and a covariate:
>
>> conageGLM
>
> Call: glm(formula = Sex ~ 1 + Gest_Age, family = binomial, data =
> CVS_GG.df, subset = CVS_GG_GA.NE.NA_sel)
>
> Coefficients:
> (Intercept) Gest_Age
> -0.21787 0.02314
>
> Degrees of Freedom: 60080 Total (i.e. Null); 60079 Residual
> Null Deviance: 83250
> Residual Deviance: 83240 AIC: 83250
>
> Note the AIC value. The two models produce the same AIC value.
>
>
> When I calculate the AIC values "by hand" I get
>
> con AIC = -2*loglikelihood + 2*n = -2*(-41623.70) + 2 = 83249.4
> conage AIC = -2*(-41620.55) + 4 = 83245.1
>
> It appears that the AIC value produced by glm for conGLM differs from
> the hand value due only to rounding. So far, so good. BUT, for conageGLM
> the glm and hand values of AIC differ (83250 vs. 83245) by more than 2
> units. This cannot (should not!) be rounding error.
>
> If I ask for the AIC values directly, I get the hand values, save for
> trivial differences:
>
>> AIC(conGLM)
> [1] 83249.39
>
>> AIC(conageGLM)
> [1] 83245.1
>
> Finally, stepAIC produces the hand values (remembering to read the
> output correctly, i.e., <none> denotes the unchanged model, not the
> simplest model)
>
>> stepAIC(conageGLM)
> Start: AIC=83245.1
> Sex ~ 1 + Gest_Age
>
> Df Deviance AIC
> <none> 83241 83245
> - Gest_Age 1 83247 83249
>
> Call: glm(formula = Sex ~ 1 + Gest_Age, family = binomial, data =
> CVS_GG.df, subset = CVS_GG_GA.NE.NA_sel)
>
> Coefficients:
> (Intercept) Gest_Age
> -0.21787 0.02314
>
> Degrees of Freedom: 60080 Total (i.e. Null); 60079 Residual
> Null Deviance: 83250
> Residual Deviance: 83240 AIC: 83250
>
> Note the last value of AIC, which is equal to the value produced by
> the direct glm call.
>
> SO, given this, for conageGLM why are the AIC value produced by glm
> and the AIC value produced by hand (or by AIC or by stepAIC) not
> equal?
>
> The simplest explanation is that glm always returns the AIC of the
> constant model (e.g., conGLM) and not the AIC associated with the
> model being fitted if it is different from the constant model (e.g.,
> conageGLM). Is this explanation correct? If so, this just so happens
> to be something I have never seen noted and in fact, it would
> contradict my previous usage of glm, where it appears that a call to
> glm returns the AIC for the model being fitted........
>
> So, what is the explanation?
>
>
> many thanks,
>
> S.
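
To check that the stored AIC is the full-precision value and only the printed one is rounded, something like the following sketch may help. The data here are simulated (I do not have CVS_GG.df), so `x`, `y`, the sample size and the coefficients are all hypothetical; only glm, logLik, AIC and the `digits` argument of print are standard R.

```r
# Simulated stand-in for the real data (hypothetical values throughout)
set.seed(1)
n <- 5000
x <- runif(n, 10, 40)                        # made-up covariate
y <- rbinom(n, 1, plogis(-0.2 + 0.02 * x))   # made-up binary response

fit <- glm(y ~ x, family = binomial)

# AIC "by hand": -2 * log-likelihood + 2 * (number of parameters)
ll       <- logLik(fit)
k        <- attr(ll, "df")
aic_hand <- -2 * as.numeric(ll) + 2 * k

# Full-precision values stored in, or computed from, the fit
aic_fun    <- AIC(fit)
aic_stored <- fit$aic

print(c(hand = aic_hand, AIC = aic_fun, stored = aic_stored))

# Asking print for more significant digits shows the unrounded AIC
print(fit, digits = 7)
```

All three numbers should agree; the 4-digit value in the default printout is just a display of the same quantity.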
--
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc