[R-sig-ME] calculation of AIC

Sun Feb 1 17:52:08 CET 2009

>   I believe the issue is in the output of glm, not the
>calculation.  Take a look at print.glm and consider the following:
>
>>  format(signif(83245.1,4))
>[1] "83250"
>>  format(signif(83249.4,4))
>[1] "83250"

>I haven't ever encountered this, but looks like the output is rounding
>to four significant figures. All your degrees of freedom, null
>deviance, residual deviance, and AIC are rounded to four significant
>figures. I'd directly call the degrees of freedom, null deviance, and
>residual deviance (don't know the code, but like you did for AIC(glm
>object)), and see if they are indeed rounded...
>

Dear Ben and Ben,
   Many thanks for your help. (How probable is it that both 
responses would come from Bens?)

Indeed, I think you are both correct that the issue is the rounding 
in the printed output of glm, not the calculation itself. For what 
it is worth, this does speak to how even a well-vetted (and 
wonderful!) function like glm can generate an "issue".

Of course, the "issue" here is that the AIC values are large enough 
that the default number of significant digits produces rounded values 
that are "equal" even though the "real" values differ by more than 2 
support units. The danger is that most people would not go to the 
extra length of questioning the printed values and calculating the 
AIC values separately. One response is "garbage in, garbage out": 
the user is responsible and should always question what comes out 
after pushing the button that starts a black-box calculation. True 
enough, but I can't find any documentation for glm that mentions 
this rounding OR how one might change the default rounding in glm.
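
The effect can be reproduced without fitting anything. This is only a 
minimal sketch: it assumes that print.glm formats the AIC as 
format(signif(aic, digits)) with digits = max(3, getOption("digits") - 3), 
which is what the quoted example suggests. The two AIC values are the 
illustrative numbers from the quoted reply, not values from my models, 
and the names aic_a, aic_b, shown_a and shown_b are made up here.

```r
# Illustrative AIC values taken from the quoted example; not real model output.
aic_a <- 83245.1
aic_b <- 83249.4

# Digits assumed to be used by print.glm when options(digits = 7), the R default.
d <- max(3L, 7L - 3L)

# What the printed output would show under that assumption.
shown_a <- format(signif(aic_a, d))
shown_b <- format(signif(aic_b, d))

shown_a == shown_b   # TRUE: the printed values are indistinguishable
aic_b - aic_a        # the unrounded difference, well above 2 units
```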

Speaking of this, does anybody know how to change the default 
rounding for glm (and lmer) OR for an R session in general (e.g., so 
that a regular call to glm would generate AIC values with more 
digits)?
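
One possible route, sketched here under assumptions rather than taken 
from the glm documentation: print.glm appears to accept a digits 
argument whose default is max(3, getOption("digits") - 3), so passing 
digits to print() or raising options(digits) should show more figures, 
and the extractor functions return unrounded values in any case. The 
toy data and the name fit below are hypothetical, and I have not 
checked whether the print method for lmer fits behaves the same way.

```r
# Hypothetical toy data; any fitted glm object would do in place of `fit`.
set.seed(1)
x <- rnorm(200)
y <- rpois(200, lambda = exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)

# Unrounded values, extracted directly from the fitted object.
AIC(fit)
deviance(fit)        # residual deviance
fit$null.deviance    # null deviance
df.residual(fit)     # residual degrees of freedom
fit$df.null          # null degrees of freedom

# Ask the print method for more significant digits, for this call only.
print(fit, digits = 8)

# Or raise the session-wide default; print.glm would then use 10 - 3 = 7.
old <- options(digits = 10)
print(fit)
options(old)         # restore the previous setting
```

The per-call digits argument leaves the rest of the session untouched, 
whereas options(digits = ...) changes how every number is printed until 
it is reset.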

Finally, perhaps you are wondering how meaningful it is to use the 2 
support unit change in AIC to decide between models when the AIC 
values themselves are so large. To this extent, one might think that 
the rounding convention in glm was implemented with this in mind, 
i.e., "we, the makers of glm, are making sure that you, the user, 
do not use a very small difference for model decision-making when 
the AIC values are so large." Perhaps. But probably not, especially 
given the explicit discussion in Burnham and Anderson (2002, page 71) 
of the meaning of relying on small AIC differences when the AIC 
values themselves are large. They write:

People are often surprised that [differences of AIC values] of only 
1-10 are very important, when the associated AIC values that led to 
the difference are on the order of 97,000 or 243,000.

They go on to write in bold:

It is not the absolute size of the AIC values, it is the relative 
values, and particularly the AIC differences that are important.

Any help is much appreciated.

S.
-- 
Steven Orzack

The Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA. 02140
617 864-4307

www.freshpond.org