[R-sig-eco] glm-model evaluation

Fri May 30 00:24:24 CEST 2008

The selMod function in package pgirmess will produce AIC/AICc tables
such as those suggested in Anderson 2001 (cited on the help page).  It
runs with objects produced by glm.nb (but I don't have any knowledge as
to whether it is a sensible approach with the glm.nb models).

library(MASS)
example(glm.nb)    # fits quine.nb1, quine.nb2 and quine.nb3 on the quine data
library(pgirmess)
selMod(list(quine.nb1, quine.nb2, quine.nb3))    # AIC/AICc comparison table
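
For anyone who wants to see what goes into a table of that kind, here is a
minimal sketch that computes AIC, AICc, delta AICc and Akaike weights by hand
for the same three models.  It is not the pgirmess code, and selMod's output
may differ in layout and detail.  The object names (mods, tab) are just
illustrative, and taking n as the number of rows in quine and k as the df
reported by logLik (which counts theta) are my assumptions.

```r
library(MASS)

# The three models that example(glm.nb) fits on the quine data
quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
quine.nb2 <- update(quine.nb1, . ~ . + Sex:Age:Lrn)
quine.nb3 <- update(quine.nb2, Days ~ .^4)
mods <- list(nb1 = quine.nb1, nb2 = quine.nb2, nb3 = quine.nb3)

n <- nrow(quine)                                        # sample size (assumed)
k <- sapply(mods, function(m) attr(logLik(m), "df"))    # parameters, incl. theta
aic <- sapply(mods, AIC)
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)             # small-sample correction
delta <- aicc - min(aicc)                               # difference from best model
weight <- exp(-delta / 2) / sum(exp(-delta / 2))        # Akaike weights

tab <- data.frame(k = k, AIC = aic, AICc = aicc, delta = delta, weight = weight)
tab[order(tab$AICc), ]
```

The weights sum to 1 across the candidate set, so they only say something
about these models relative to each other.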

As for the original question, I think that diagnostics beyond just
reporting a GoF for the global model are important, and I agree with
Ben's suggestions.  Also, I'd add that showing predictive ability is
very important if the goal of the modeling process is to make
predictions (and even if it's not, showing predictive ability provides
support for the model).  Frank Harrell has tools in the Design package
for efficient internal validation and calibration via the bootstrap
(see the 'validate' and 'calibrate' functions), but these will not work
on a model produced by glm.nb.  However, it's easy to code a
cross-validation in R, and I believe MASS shows a 10-fold
cross-validation for the cpus example.
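
A rough sketch of a hand-coded 10-fold cross-validation for a glm.nb fit
follows.  It is not the MASS example; the main-effects formula, the seed and
the use of RMSE as the error measure are arbitrary choices of mine for
illustration, and the names folds, cv_pred and rmse_cv are made up.

```r
library(MASS)

set.seed(1)                              # fold assignment is random
k <- 10
# Assign each row of quine to one of k folds of near-equal size
folds <- sample(rep(seq_len(k), length.out = nrow(quine)))

cv_pred <- numeric(nrow(quine))
for (i in seq_len(k)) {
  held_out <- folds == i
  # Refit on the other k - 1 folds (illustrative main-effects model)
  fit <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine[!held_out, ])
  # Predict expected counts for the held-out rows
  cv_pred[held_out] <- predict(fit, newdata = quine[held_out, ],
                               type = "response")
}

# Out-of-sample root mean squared error on the count scale
rmse_cv <- sqrt(mean((quine$Days - cv_pred)^2))
rmse_cv
```

Comparing rmse_cv with the same quantity computed from the full-data fit
gives a feel for how optimistic the in-sample error is.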

Kingsford Jones

On Thu, May 29, 2008 at 2:33 PM, Ben Bolker <bolker at ufl.edu> wrote:
> Ruben Roa Ureta wrote:
> |> Ruben Roa Ureta wrote:
> |>
> |> | I have traced the rule about 2 as the minimum difference to favour one
> |> | model over the other to remark 2, Ch. 4, Sakamoto, Ishiguro and Kitagawa,
> |> | 1986, Akaike Information Criterion Statistics. D. Reidel Publishing Co,
> |> | Dordrecht. They use the expression 'significant difference between
> |> | models'. However, they do not explain why they think that 2 is the minimum
> |> | 'significant' delta AIC. Does anybody know more about a justification for
> |> | this threshold?
> |> | Rubén
> |> | Rubén
> |>
> |> I would really strongly recommend AGAINST trying to justify
> |> "significance thresholds" for AIC (B&A 2002 say this too).
> |
> | Note that I used quotes as in 'significant difference between
> | models'. I think the concept of 'significance' as in significance tests
> | does not apply to I-T model selection. I only wanted to know about any
> | justification for the delta AIC=2 rule.
>
> Fair enough. The reason that I (and B&A) react so strongly to the
> use of the word "significance" in this context is that it's nearly
> impossible to prevent people from misinterpreting it in terms of
> classical p-values. It's too bad the word has been tainted so as to make
> it practically unusable in this context, but it has.  (I have a
> similar feeling about calling model weights "probabilities" ...)
>
> cheers
>   Ben
>