[R-sig-eco] glm-model evaluation

Rafael Maia queirozrafaelmv at yahoo.com.br
Fri May 30 01:08:28 CEST 2008


I just joined the list today, so sorry if this has already been suggested!

Package dRedging (developed by Kamil Bartoń
<http://www.zbs.bialowieza.pl/staff/kbarton>) also produces AIC/AICc
tables. Unlike selMod, it will compute them for all possible model
combinations (although model.avg also lets you supply only models
defined a priori). It also allows for multi-model inference and model
averaging.

It isn't available through CRAN, but you can get it at
http://www.zbs.bialowieza.pl/users/kamil/r/

I have two issues with it, though. First, I'm not sure it handles
factor variables (as.factor) adequately, especially in model averaging:
when a factor has more than 2 levels, it gives the estimates and
average weights for each level (then again, maybe it's supposed to do
so; I'm not all that well versed in model selection). Second, I can't
get the get.model function to bypass its default subset (AICc <= 4)
other than by changing the threshold to a very large number.
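
For anyone unsure what an all-combinations AICc table involves, here is
a minimal base-R sketch. It is not the dRedging code: the mtcars data,
the three predictors, and the aicc helper are arbitrary choices made
for illustration, and the last step assumes that the default subset
mentioned above is a cutoff of 4 on the AICc difference from the best
model.

```r
# Hand-rolled all-subsets AICc table (illustrative, not the dRedging code).
# Small-sample corrected AIC; k counts every estimated parameter.
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

preds <- c("wt", "hp", "qsec")   # hypothetical candidate predictors

# Every subset of the predictors, starting with the intercept-only model
subsets <- list(character(0))
for (m in seq_along(preds)) {
  subsets <- c(subsets, combn(preds, m, simplify = FALSE))
}

# Fit one linear model per subset
fits <- lapply(subsets, function(v) {
  rhs <- if (length(v)) v else "1"
  lm(reformulate(rhs, response = "mpg"), data = mtcars)
})

tab <- data.frame(
  model = vapply(subsets, function(v) {
    if (length(v)) paste(v, collapse = " + ") else "1"
  }, character(1)),
  K    = vapply(fits, function(f) as.numeric(attr(logLik(f), "df")), numeric(1)),
  AICc = vapply(fits, aicc, numeric(1))
)
tab$delta  <- tab$AICc - min(tab$AICc)                         # difference from best
tab$weight <- exp(-tab$delta / 2) / sum(exp(-tab$delta / 2))   # Akaike weights
tab <- tab[order(tab$delta), ]
print(tab, digits = 4)

# Keep only models within the assumed cutoff of the best one
top <- subset(tab, delta <= 4)
```

Raising the cutoff (for example to Inf) keeps every model, which is the
by-hand equivalent of the workaround described above.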

I just compared the results of selMod (from the pgirmess package) with
those of model.avg and dredge (both from the dRedging package), using
the negative binomial example (below) and an lme model of mine, and the
results seem quite consistent. I do recall running the cement example
from B&A, though, with results differing slightly - something to do
with how the log-likelihood is computed, I've been told...

Hope it helps!

Best,
Rafael Maia

Kingsford Jones wrote:
> The selMod function in package pgirmess will produce AIC/AICc tables
> such as those suggested in Anderson 2001 (cited on the help page).  It
> runs with objects produced by glm.nb (but I don't have any knowledge as
> to whether it is a sensible approach with the glm.nb models).
>
> library(MASS)
> example(glm.nb)
> library(pgirmess)
> selMod(list(quine.nb1, quine.nb2, quine.nb3))
>
> As far as the original question, I think that diagnostics beyond just
> reporting a GoF for the global model are important, and I agree with
> Ben's suggestions.  Also, I'd add that showing predictive ability is
> very important if the goal of the modeling process is to make
> predictions (and even if it's not, showing predictive ability provides
> support for the model).  Frank Harrell has tools in the Design library
> for efficient internal validation and calibration via the bootstrap
> (see the 'validate' and 'calibrate' functions) but these will not work
> on a model produced by glm.nb.  However, it's easy to code a
> cross-validation in R and I believe MASS shows a 10-fold
> cross-validation for the CPUs example.
>
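
A cross-validation of this kind might look like the sketch below. It
assumes a main-effects negative binomial model for the quine data (an
illustrative choice, not one of the quine.nb models from the example),
an arbitrary seed, and root mean squared error on the response scale as
the measure of predictive ability.

```r
library(MASS)   # glm.nb and the quine data

set.seed(1)     # arbitrary seed so the fold assignment is reproducible
k <- 10
n <- nrow(quine)

# Random fold label (1..k) for every row, folds as equal in size as possible
fold <- sample(rep(seq_len(k), length.out = n))

pred <- numeric(n)
for (i in seq_len(k)) {
  test <- fold == i
  # Fit on the other k - 1 folds only
  fit <- glm.nb(Days ~ Eth + Age + Lrn + Sex, data = quine[!test, ])
  # Predict expected counts for the held-out fold
  pred[test] <- predict(fit, newdata = quine[test, ], type = "response")
}

# Out-of-sample prediction error on the scale of the response (days)
rmse_cv <- sqrt(mean((quine$Days - pred)^2))
rmse_cv
```

Each observation is predicted by a model that never saw it, so rmse_cv
is a less optimistic summary than the in-sample fit.
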
> Kingsford Jones
>
>
> On Thu, May 29, 2008 at 2:33 PM, Ben Bolker <bolker at ufl.edu> wrote:
>   
>> Ruben Roa Ureta wrote:
>> |> Ruben Roa Ureta wrote:
>> |>
>> |> | I have traced the rule about 2 as the minimum difference to favour one
>> |> | model over the other to remark 2, Ch. 4, Sakamoto, Ishiguro and
>> |> | Kitagawa, 1986, Akaike Information Criterion Statistics. D. Reidel
>> |> | Publishing Co, Dordrecht. They use the expression 'significant
>> |> | difference between models'. However, they do not explain why they
>> |> | think that 2 is the minimum 'significant' delta AIC. Does anybody
>> |> | know more about a justification for this threshold?
>> |> | Rubén
>> |>
>> |> I would really strongly recommend AGAINST trying to justify
>> |> "significance thresholds" for AIC (B&A 2002 say this too).
>> |
>> | Note that I used quotes as in 'significant difference between
>> | models'. I think the concept of 'significance' as in significance tests
>> | does not apply to I-T model selection. I only wanted to know about any
>> | justification for the delta AIC=2 rule.
>>
>> Fair enough. The reason that I (and B&A) react so strongly to the
>> use of the word "significance" in this context is that it's nearly
>> impossible to prevent people from misinterpreting it in terms of
>> classical p-values. It's too bad the word has been tainted so as to make
>> it practically unusable in this context, but it has.  (I have a
>> similar feeling about calling model weights "probabilities" ...)
>>
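
To see why no particular cutoff stands out, it may help to look at what
a given AIC difference means on the evidence scale. The sketch below
uses the standard relative likelihood exp(-delta/2); the delta values
are arbitrary, picked only for illustration.

```r
# Relative likelihood of a model given its AIC difference from the best one
delta <- c(0, 1, 2, 4, 7, 10)     # illustrative delta-AIC values
rel_lik <- exp(-delta / 2)        # support relative to the best model
evidence_ratio <- 1 / rel_lik     # times better supported the best model is

print(data.frame(delta,
                 rel_lik = round(rel_lik, 3),
                 evidence_ratio = round(evidence_ratio, 2)))
```

The support declines smoothly as delta grows. A difference of 2
corresponds to the best model being about 2.7 times better supported,
and nothing special happens at that value.
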
>> cheers
>>   Ben
>>
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>