[R] comparing glm models - lower AIC but insignificant coefficients

Constantinos Antoniou antoniou at central.ntua.gr
Mon May 23 20:59:11 CEST 2005


Hello,

I am a new R user and I am trying to estimate some generalized linear  
models (glm). I am trying to compare a model with a gaussian  
distribution and an identity link function, and a poisson model with  
a log link function. My problem is that while the gaussian model has  
a substantially lower (i.e. "better") AIC (Akaike Information  
Criterion) most of the coefficients are not significant. On the other  
hand, the poisson model has a higher (i.e. "worse") AIC, but almost  
all the coefficients are extremely significant (except for one that  
still has p=0.07).

Summary output of the two models follows... [sorry for the large  
number of independent variables, but the issue is less pronounced  
with fewer covariates].

My question is two-fold:
- AIC supposedly can be used to compare non-nested models (although  
there are concerns about this, a couple of which I have also seen in  
this list's archives). Is this a case where AIC is not a good measure to compare  
the two models? If so, is there another measure (besides choosing the  
model with the significant coefficients)? [These are time-series  
data, so I am also looking at acf/pacf of the residuals].
- Could the very high significance of the coefficients in the poisson  
model hint at some issue?
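
On the second point, one rough check needs only the numbers in the Poisson summary further down: if the Poisson variance assumption holds, the residual deviance divided by its degrees of freedom should be close to 1. The sketch below is plain arithmetic on those reported values. The variable names are illustrative, and the deviance-based ratio is only an approximation to the Pearson-based dispersion that a quasi-Poisson fit would estimate.

```r
# Values copied from the Poisson summary output in this message
resid_dev <- 130.03   # residual deviance
resid_df  <- 36       # residual degrees of freedom

# Rough dispersion estimate: residual deviance per degree of freedom.
# Values well above 1 suggest overdispersion relative to the Poisson model.
dispersion <- resid_dev / resid_df

# The Poisson z values assume dispersion = 1. If the variance is really
# 'dispersion' times larger, standard errors grow by sqrt(dispersion).
se_inflation <- sqrt(dispersion)

# Example: the z value reported for Month[3:48], rescaled accordingly
z_month     <- -13.947
z_month_adj <- z_month / se_inflation

print(round(c(dispersion   = dispersion,
              se_inflation = se_inflation,
              z_month_adj  = z_month_adj), 2))
```

A ratio well above 1 would suggest overdispersion, in which case the Poisson standard errors are too small and the z values overstated. Refitting the same formula with family = quasipoisson would be one way to check, although that family does not report an AIC.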

Thanking you in advance,

Costas


+++++++++++++++++++++++
POISSON - LOG LINK
+++++++++++++++++++++++


Call:
glm(formula = TotalDeadInjured[3:48] ~ -1 + Month[3:48] + sin(pi *
     Month[3:48]/6) + cos(pi * Month[3:48]/6) + sin(pi * Month[3:48]/12) +
     cos(pi * Month[3:48]/12) + ThousandCars[3:48] + monthcycle[3:48] +
     TotalDeadInjured[1:46] + I((TotalDeadInjured[1:46])^2) +
     I((TotalDeadInjured[1:46])^3), family = poisson(link = log))

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-3.6900  -1.1901  -0.1847   0.9477   4.3967

Coefficients:
                                 Estimate Std. Error z value Pr(>|z|)
Month[3:48]                   -7.712e-02  5.530e-03 -13.947  < 2e-16 ***
sin(pi * Month[3:48]/6)       -1.419e-01  2.759e-02  -5.144 2.68e-07 ***
cos(pi * Month[3:48]/6)       -8.407e-02  1.799e-02  -4.672 2.99e-06 ***
sin(pi * Month[3:48]/12)      -2.776e-02  1.558e-02  -1.782 0.074702 .
cos(pi * Month[3:48]/12)       5.195e-02  1.608e-02   3.232 0.001231 **
ThousandCars[3:48]             2.733e-02  2.255e-03  12.118  < 2e-16 ***
monthcycle[3:48]               6.307e-02  6.546e-03   9.635  < 2e-16 ***
TotalDeadInjured[1:46]        -2.925e-02  8.460e-03  -3.457 0.000546 ***
I((TotalDeadInjured[1:46])^2)  1.218e-04  3.613e-05   3.370 0.000750 ***
I((TotalDeadInjured[1:46])^3) -1.640e-07  4.961e-08  -3.306 0.000946 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

     Null deviance: 78694.70  on 46  degrees of freedom
Residual deviance:   130.03  on 36  degrees of freedom
AIC: 476.08

Number of Fisher Scoring iterations: 4

+++++++++++++++++++++++
GAUSSIAN - IDENTITY LINK
+++++++++++++++++++++++

Call:
glm(formula = TotalDeadInjured[3:48] ~ -1 + Month[3:48] + sin(pi *
     Month[3:48]/6) + cos(pi * Month[3:48]/6) + sin(pi * Month[3:48]/12) +
     cos(pi * Month[3:48]/12) + ThousandCars[3:48] + monthcycle[3:48] +
     TotalDeadInjured[1:46] + I((TotalDeadInjured[1:46])^2) +
     I((TotalDeadInjured[1:46])^3), family = gaussian(link = identity))

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-61.326  -12.012   -1.756   14.204   78.991

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)
Month[3:48]                   -8.111e+00  2.115e+00  -3.835 0.000487 ***
sin(pi * Month[3:48]/6)       -2.639e+01  1.095e+01  -2.409 0.021246 *
cos(pi * Month[3:48]/6)       -1.700e+01  7.138e+00  -2.382 0.022629 *
sin(pi * Month[3:48]/12)       2.392e-01  6.524e+00   0.037 0.970956
cos(pi * Month[3:48]/12)       8.785e+00  6.317e+00   1.391 0.172835
ThousandCars[3:48]             2.219e+00  8.604e-01   2.579 0.014146 *
monthcycle[3:48]               5.364e+00  2.494e+00   2.151 0.038301 *
TotalDeadInjured[1:46]        -4.974e+00  3.263e+00  -1.524 0.136171
I((TotalDeadInjured[1:46])^2)  2.154e-02  1.410e-02   1.527 0.135382
I((TotalDeadInjured[1:46])^3) -2.999e-05  1.959e-05  -1.530 0.134637
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 831.6357)

     Null deviance: 1927714  on 46  degrees of freedom
Residual deviance:   29939  on 36  degrees of freedom
AIC: 450.54

Number of Fisher Scoring iterations: 2



