[R] rms:fastbw variable selection differences with AIC vs. p value methods

Rob James aetiologic at gmail.com
Fri Aug 19 21:04:44 CEST 2011


I want to employ a parsimonious model to draw nomograms, as the full 
model is too complex to draw nomograms readily (several interactions of 
continuous variables).  However, one interesting variable stays or 
leaves based on whether I choose "p value" or "AIC" options to 
fastbw().  My question boils down to this: Is there a theoretical reason 
to prefer one over another?
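
For reference, here is a minimal sketch of how the two stopping rules are selected in fastbw(). The data, variable names and ols() fit below are simulated and purely hypothetical (model94c itself is not reproduced here); only the rule, sls and aics arguments are the point.

```r
library(rms)

# Hypothetical simulated data; these variables are stand-ins, not those in model94c.
set.seed(1)
n <- 500
d <- data.frame(x1  = rnorm(n),
                x2  = rnorm(n),
                x3  = rnorm(n),
                sex = factor(sample(c("F", "M"), n, replace = TRUE)))
d$y <- 1 + 2 * d$x1 + 0.15 * d$x2 + rnorm(n)

fit <- ols(y ~ x1 + x2 + x3 + sex, data = d)

# AIC rule (the default): keep deleting until
# residual chi-square - 2 * d.f. would rise above aics (default 0).
bw_aic <- fastbw(fit, rule = "aic")

# p value rule: keep deleting until the p value falls below sls.
bw_p <- fastbw(fit, rule = "p", sls = 0.01)

print(bw_aic)
print(bw_p)
```

Both calls use the default type = "residual", so the stopping statistic is the pooled residual chi-square rather than each term's own Wald test.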


Consider:

fastbw(model94c, aics=1e10)  # aics this large deletes every term, showing the full order

  Deleted            Chi-Sq d.f. P      Residual d.f. P      AIC
  ToD                  0.11  3   0.9903    0.11   3   0.9903   -5.89
  Experience * ToD     2.56  3   0.4646    2.67   6   0.8487   -9.33
  Experience * Assoc   0.45  2   0.7970    3.13   8   0.9262  -12.87
  RatePressure         2.99  3   0.3939    6.11  11   0.8658  -15.89
  DW_height_t          2.92  3   0.4047    9.03  14   0.8293  -18.97
  TBV * Experience     3.46  3   0.3260   12.49  17   0.7698  -21.51
  Experience * Sex     0.05  1   0.8153   12.54  18   0.8181  -23.46
  Experience           0.18  1   0.672    12.72  19   0.8526  -25.28
  Sex                  1.19  1   0.2745   13.91  20   0.8348  -26.09
  Assoc                6.09  2   0.0475   20.01  22   0.5826  -23.99
  Experience * Pulse  10.53  3   0.0146   30.53  25   0.2049  -19.47
  Sex * ToD           18.24  3   0.0004   48.77  28   0.0088   -7.23
  PulsePressure       21.15  3   0.0001   69.92  31   0.0001    7.92
  Race                19.87  2   0.0000   89.79  33   0.0000   23.79
  Pulse               25.31  3   0.0000  115.09  36   0.0000   43.09
  Age * Experience   202.80  3   0.0000  317.89  39   0.0000  239.89
  TBV                282.41  3   0.0000  600.30  42   0.0000  516.30
  Location           310.19 14   0.0000  910.50  56   0.0000  798.50
  Age                809.64  3   0.0000 1720.13  59   0.0000 1602.13
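
The AIC column in this output is the residual chi-square minus 2 times the residual d.f. As a check, the sketch below recomputes it for the four rows around the point where the two rules part ways, using values copied from the table. The two logical columns assume the default type = "residual" stopping rule with aics = 0 and sls = 0.01; they illustrate the arithmetic and do not re-run fastbw().

```r
# Residual chi-square and residual d.f. copied from the fastbw() output above.
steps <- data.frame(
  term     = c("Assoc", "Experience * Pulse", "Sex * ToD", "PulsePressure"),
  resid_x2 = c(20.01, 30.53, 48.77, 69.92),
  resid_df = c(22, 25, 28, 31)
)

# AIC as printed in the table: residual chi-square - 2 * residual d.f.
steps$aic <- steps$resid_x2 - 2 * steps$resid_df

# Residual p value: upper tail of the chi-square distribution.
steps$resid_p <- pchisq(steps$resid_x2, steps$resid_df, lower.tail = FALSE)

# Assumed stopping rules (type = "residual"):
#   AIC rule:     deletion is accepted while AIC <= 0
#   p value rule: deletion is accepted while residual p > 0.01
steps$drop_by_aic <- steps$aic <= 0
steps$drop_by_p   <- steps$resid_p > 0.01

print(steps)
```

Under these assumptions the AIC rule deletes Sex * ToD (AIC = -7.23) and stops at PulsePressure, while the p value rule stops one step earlier, because the residual p value at the Sex * ToD step (0.0088) is already below 0.01.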



The ordering of variables is expected, and is consistent with the 
substantial knowledge I have about the outcome.

The problematic variable is Sex * ToD.  When I use the p value as the rule, 
with an SLS of 0.01, the variable is retained, but when I use AIC, 
Sex * ToD is not retained. This reflects the fact that while the Sex * ToD 
interaction is theoretically interesting, the AIC value is negative and 
relatively small in magnitude, even as the p value skirts below 0.01. 
Is this judgement territory, or are there statistical considerations that 
should be invoked? Caveats?
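
One way to see why the two rules can disagree: the AIC rule with a penalty of 2 per d.f. is indifferent exactly when chi-square = 2 * d.f., and the p value at that point depends on the d.f. The sketch below computes this implied significance level. The helper name implied_alpha is my own invention, and 28 is the residual d.f. at the Sex * ToD step in the output above. This is only an arithmetic illustration, not an argument for either rule.

```r
# p value at which the AIC rule (penalty 2 per d.f.) is exactly indifferent,
# i.e. the upper-tail probability of chi-square = 2 * d.f.
implied_alpha <- function(df) pchisq(2 * df, df, lower.tail = FALSE)

df <- c(1, 2, 3, 28)
alpha <- setNames(implied_alpha(df), paste0("df=", df))
print(round(alpha, 4))
```

For a single d.f. the implied level is about 0.157, far more liberal than an SLS of 0.01, but it shrinks as the d.f. grow; at 28 d.f. it is below 0.01, which is consistent with the AIC rule deleting a term that the p value rule at 0.01 retains.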



Is there a theoretical reason to choose AIC over p value methods, or is 
either acceptable?


