[R] comparing AIC values of models with transformed, untransformed, and weighted variables
Patrick Baker
patrick.baker at sci.monash.edu.au
Wed Mar 15 04:28:45 CET 2006
Hi there, I have a question regarding model comparisons that seems
simple enough but to which I cannot find an answer. I am interested in
developing a predictive model relating some measure of a tree's stem to
the total leaf area (TLA) of the tree. Predictor variables might
include, for example, the total cross-sectional area of the tree
(commonly referred to as basal area, BA) or the sapwood area (SA)
(which represents the amount of wood involved in active transport of
water up the tree to the leaves). Many researchers have developed such
models for a range of tree species in different parts of the world.
Perhaps not surprisingly, different studies have used
different model forms in analyzing their data. I am interested in
comparing the range of models that have been previously used (some of
which are theoretically derived, others of which are empirically driven)
using a data set that I have collected (for yet another species in yet
another place). To compare the different model forms I had intended to
use the AIC. However, I have found, again perhaps not surprisingly, that
when I use log-transformed data, the AIC is substantially lower for a
given predictor variable. If I use a weighted glm the same issue arises.
For example, regressing TLA on BA gives (rounded) AIC values of 275 for a
linear model, 30 for a log-log model, and 8 for a glm weighted by 1/BA.
I don't believe that these vast differences reflect a major improvement
in the model, but rather the scaling of the variables by transformation
or weighting. What I'd like to get some advice or insight on is whether
there is an appropriate way to rescale the AIC values to permit
comparisons across these models. Any suggestions would be very welcome.
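
One adjustment that may apply to the log-transformed case is the
change-of-variables (Jacobian) correction. If log(TLA) is modelled as
Gaussian, the density of TLA itself is the density of log(TLA) multiplied
by 1/TLA, so the log-likelihood on the original scale is the log-scale
log-likelihood minus sum(log(TLA)), and the AIC increases by
2 * sum(log(TLA)). The sketch below illustrates this in plain Python under
stated assumptions: the data values are made up, the helper names
(ols_fit, gaussian_aic, compare_aic) are hypothetical, and each model is
counted as having three parameters (intercept, slope, error variance). It
does not cover the weighted fit.

```python
import math

def ols_fit(x, y):
    """Least-squares line y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def gaussian_aic(rss, n, k):
    """AIC of a Gaussian model evaluated at the ML variance estimate rss/n.

    k counts every estimated parameter, including the error variance.
    """
    loglik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return -2 * loglik + 2 * k

def compare_aic(ba, tla):
    """AIC of TLA ~ BA and of log(TLA) ~ log(BA), the latter raw and Jacobian-adjusted."""
    n = len(ba)
    _, _, rss_lin = ols_fit(ba, tla)
    log_ba = [math.log(v) for v in ba]
    log_tla = [math.log(v) for v in tla]
    _, _, rss_log = ols_fit(log_ba, log_tla)
    raw = gaussian_aic(rss_log, n, k=3)  # likelihood of log(TLA), not of TLA
    # Jacobian term: move the log-log likelihood onto the TLA scale.
    adjusted = raw + 2 * sum(log_tla)
    return {
        "linear": gaussian_aic(rss_lin, n, k=3),
        "loglog_raw": raw,
        "loglog_adjusted": adjusted,
    }

# Hypothetical measurements, for illustration only (not real data).
ba = [12.0, 25.0, 40.0, 63.0, 90.0, 130.0, 180.0, 240.0]
tla = [3.1, 7.4, 10.2, 19.5, 24.8, 41.0, 50.3, 78.6]

for name, value in compare_aic(ba, tla).items():
    print(f"{name:16s} {value:8.2f}")
```

After the adjustment both AIC values refer to the likelihood of TLA on its
original scale. A quick sanity check is that changing the units of TLA
shifts the linear and the adjusted log-log AIC by the same amount, whereas
the raw log-log AIC does not move at all.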
Cheers,
Patrick Baker