[R] comparing AIC values of models with transformed, untransformed, and weighted variables

Wed Mar 15 04:28:45 CET 2006

Hi there, I have a question regarding model comparisons that seems 
simple enough but to which I cannot find an answer. I am interested in 
developing a predictive model relating some measure of a tree's stem to 
the total leaf area (TLA) of the tree. Predictor variables might 
include, for example, the total cross-sectional area of the stem 
(commonly referred to as basal area, BA) or the sapwood area (SA) 
(which represents the amount of wood involved in active transport of 
water up the tree to the leaves). Many researchers have developed such 
models for many tree species in many parts of the world. Perhaps not 
surprisingly, different studies have used 
different model forms in analyzing their data. I am interested in 
comparing the range of models that have been previously used (some of 
which are theoretically derived, others of which are empirically driven) 
using a data set that I have collected (for yet another species in yet 
another place). To compare the different model forms I had intended to 
use the AIC. However, I have found, again perhaps not surprisingly, that 
when I use log-transformed data, the AIC is substantially lower for a 
given predictor variable. If I use a weighted glm the same issue arises. 
For example, with BA as the predictor of TLA, the (rounded) AIC values 
are 275 for a linear model, 30 for a log-log model, and 8 for a glm 
weighted by 1/BA. 
I don't believe that these vast differences reflect a major improvement 
in the model; rather, I suspect they reflect the rescaling of the 
variables by transformation or weighting. I would like some advice or 
insight on whether there is an appropriate way to rescale the AIC values 
to permit comparisons across these models. Any suggestions would be very 
welcome. 
Cheers, Patrick Baker
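
To make the scaling issue concrete, here is a minimal sketch of one commonly used adjustment for a log-transformed response: the log-likelihood of a model fitted to log(TLA) can be put back on the TLA scale by subtracting sum(log(TLA)) (the Jacobian of the transformation), which is the same as adding 2*sum(log(TLA)) to its AIC. The sketch assumes Gaussian errors and simple one-predictor regressions. The helper names (ols_fit, gaussian_aic) and the ba/tla values are hypothetical illustrations, not the data described above, and the weighted glm case is not covered.

```python
import math

def ols_fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def gaussian_aic(rss, n, k=3):
    """AIC at the Gaussian maximum-likelihood fit; k counts intercept, slope, variance."""
    sigma2 = rss / n  # ML estimate of the error variance
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return 2 * k - 2 * loglik

# Hypothetical illustration values (arbitrary units), NOT the poster's data.
ba = [12.0, 20.5, 33.1, 47.8, 60.2, 85.0, 110.3, 150.9]
tla = [3.1, 5.9, 8.2, 14.0, 15.5, 26.1, 29.8, 45.2]
n = len(tla)

# Linear model: TLA ~ BA, likelihood already on the TLA scale.
_, _, rss_lin = ols_fit(ba, tla)
aic_linear = gaussian_aic(rss_lin, n)

# Log-log model: log(TLA) ~ log(BA), likelihood on the log(TLA) scale.
log_ba = [math.log(v) for v in ba]
log_tla = [math.log(v) for v in tla]
_, _, rss_log = ols_fit(log_ba, log_tla)
aic_log = gaussian_aic(rss_log, n)

# Jacobian adjustment: express the log-log model's AIC on the TLA scale.
aic_log_on_tla_scale = aic_log + 2 * sum(log_tla)

print("AIC linear (TLA scale):   ", round(aic_linear, 2))
print("AIC log-log (log scale):  ", round(aic_log, 2))
print("AIC log-log (TLA scale):  ", round(aic_log_on_tla_scale, 2))
```

Under these assumptions, only the first and third printed values refer to the same response and can be compared with each other; the second is on a different scale, which is the kind of mismatch suspected in the question.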