[R] glm or transformation of the response?

Rolf Turner rolf.turner at xtra.co.nz
Sun Jan 8 00:08:03 CET 2012


On 08/01/12 05:54, emily wrote:
> Hi Dr. Snow,

This is the r-help mailing list, not Greg Snow's private email.  If
you just want to email Dr.  Snow, then email *him* (his address was
given in the post to which you replied).

<SNIP>
> I am not using R at the moment (working in SPSS, have to love the GUI)

     I can only feel pity for you.

> but my question is quite related:
>
> I am running a generalized linear model on data highly skewed to the right
> with a bunch of zeroes, so I decided to use the Tweedie distribution. In the
> model I ran both untransformed data (with link=log) as well as log(x+1)
> transformed data (with link=identity). The latter model had a much smaller
> (more negative) AICc value than the untransformed data with link=log.
>
> Is it valid to run the GLM with log(x+1) transformed data if link=identity?
> Or am I violating some kind of assumption about the model?

You are simply fitting two very different models.

     (1) Tweedie distribution, log link:

         E(Y) = exp(beta_0 + beta_1 * x),  Y has a Tweedie distribution

     (2) Log transformation, identity link:

         V = log(Y + 1)

         E(V) = beta_0 + beta_1 * x,   V has a ??? (Tweedie???) distribution.

         E(Y) = E(exp(V)) - 1

         You know E(V) but you don't know E(exp(V)) --- and cannot
         readily calculate it from E(V), since in general E(exp(V)) is
         not equal to exp(E(V)).  So this second model may not be of
         much use to you --- depending of course on what use you are
         actually trying to make of it.
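
To make the back-transformation problem concrete, here is a minimal
sketch in Python (standard library only).  Everything in it is an
assumption made for illustration: the simulated data (about 40% exact
zeros, the rest gamma-distributed) and the helper name simulate_skewed
are hypothetical and have nothing to do with the poster's actual data
or with any Tweedie fit.  It only compares the mean of Y with the
naive back-transformed mean of V = log(Y + 1).

```python
import math
import random


def simulate_skewed(n, seed=1):
    """Hypothetical data: roughly 40% exact zeros, the rest right-skewed gamma draws."""
    rng = random.Random(seed)
    return [
        0.0 if rng.random() < 0.4 else rng.gammavariate(0.5, 10.0)
        for _ in range(n)
    ]


def mean(values):
    return sum(values) / len(values)


y = simulate_skewed(10_000)
v = [math.log1p(yi) for yi in y]   # V = log(Y + 1)

mean_y = mean(y)                   # sample estimate of E(Y)
naive = math.expm1(mean(v))        # exp(mean of V) - 1, the naive back-transformation

print(f"mean of Y:           {mean_y:.3f}")
print(f"exp(mean of V) - 1:  {naive:.3f}")
```

Because exp is convex, the second figure is smaller than the first
whenever Y is not constant (Jensen's inequality), so a fitted E(V)
cannot simply be exponentiated to get E(Y).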

If Y has a Tweedie distribution (I've only heard of these; don't know
anything about them; I believe they can be complicated) then it seems
to me unlikely that log(Y+1) will also have one.  You need to decide
if you know something about the distribution of Y or if you know
something about the distribution of log(Y+1).

To quote from the signature file of someone who posts to this list,
``What problem are you trying to solve?''

<SNIP>

     cheers,

         Rolf Turner


