[R] glm or transformation of the response?
Rolf Turner
rolf.turner at xtra.co.nz
Sun Jan 8 00:08:03 CET 2012
On 08/01/12 05:54, emily wrote:
> Hi Dr. Snow,
This is the r-help mailing list, not Greg Snow's private email. If
you just want to email Dr. Snow, then email *him* (his address was
given in the post to which you replied).
<SNIP>
> I am not using R at the moment (working in SPSS, have to love the GUI)
I can only feel pity for you.
> but my question is quite related:
>
> I am running a generalized linear model on data highly skewed to the right
> with a bunch of zeroes, so I decided to use the Tweedie distribution. In the
> model I ran both untransformed data (with link=log) as well as log(x+1)
> transformed data (with link=identity). The latter model had a much smaller
> (more negative) AICc value than the untransformed data with link=log.
>
> Is it valid to run the GLM with log(x+1) transformed data if link=identity?
> Or am I violating some kind of assumption about the model?
You are simply fitting two very different models.
(1) Tweedie distribution, log link:
E(Y) = exp(beta_0 + beta_1 * x), Y has a Tweedie distribution
(2) Log transformation, identity link:
V = log(Y + 1)
E(V) = beta_0 + beta_1 * x, V has a ??? (Tweedie???) distribution.
E(Y) = E(exp(V)) - 1
You know E(V) but you don't know E(exp(V)), and you cannot readily
calculate it from E(V). So this second model may not be of much use
to you, depending of course on what use you are actually trying to
make of it.
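
To make the point about E(V) versus the mean of Y concrete, here is a
rough simulation sketch in Python (standard library only). Everything in
it is an assumption made for illustration: the response is drawn as a
compound Poisson-gamma variable (one member of the Tweedie family, with
exact zeros and a long right tail), and the names rpois and rcpg and the
parameter values are made up for this example, not taken from the
original question.

```python
import math
import random

random.seed(2012)

def rpois(lam):
    """Draw one Poisson(lam) variate (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def rcpg(lam, shape, scale):
    """Compound Poisson-gamma draw: sum of Poisson(lam) many gamma variates.

    Hypothetical helper for this sketch; returns 0 when no events occur.
    """
    n_events = rpois(lam)
    return sum(random.gammavariate(shape, scale) for _ in range(n_events))

# Illustrative parameter values only (assumed, not from the post).
n = 50_000
y = [rcpg(lam=1.0, shape=0.5, scale=4.0) for _ in range(n)]
v = [math.log1p(yi) for yi in y]          # V = log(Y + 1)

mean_y = sum(y) / n                        # sample estimate of E(Y)
mean_v = sum(v) / n                        # sample estimate of E(V)
back_transformed = math.expm1(mean_v)      # exp(E(V)) - 1, the naive guess
zero_fraction = sum(1 for yi in y if yi == 0) / n

print("fraction of exact zeros:", round(zero_fraction, 3))
print("mean of Y:              ", round(mean_y, 3))
print("mean of V = log(Y + 1): ", round(mean_v, 3))
print("exp(mean of V) - 1:     ", round(back_transformed, 3))
```

Because log is concave, exp(mean of V) - 1 can never exceed the sample
mean of Y (Jensen's inequality), and for skewed data like this the gap
is usually large. So a fitted value of E(V) from the second model does
not translate into a fitted mean for Y without further assumptions about
the distribution of V.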
If Y has a Tweedie distribution (I've only heard of these; don't know
anything about them; I believe they can be complicated) then it seems
to me unlikely that log(Y+1) will also have one. You need to decide if
you know something about the distribution of Y or if you know something
about the distribution of log(Y+1).
To quote from the signature file of someone who posts to this list,
``What problem are you trying to solve?''
<SNIP>
cheers,
Rolf Turner