[R-sig-eco] log link versus log response

Simon Blomberg s.blomberg1 at uq.edu.au
Thu Apr 24 05:47:07 CEST 2008


In a), we have

log(mu_a) = t(X) %*% beta, Y_i ~ N(mu_a, sigma^2)

i.e. we are modelling mu_a in terms of explanatory variables X and
parameters beta, and the link function operates on mu_a. mu_a is the
mean of Y_i on the original scale, so it is estimated by mean(Y_i).
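
A minimal sketch of fitting a) in R, on simulated data. The data, the
seed and the coefficient values 0.5 and 0.8 are made up for
illustration and are not from the original question:

```r
# Hypothetical data: the mean is exp(0.5 + 0.8 * x), with additive
# Gaussian noise on the original scale of y.
set.seed(1)
x <- runif(200, 0, 2)
y <- exp(0.5 + 0.8 * x) + rnorm(200, sd = 0.2)

# Model a): Gaussian errors for y itself, log link applied to the mean
fit_a <- glm(y ~ x, family = gaussian(link = "log"))

coef(fit_a)          # estimates of beta, on the log(mu_a) scale
head(fitted(fit_a))  # fitted mu_a, on the original scale of y
```

Note that y is never transformed here; only the mean is linked to the
linear predictor.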

In b), we have

mu_b = t(X) %*% beta, log(Y_i) ~ N(mu_b, sigma^2)

Now, mean(Y_i) estimates mu_a, and mean(log(Y_i)) estimates mu_b, but
clearly log(mu_a) != mu_b because log(mean(x)) != mean(log(x)).
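
The inequality is easy to check numerically. A small sketch with
simulated lognormal data (the data and parameter values are
assumptions, chosen only to give a positive, skewed response):

```r
# Hypothetical positive, right-skewed data
set.seed(42)
y <- rlnorm(1000, meanlog = 1, sdlog = 0.75)

mean(log(y))  # the quantity model b) targets
log(mean(y))  # log of the quantity model a) targets; always the larger one

# Intercept-only fits make the same point
fit_a <- glm(y ~ 1, family = gaussian(link = "log"))
fit_b <- lm(log(y) ~ 1)
c(a = unname(coef(fit_a)), b = unname(coef(fit_b)))
```

With only an intercept, the coefficient from a) is log(mean(y)) and the
coefficient from b) is mean(log(y)), so the two fits cannot agree
unless y is constant.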

So they are different models entirely. Comparing these models is
slightly tricky, because the likelihood in b) is a likelihood for
log(Y_i), not for Y_i. You need to use the change of variable formula
(the Jacobian of the log transformation) to make the likelihood in b)
comparable to the likelihood in a). You can't just compare AICs or
deviances directly, for example.
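
As a rough sketch of that adjustment in R: if Z = log(Y), then
f_Y(y) = f_Z(log(y)) / y, so the log-likelihood of b) on the scale of
Y is the reported one minus sum(log(y)). The simulated data and the
variable names below are my own assumptions:

```r
# Hypothetical data with multiplicative (lognormal) error
set.seed(7)
x <- runif(200, 0, 2)
y <- exp(0.5 + 0.8 * x + rnorm(200, sd = 0.3))

fit_a <- glm(y ~ x, family = gaussian(link = "log"))  # likelihood for y
fit_b <- lm(log(y) ~ x)                               # likelihood for log(y)

ll_a     <- as.numeric(logLik(fit_a))
ll_b_raw <- as.numeric(logLik(fit_b))
# Jacobian term: |d log(y) / dy| = 1 / y for each observation
ll_b     <- ll_b_raw - sum(log(y))

# Each model has 3 parameters: two coefficients plus sigma
aic_a <- -2 * ll_a + 2 * 3
aic_b <- -2 * ll_b + 2 * 3
c(aic_a = aic_a, aic_b_on_y_scale = aic_b, aic_b_unadjusted = AIC(fit_b))
```

Only aic_a and aic_b_on_y_scale refer to the same response and can be
compared; the unadjusted AIC(fit_b) differs from aic_b by exactly
2 * sum(log(y)).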

Hope this helps,

Simon.

On Thu, 2008-04-24 at 13:38 +1200, Tomas Easdale wrote:
> Hi there,
>  
> I am using glms. Could someone please explain what's the difference
> between (a) using a gaussian family distribution with a LOG link
> function and (b) LOG transforming the response variable with a normal
> distribution (Gaussian family distribution with identity link function).
> The outputs differ and clearly one option or the other will result in
> better fits depending on the dataset (everything else equal) but I want
> to understand why this is so.
>  
> Thanks in advance,
>  
> Tomas Easdale
> Landcare Research, NZ
-- 
Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
Faculty of Biological and Chemical Sciences 
The University of Queensland 
St. Lucia Queensland 4072 
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.


