[R-sig-eco] Log transformed simple linear regression and Poisson regression

Li Wen Li.Wen at environment.nsw.gov.au
Thu Aug 2 01:53:45 CEST 2012


Hi Lin,

You might standardize ("normalize") your predictors before fitting the GLM. This allows you to compare their contributions directly by looking at the fitted coefficients, since all your Xs then have mean 0 and standard deviation 1.
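
A minimal sketch of what I mean, on simulated data. The names dat, dat_std, fit_std and the simulated coefficients are made up for illustration and are not from your data set; I am assuming a Poisson GLM with three numeric predictors as in your original post.

```r
# Simulated example data; all names and values here are illustrative only
set.seed(1)
n <- 200
dat <- data.frame(
  x1 = rnorm(n, mean = 50, sd = 10),   # predictors on very different scales
  x2 = runif(n, min = 0, max = 1),
  x3 = rnorm(n, mean = 0, sd = 100)
)
dat$y <- rpois(n, lambda = exp(0.5 + 0.02 * dat$x1 + 0.8 * dat$x2 + 0.001 * dat$x3))

# Centre each predictor to mean 0 and scale it to standard deviation 1
dat_std <- dat
dat_std[c("x1", "x2", "x3")] <- scale(dat[c("x1", "x2", "x3")])

# Same Poisson GLM, now fitted on the standardized predictors
fit_std <- glm(y ~ x1 + x2 + x3, family = poisson(), data = dat_std)

# Each slope is the change in log(expected count) per 1 SD of that predictor
coef(fit_std)
```

The slopes are on the log (link) scale, so comparing their magnitudes is only a rough guide to relative contribution, particularly if the predictors are correlated with each other.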

Li

-----Original Message-----
From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of lgj200306
Sent: Thursday, 2 August 2012 8:50 AM
To: Liz Pryde
Cc: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Log transformed simple linear regression and Poisson regression

Thanks Liz, Brian and Mollie. Your replies are helpful to me.
I now realize how complex the analysis of count data is, and that the method should be chosen according to the structure of my data. But if the explanatory data table is composed of several variables and I want to determine the relative contribution of each explanatory variable to the variation in the response variable, can I do this with Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial or other models (other than the simple linear regression model)?
Thanks for your attention, and best wishes to you!
 
Lin
Aug 1st, 2012

At 2012-08-02 06:17:03,"Liz Pryde" <elizabethpryde at gmail.com> wrote:
>Hello Lin,
>
>It's a bit difficult to give you an exact answer without knowing what the data set is.
>That said, count data generally do follow a Poisson or negative binomial distribution. When you log-transform such data, you are trying to normalise the distribution so that you can apply OLS regression (which assumes normally distributed errors in order to give you 'correct' significance values). One problem with this is that when the data can't be negative (e.g. counts), OLS regression can give you predicted values that are negative (and hence nonsensical). Also, when mean values are very low (at or near zero), transforming the data becomes ineffective. GLMs overcome this problem because they explicitly model the X-Y relationship through the link function and ALSO model the mean-variance relationship (which is what is happening when you choose 'family='). So they 'linearise' the relationship and 'model' the error without transforming the data itself. Poisson data will generally have a variance that increases with increasing mean. On occasion this variance may be higher than expected (overdispersed), in which case the negative binomial becomes appropriate.
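
As a minimal sketch of the points above, the following R code fits both kinds of model to simulated counts. Everything here (x, y, fit_lm, fit_glm, the simulated coefficients) is made up for illustration and is not from Lin's data set. It shows that the Poisson GLM predicts the mean (lambda) through the log link, and computes one common, rough overdispersion check.

```r
# Simulated counts with many zeros and small values; illustrative only
set.seed(42)
n <- 300
x <- runif(n, min = 0, max = 2)
y <- rpois(n, lambda = exp(-1 + 1.2 * x))

fit_lm  <- lm(log(y + 1) ~ x)               # models the mean of log(y + 1)
fit_glm <- glm(y ~ x, family = poisson())   # models the log of the mean of y

# Link-scale predictions are log(lambda); response-scale predictions are lambda
eta    <- predict(fit_glm, type = "link")
lambda <- predict(fit_glm, type = "response")

# Check that back-transforming the linear predictor gives the fitted mean
all.equal(unname(exp(eta)), unname(lambda))

# Fitted means from the log link cannot be negative
all(lambda > 0)

# Rough overdispersion check: Pearson chi-square divided by residual df.
# Values well above 1 suggest considering a negative binomial model
# (for example MASS::glm.nb).
dispersion <- sum(residuals(fit_glm, type = "pearson")^2) / df.residual(fit_glm)
dispersion
```

Note that the two models answer slightly different questions: the coefficients of fit_lm describe the expected value of log(y + 1), while those of fit_glm describe the log of the expected count, so they are not directly interchangeable.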
>
>I hope this helps.
>The O'Hara paper that was mentioned below gives an excellent explanation of this.
>
>Liz
>
>
>On 02/08/2012, at 3:42 AM, lgj200306 <lgj200306 at 163.com> wrote:
>
>> Hi, everyone,
>> I used two methods to analyse the relationship between y (count data) and x.
>> 1) log transformed simple linear regression:
>>> lm(log(y+1)~x1+x2+x3,data)
>> 2) Poisson regression:
>>> glm(y~x1+x2+x3,family=poisson(),data=data)
>> 
>> Someone told me that these two approaches are very similar, but someone else told me that Poisson regression does not fit y itself but lambda (the parameter of the Poisson distribution). I am not sure which one is right. Can anybody help me?
>> Thanks for your attention, and best wishes to you!
>> 
>> Lin
>> Aug 1st, 2012 
>> 	[[alternative HTML version deleted]]
>> 
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>Liz Pryde
>PhD Candidate (off-campus @ The University of Melbourne)
>School of Earth and Environmental Sciences
>James Cook University, QLD
>
>elizabethpryde at gmail.com
>epryde at unimelb.edu.au
>
>



