[R] predicting waste per capita - is a gaussian model correct?
John C Frain
frainj sending from gmail.com
Sun May 10 23:44:17 CEST 2020
On Sun, 10 May 2020 at 02:00, Alessandra Bielli <bielli.alessandra using gmail.com>
wrote:
> Dear list,
>
> I am new to this list and I hope it is ok to post here even though I
> already posted this question on Cross Validated.
>
> I am trying to predict the daily amount of waste per person produced in the
> fishery sector. We surveyed fishing boats at the end of their fishing trip
> and the variables I have are duration of trip (days), number of fishers,
> waste category and waste weight (g), boat ID.
>
> For each fishing trip I calculated grams of waste per person per day, i.e.
> daily waste per capita. To predict daily waste per capita, I am using a
> gaussian mixed effect model with log(waste per capita) as response variable
> (I transformed it cause it was not normally distributed - and I'm not sure
> it's correct to do so). Explanatory variable is waste category and boat ID
> is a random effect. I use the predict function to estimate daily waste per
> capita for each category and then back transformed it with exp(...).
>
> My question is: is it correct to transform daily weight per capita to fit a
> gaussian model?
>
> Thanks so much for your advice!
>
> Alessandra
>
There is no requirement that the dependent variable in a "regression" type
estimation follows a gaussian distribution. You need a model of the
process and then use an estimation technique to estimate your model. If
effects in your model are additive do not use a log transformation. If
effects are multiplicative then use a log transformation. The choice
should be determined by the mechanics of the problem and not by the
statistics. If you do use a log transformation, then reversing the
process with an exponential transformation gives a biased estimate of the
mean on the original scale. The extent of that bias depends on your problem,
and it would not be possible to estimate its size without a much greater
knowledge of the process and data. I would suggest that you consult a
competent statistician.
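
To illustrate the back-transformation bias, here is a minimal simulation
sketch. It assumes, purely for illustration, that log(waste per capita) is
exactly Gaussian with made-up parameters mu and sigma; none of the numbers
come from the survey data. Under that assumption exp(mean of the logs)
estimates the median, and the mean is exp(mu + sigma^2/2).

```python
import math
import random
import statistics

random.seed(1)

# Hypothetical log-scale parameters, chosen only for illustration.
mu, sigma = 3.0, 0.8
n = 50_000

# Simulate a response whose logarithm is Gaussian.
y = [math.exp(random.gauss(mu, sigma)) for _ in range(n)]
log_y = [math.log(v) for v in y]

mean_log = statistics.fmean(log_y)
var_log = statistics.variance(log_y)

# Naive back-transformation: targets the median, so it underestimates the mean.
naive = math.exp(mean_log)
# Lognormal mean; only valid if the log-scale errors really are Gaussian.
corrected = math.exp(mean_log + var_log / 2)
sample_mean = statistics.fmean(y)

print(f"sample mean of y     : {sample_mean:.2f}")
print(f"exp(mean(log y))     : {naive:.2f}")
print(f"exp(mean + var / 2)  : {corrected:.2f}")
```

The correction term depends entirely on the Gaussian assumption. In a mixed
model the relevant variance would probably also have to include the
random-effect variance, which this sketch does not attempt.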
John C Frain
3 Aranleigh Park
Rathfarnham
Dublin 14
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:frainj using tcd.ie
mailto:frainj using gmail.com