[R] Log-transform and specifying Gamma

Ben Bolker bbolker at gmail.com
Fri Nov 18 05:03:19 CET 2011


Peter Minting <peter_minting <at> hotmail.com> writes:

> 
> 
> Dear R help,
> I am trying to work out if I am justified in
> log-transforming data and specifying Gamma in the same glm.
> Does it have to be one or the other?

 No, but I've never seen it done. 

> I have attached an R script and the datafile to show what I mean.
> Also, I cannot find a mixed-model that allows Gamma errors
> (so I cannot find a way of including random effects).
> What should I do?
> Many thanks,
> Pete
>   		 	   		  
> 

ToadsBd<-read.table("Bd.txt",header=TRUE,
   colClasses=c(rep("factor",2),rep("numeric",3),"factor"))

with(ToadsBd,table(group,site))
## 47 toads, 3 groups per site
library(ggplot2)
library(mgcv)

## plot points, add linear regressions per group/site

ggplot(ToadsBd,aes(x=startg,y=logBd,colour=group))+
  geom_point()+facet_grid(.~site)+geom_smooth(method="lm")

## not much going on with startg ...  PERHAPS
## similar slopes across sites?

ggplot(ToadsBd,aes(x=site:group,y=logBd,colour=site))+
  geom_boxplot()+geom_point()

## I'm curious -- I thought the groups were just blocking
## factors, but maybe not? The patterns of group 1, 2, 3
## are consistent across sites ...

## take a quick look at the raw data ...
ggplot(ToadsBd,aes(x=site:group,y=Bd,colour=site))+
  geom_boxplot()+geom_point()

mod1 <- lm(logBd~group*site*startg,data=ToadsBd)
summary(mod1)

oldpar <- par(mfrow=c(2,2))
plot(mod1)
par(oldpar)

## we definitely have to take care of the heteroscedasticity ...

library(MASS)
boxcox(mod1)

## square root transform ... ??
ToadsBd <- transform(ToadsBd,sqrtLogBd=sqrt(logBd))
mod2 <- lm(sqrtLogBd~group*site*startg,data=ToadsBd)

oldpar <- par(mfrow=c(2,2))
plot(mod2)
par(oldpar)

## still not perfect, but perhaps OK
library(coefplot2)
coefplot2(mod2)

mod3 <- update(mod2,.~.-group:site:startg)
coefplot2(mod3)
drop1(mod3,test="F")
mod4 <- update(mod3,.~group+site+startg)
coefplot2(mod4)

## look at results on new (transformed) scale
ggplot(ToadsBd,aes(x=site:group,y=sqrtLogBd,colour=site))+
  geom_boxplot()+geom_point()

## Conclusions:
##  don't mess around with random effects for only three groups in two sites
##  I have done a fair amount of stepwise selection, so the p-values
##     really can't be taken seriously, but it was clear from the
##    beginning that there were differences among groups, which *seem*
##    to be consistent among sites (which really makes me wonder what
##    the groups are.  (The weak effect of site might well go away
##    once one took the effect of snooping into account ...)
## sqrt(log(x)) seems to be adequate to get reasonably
##   homogeneous variances, but it really is a very strong transformation.
##   It makes the results somewhat hard to interpret.  Alternatively,
##    you could just look at a nonparametric test (e.g. Kruskal-Wallis
##    on site:group), but nonparametric tests make it hard to
##    add lots of structure to the model



More information about the R-help mailing list