[R] nested anova<-R chrashing

Tue Sep 13 04:51:35 CEST 2011

joerg stephan <stephanj <at> rhrk.uni-kl.de> writes:

> 
> Hi,
> 
> I tried to do a nested Anova with the attached Data. My response 
> variable is "survivors" and I would like to know the effect of 
> (insect-egg clutch) "size", "position" (of clutch on twig) and "clone" 
> (/plant genotype) on the survival of eggs (due to predation). Each plant 
> was provided with three different sizes of clutches (45,15,5) and had 
> pseudo-replications of size 15 and 5 on it.

This may not be what you wanted at all, but here is my take on
what you could try with these data.

You have binomial data with lots of zeros and ones (i.e. 0 or 100%
survival), so they are unlikely to be transformable to normality
however you try.  You *might* be able to get away with modeling
these as normally distributed and then rely on permutation tests
to get your significance levels right, but (tricky as it is) I
actually think GLMMs (Zuur et al 2009, Bolker et al 2009) are
the best way to go ...

## read in the data
x <- read.table("aovmisc.dat",header=TRUE)

## take a look
summary(x)
## with(x,table(clone,as.factor(size),as.factor(position)))

## make categorical versions of size & plant variables,
##  and reconstitute the "total number surviving" variable
x <- transform(x,
               fsize=factor(size),
               fplant=factor(plant),
               tsurv=round(survivors*size)
               )

library(ggplot2)

## trying to plot everything ...
zspace <- opts(panel.margin=unit(0,"lines"))  ## squash panels together
theme_update(theme_bw())                      ## white background

## plot all data: survival vs position, with all plants represented;
##  separate subplots for each clone
## overlay GLM fit
ggplot(x,aes(x=position,y=survivors,size=size,colour=fplant))+
   geom_point(alpha=0.5)+
  facet_wrap(~clone)+xlim(0,15)+geom_line(size=0.5,alpha=0.2)+zspace+
  geom_smooth(aes(group=1,weight=size),
              colour="black",method="glm",family="binomial")

## it looks like there is a positive effect of position, and also
##  a positive effect of size (large bubbles toward high-survival)

## there are a few outliers in 'position' -- are these typos?
subset(x,position>15)

## focus on size and clone instead, suppress position:

ggplot(x,aes(x=fsize,y=survivors,colour=clone))+stat_sum(aes(size=..n..))+
  facet_grid(.~clone)+
  geom_smooth(method="glm",family="binomial",aes(weight=size,
                             x=as.numeric(fsize)))
## OR
## +  stat_summary(fun.y="mean",geom="line",aes(x=as.numeric(fsize)))+zspace

library(mgcv)
ggplot(x,aes(x=position,y=survivors,colour=clone))+stat_sum(aes(size=..n..))+
  facet_grid(fsize~clone)+
  geom_smooth(method="gam",family="binomial")+xlim(0,15)+zspace

## appears to be an effect of size, and a clone*size interaction.
## possible bimodal distribution (0/1) at size=5?

with(x,table(clone,plant)) ## each plant is only in one clone 
## (explicit nesting)
library(lme4)

## fit with size*clone interaction, main effect of position
## no need to nest plant in clone because they are explicitly nested
##  (i.e. unique plant IDs)
g1 <- glmer(cbind(tsurv,size)~fsize*clone+position+(1|fplant),
            data=x,
            family="binomial")

## doesn't LOOK like overdispersion ...
sum(residuals(g1)^2)  ## >> residual df
nrow(x)-length(fixef(g1))
pchisq(778,df=537)

## ... but try fitting observation-level random effect anyway
x <- transform(x,obs=seq(nrow(x)))
g2 <- update(g1,.~.+(1|obs))
summary(g2)
## among-observation variance 2x among-plant variance
anova(g1,g2)  ## looks much better (10 AIC units lower / p <<< 0.05

## 
drop1(g2,test="Chisq")
## looks like we can drop the size*clone interaction?

g3 <- update(g2,.~.-clone:fsize)
drop1(g3,test="Chisq")
## strong effects of size, position;
## weak effects of clone
fixef(g3)

>