[R] anova in unbalanced data

S Ellison S.Ellison at LGCGroup.com
Tue Aug 14 13:30:01 CEST 2012


> -----Original Message-----
> Say I have the following data:
> a<-data.frame(col1=c(rep("a",5),rep("b",7)),col2=runif(12))
> a_aov<-aov(a$col2~a$col1)
> summary(a_aov)
> Note that there are 5 observations for a and 7 for b, so the data are
> unbalanced. What would be the correct way of doing anova for this set?

As this is a single-factor case, the imbalance doesn't affect the interpretation. For two-way and higher models it would, and John Fox's post (and a very large literature) then applies. But here the usual variants (Type I, II and III sums of squares) and contrast choices all return the same p-value, so aov works, as does
anova(lm(col2~col1, data=a)) #note that the 'data' argument also works in aov
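
A minimal sketch to check the point above, using only base R. The fixed seed, the explicit factor() call and the object names (fit_aov, fit_lm, fit_sum, p_*) are assumptions added for illustration and are not part of the original question. It compares the p-value from aov, from anova(lm(...)), from lm refitted with sum-to-zero contrasts, and from an equal-variance two-sample t-test, which tests the same hypothesis when the factor has two levels.

```r
# Assumption: a fixed seed, so the random data are reproducible.
set.seed(1)

# Same layout as in the question: 5 observations for "a", 7 for "b".
a <- data.frame(col1 = factor(c(rep("a", 5), rep("b", 7))),
                col2 = runif(12))

# One-way ANOVA fitted two ways.
fit_aov <- aov(col2 ~ col1, data = a)
fit_lm  <- lm(col2 ~ col1, data = a)

# Same model with sum-to-zero contrasts instead of the default
# treatment contrasts; the coefficients change, the F test does not.
fit_sum <- lm(col2 ~ col1, data = a,
              contrasts = list(col1 = "contr.sum"))

# Extract the p-value for col1 from each fit.
p_aov <- summary(fit_aov)[[1]][["Pr(>F)"]][1]
p_lm  <- anova(fit_lm)[["Pr(>F)"]][1]
p_sum <- anova(fit_sum)[["Pr(>F)"]][1]

# With two groups, the pooled-variance t-test is equivalent to the F test.
p_t <- t.test(col2 ~ col1, data = a, var.equal = TRUE)$p.value

print(c(aov = p_aov, lm = p_lm, contr.sum = p_sum, t.test = p_t))
```

All four values should agree up to floating-point tolerance. That agreement is specific to the one-factor case; with two or more factors and unequal cell counts the different sums-of-squares types generally test different hypotheses.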

