[R] Unbalanced Anova: What is the best approach?

John Fox jfox at mcmaster.ca
Sun Apr 3 15:28:29 CEST 2011

Dear Krishna,

Although it's difficult to explain briefly, I'd argue that balanced and
unbalanced ANOVA are not fundamentally different, in that the focus should
be on the hypotheses that are tested, and these are naturally expressed as
functions of cell means and marginal means. For example, in a two-way ANOVA,
the null hypothesis of no interaction is equivalent to parallel profiles of
cell means for one factor across levels of the other. What is different,
though, is that in a balanced ANOVA all common approaches to constructing an
ANOVA table coincide.
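
To make the last point concrete, here is a minimal sketch in base R. The
data are simulated, and the names A, B, y, and ss_A are made up purely for
illustration. It compares the sequential sum of squares for A when A is
entered first and when it is entered second, in a balanced and in an
unbalanced version of the same design.

```r
# Simulated two-factor data; A, B and y are hypothetical names.
set.seed(1)
balanced <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:4)
balanced$y <- rnorm(nrow(balanced))

# Drop a few rows so that the cell counts differ (unbalanced design).
unbalanced <- balanced[-c(1, 2, 3, 7), ]

# Sequential sum of squares for A, entered first vs. entered second.
ss_A <- function(d) {
  c(first  = anova(lm(y ~ A + B, data = d))["A", "Sum Sq"],
    second = anova(lm(y ~ B + A, data = d))["A", "Sum Sq"])
}

ss_bal   <- ss_A(balanced)    # the two values agree
ss_unbal <- ss_A(unbalanced)  # the two values generally differ

# Cell means; under no interaction their profiles are parallel.
cell_means <- with(unbalanced, tapply(y, list(A, B), mean))
```

In the balanced data the order of the terms is irrelevant; once the cell
counts differ, the sequential sums of squares depend on the order.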

Without getting into the explanation in detail (which you can find in a text
like my Applied Regression Analysis and Generalized Linear Models),
so-called type-I (or sequential) tests, such as those performed by the
standard anova() function in R, test hypotheses that are rarely of
substantive interest, and, even when they are, are of interest only by
accident. So-called type-II tests, such as those performed by default by the
Anova() function in the car package, test hypotheses that are almost always
of interest. Type-III tests, which the Anova() function in car can perform
optionally, require careful formulation of the model for the hypotheses
tested to be sensible, and even then have less power than corresponding
type-II tests in the circumstances in which a test would be of interest.
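
As a rough sketch of how the three kinds of tests can be obtained, assuming
the car package is installed: the data below are simulated, the names d, A,
B, and y are hypothetical, and a two-way model is used only to keep the
example short. For the type-III tests the model is refit with sum-to-zero
contrasts, which is one common way of formulating the model so that the
hypotheses tested are sensible.

```r
library(car)  # provides Anova(); assumed to be installed

# Simulated unbalanced two-factor data; all names are hypothetical.
set.seed(2)
d <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:5)
d$y <- rnorm(nrow(d))
d <- d[-c(1, 2, 3, 7, 8), ]  # remove rows to unbalance the design

fit <- lm(y ~ A * B, data = d)

t1 <- anova(fit)  # type-I (sequential) tests
t2 <- Anova(fit)  # type-II tests, the default in car

# Type-III tests: refit with contrasts that sum to zero for each factor.
fit3 <- lm(y ~ A * B, data = d,
           contrasts = list(A = "contr.sum", B = "contr.sum"))
t3 <- Anova(fit3, type = "III")

t1
t2
t3
```

The test for the highest-order term (here A:B) is the same in all three
tables; the tables differ in how they test the main effects.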

Since you're addressing fixed-effects models, I'm not sure why you
introduced nlme and lme4 into the discussion, but I note that Anova() in the
car package has methods that can produce type-II and -III Wald tests for the
fixed effects in mixed models fit by lme() and lmer().
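
For completeness, a minimal sketch of the mixed-model case, assuming the
car, lme4, and nlme packages are installed. It uses the sleepstudy data
shipped with lme4 only as a convenient example; the particular models are
illustrative choices, not a recommendation for your data.

```r
library(car)   # Anova() methods for mixed models
library(lme4)  # lmer() and the sleepstudy data

# Random intercept and slope for each subject (illustrative model).
m_lmer <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

w2 <- Anova(m_lmer)                # type-II Wald chi-square tests
w3 <- Anova(m_lmer, type = "III")  # type-III Wald chi-square tests

# A comparable random-intercept model fit with lme() from nlme.
m_lme <- nlme::lme(Reaction ~ Days, random = ~ 1 | Subject,
                   data = sleepstudy)
w_lme <- Anova(m_lme)

w2
w3
w_lme
```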

Your question has been asked several times before on the r-help list. For
example, if you enter terms like "type-II" or "unbalanced ANOVA" in the
RSeek search engine and look under the "Support Lists" tab, you'll see many
hits -- e.g.,

I hope this helps,

John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Krishna Kirti Das
> Sent: April-03-11 3:25 AM
> To: r-help at r-project.org
> Subject: [R] Unbalanced Anova: What is the best approach?
> I have a three-way unbalanced ANOVA that I need to calculate (fixed
> effects plus interactions, no random effects). But word has it that aov()
> is good only for balanced designs. I have seen a number of different
> recommendations for working with unbalanced designs, but they seem to
> differ widely (car, nlme, lme4, etc.). So I would like to know what is the
> best or most usual way to go about working with unbalanced designs and
> extracting a reliable ANOVA table from them in R?
