[R] Problem With Model.Tables Function
Gary Whysong
gwhysong at cactus.east.asu.edu
Sat Mar 10 20:44:26 CET 2001
I am using R for the first time in one of my classes. My students have
alerted me to a problem for which we have not found an answer. We find
that some means returned by the model.tables function are not correct when
data are missing in analysis of variance problems. We have reproduced the
problem using R 1.2.0, 1.2.1, and 1.2.2 under Windows 98 and several
distributions of Linux (Red Hat 7.0, Mandrake 7.2, and SuSE 7.0 and 7.1).
The situation is best illustrated with a small example of a randomized
block design having three treatments and four blocks.
> blocks<-factor(c(1,2,3,4,1,2,3,4,1,2,3,4))
> trtmnts<-factor(c(1,1,1,1,2,2,2,2,3,3,3,3))
> data<-c(10,12,9,11,13,15,11,16,18,22,17,19)
> balanced<-aov(data~blocks+trtmnts)
> summary(balanced)
            Df  Sum Sq Mean Sq F value    Pr(>F)
blocks       3  28.250   9.417  10.273  0.008868 **
trtmnts      2 147.167  73.583  80.273 4.676e-05 ***
Residuals    6   5.500   0.917
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> model.tables(balanced,"means")
Tables of means
Grand mean
14.41667

 blocks
     1      2      3      4
13.667 16.333 12.333 15.333

 trtmnts
    1     2     3
10.50 13.75 19.00
Entering the data again, but dropping the observations for treatment 2 in
block 3 and treatment 3 in block 4, we have:
> blocks2<-factor(c(1,2,3,4,1,2,4,1,2,3))
> trtmts2<-factor(c(1,1,1,1,2,2,2,3,3,3))
> data2<-c(10,12,9,11,13,15,16,18,22,17)
> unbalanced<-aov(data2~blocks2+trtmts2)
> summary(unbalanced)
            Df  Sum Sq Mean Sq F value    Pr(>F)
blocks2      3  18.267   6.089  7.4341 0.0410993 *
trtmts2      2 126.557  63.279 77.2587 0.0006367 ***
Residuals    4   3.276   0.819
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> model.tables(unbalanced,"means")
Tables of means
Grand mean
14.3

 blocks2
        1     2  3    4
    13.67 16.33 13 13.5
rep  3.00  3.00  2  2.0

 trtmts2
        1     2     3
    10.68 14.47 18.97
rep  4.00  3.00  3.00
We find that the treatment means (trtmts2) are incorrect, although the
numbers of replications reported are correct. The block means (blocks2)
are correct.
The treatment means, computed directly from the observations in each
treatment, should be 10.5, 14.67, and 19.0, respectively.
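As a quick check on those expected values, the raw means can be computed directly with tapply(). This is only a minimal sketch that re-enters the unbalanced data from above; the names trt_means and blk_means are introduced here purely for illustration.

```r
# Unbalanced data: treatment 2/block 3 and treatment 3/block 4 are missing
blocks2 <- factor(c(1,2,3,4,1,2,4,1,2,3))
trtmts2 <- factor(c(1,1,1,1,2,2,2,3,3,3))
data2   <- c(10,12,9,11,13,15,16,18,22,17)

# Raw (unadjusted) means of the observations within each factor level
trt_means <- tapply(data2, trtmts2, mean)
blk_means <- tapply(data2, blocks2, mean)

print(round(trt_means, 2))   # 10.50 14.67 19.00
print(round(blk_means, 2))   # 13.67 16.33 13.00 13.50
```

The block means agree with the model.tables output above, while the treatment means do not.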
Further investigation reveals that we encounter this problem whenever we
deal with unequal replication or missing data, for example with unequal
subsamples or with missing data in factorial experiments. We can get
the correct means by using regression techniques (lm) to solve the
analysis of variance problems and extracting the fitted values from the
appropriate lm model.
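One way to sketch the lm route is shown below. It assumes that the "appropriate lm model" for the treatment means is the one containing only the treatment factor, whose fitted values are the per-treatment means; the names fit_trt and lm_means are hypothetical.

```r
# Unbalanced data re-entered so the sketch is self-contained
trtmts2 <- factor(c(1,1,1,1,2,2,2,3,3,3))
data2   <- c(10,12,9,11,13,15,16,18,22,17)

# Assumption: a model with the treatment factor alone, so that each
# fitted value equals the mean of the observations in that treatment
fit_trt <- lm(data2 ~ trtmts2)

# Fitted values are constant within a treatment; collapse to one per level
lm_means <- tapply(fitted(fit_trt), trtmts2, mean)
print(round(lm_means, 2))   # 10.50 14.67 19.00
```

The same idea with blocks2 as the only term would give the block means.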
Since I am learning R, perhaps I have missed something? Is this possibly
a bug in the model.tables function?
------------------------------------------------------------
Gary Whysong, Associate Professor, Environmental Resources
Morrison School of Agribusiness & Resource Management
Arizona State University East
Phone: (480) 727-1263, E-mail: gwhysong at Cactus.east.asu.edu
------------------------------------------------------------