[R] Example of cell means model
Francisco Vergara
gerifalte28 at hotmail.com
Wed Oct 15 20:17:03 CEST 2003
Thanks a lot for your reply. This helps a lot!
Just to confirm, using lm this model will give me the mean yield value for
each cell in the two way array. Now if I want to obtain the mean
of group means (like a SS type III approach) using LME (since I have random
effects in the model) how can I parametrize this?
I could definitivelly use xtabs in a two-way case but in my case I have 2
other (continuous) covariates that are potential confounders in
the model so I need to keep them to obtain the corrected means.
I added a continuous variable (NewVar) to the dataset Newxmp11.07 and
obtaned a model with the covariate (fm4) and another without it (fm3)
>Newxmp11.07<-fix(xmp11.07)
>str(Newxmp11.07)
`data.frame': 36 obs. of 4 variables:
$ Yield : num 10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
$ Variety: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ Density: Factor w/ 4 levels "1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
$ NewVar : num 10 9 7 12 11 11 12 11 15 16 ...
>fm3<-gls(Yield~-1+Variety + Density, xmp11.07)
>fm4<-gls(Yield~-1+Variety + Density+NewVar, Newxmp11.07)
>fm3
Coefficients:
Variety1 Variety2 Variety3 Density2 Density3 Density4
8.922222 9.797222 15.713889 2.911111 4.300000 2.433333
Degrees of freedom: 36 total; 30 residual
Residual standard error: 1.239243
>fm4
Coefficients:
Variety1 Variety2 Variety3 Density2 Density3 Density4
8.75757265 9.70589316 15.43347009 2.88152564 4.32186752 2.40117521
NewVar
0.01157692
Degrees of freedom: 36 total; 29 residual
Residual standard error: 1.252991
fm4 gives me the mean of the group means for all the varieties but
apparently it gives me the treatment contrasts for the densities. If I
change the order of the factors in the model specification I get
>coef(fm5<-gls(Yield~-1+ Density+Variety+NewVar, Newxmp11.07))
Density1 Density2 Density3 Density4 Variety2 Variety3
8.75757265 11.63909829 13.07944017 11.15874786 0.94832051 6.67589744
NewVar
0.01157692
This, just like fm4 will include the original intercept value in Density 1
which is not the actual density 1 mean. What am I missing? I am sorry if
these questions are very basic but I want to make sure that I understand
what I am doing. I guess that this is the price that I am paying for having
used in the past packages like SAS where you just ask for lsmeans and the
software will give you a "black box" answer!
Best regards
Francisco
>From: Douglas Bates <bates at stat.wisc.edu>
>To: "Francisco Vergara" <gerifalte28 at hotmail.com>
>CC: r-help at r-project.org
>Subject: Re: [R] Example of cell means model
>Date: 15 Oct 2003 08:40:33 -0500
>
>This is an example from chapter 11 of the 6th edition of Devore's
>engineering statistics text. It happens to be a balanced data set in
>two factors but the calculations will also work for unbalanced data.
>
>I create a factor called 'cell' from the text representation of the
>Variety level and the Density level using '/' as the separator
>character. The coefficients for the linear model 'Yield ~ cell - 1'
>are exactly the cell means.
>
> > library(Devore6)
> > data(xmp11.07)
> > str(xmp11.07)
>`data.frame': 36 obs. of 3 variables:
> $ Yield : num 10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
> $ Variety: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
> $ Density: Factor w/ 4 levels "1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
> > xmp11.07$cell = with(xmp11.07, factor(paste(Variety, Density, sep =
>'/')))
> > xtabs(~ Variety + Density, xmp11.07)
> Density
>Variety 1 2 3 4
> 1 3 3 3 3
> 2 3 3 3 3
> 3 3 3 3 3
> > means = xtabs(Yield ~ Variety+Density, xmp11.07)/xtabs(~ Variety +
>Density, xmp11.07)
> > means
> Density
>Variety 1 2 3 4
> 1 9.200000 12.433333 12.900000 10.800000
> 2 8.933333 12.633333 14.500000 12.766667
> 3 16.300000 18.100000 19.933333 18.166667
> > coef(fm1 <- lm(Yield ~ cell - 1, xmp11.07))
> cell1/1 cell1/2 cell1/3 cell1/4 cell2/1 cell2/2 cell2/3
>cell2/4
> 9.200000 12.433333 12.900000 10.800000 8.933333 12.633333 14.500000
>12.766667
> cell3/1 cell3/2 cell3/3 cell3/4
>16.300000 18.100000 19.933333 18.166667
>
>The residual sum of squares for this model is the same as that for the
>model with crossed terms
>
> > deviance(fm1)
>[1] 38.04
> > deviance(fm2 <- lm(Yield ~ Variety * Density, xmp11.07))
>[1] 38.04
>
>but the coefficients are quite different because they represent a
>different parameterization.
>
> > coef(fm2)
> (Intercept) Variety2 Variety3 Density2
> 9.20000000 -0.26666667 7.10000000 3.23333333
> Density3 Density4 Variety2:Density2 Variety3:Density2
> 3.70000000 1.60000000 0.46666667 -1.43333333
>Variety2:Density3 Variety3:Density3 Variety2:Density4 Variety3:Density4
> 1.86666667 -0.06666667 2.23333333 0.26666667
>
>I hope this answers your question. Sorry for the delay in
>responding to you.
>
>"Francisco Vergara" <gerifalte28 at hotmail.com> writes:
>
> > Many thanks for your reply. An example would be very helpful to make
> > sure that I understand correctly what you described.
_________________________________________________________________
Surf and talk on the phone at the same time with broadband Internet access.
Get high-speed for as low as $29.95/month (depending on the local service
providers in your area). https://broadband.msn.com
More information about the R-help
mailing list