[R] Input file format to Anova from car package

Sun Nov 22 05:03:00 CET 2009

Marcelo Laia wrote:
> Dear list member,
> 
> My question is related to input file format to an Anova from car package.
> 
> Here is an example of what I did:
> 
> My file format is like this (and I dislike the idea that I will need
> to recode it):
> 
> Hormone day Block Treatment Plant Diameter High N.Leaves
> SH 23 1 1 1 3.19 25.3 2
> SH 23 1 1 2 3.42 5.5 1
> SH 23 1 2 1 2.19 5.2 2
> SH 23 1 2 2 2.17 7.6 2
> CH 23 1 1 1 3.64 6.5 2
> CH 23 1 1 2 2.8 3.7 2
> CH 23 1 2 1 3.28 4 2
> CH 23 1 2 2 2.82 5.2 2
> SH 23 2 1 1 2.87 6.4 2
> SH 23 2 1 2 2.8 6 2
> SH 23 2 2 1 2.02 4.5 2
> SH 23 2 2 2 3.15 5.5 2
> CH 23 2 1 1 3.22 2.3 2
> CH 23 2 1 2 2.45 3.8 2
> CH 23 2 2 1 1.85 3.5 2
> CH 23 2 2 2 3.13 4.4 2
> CH 39 1 1 1 2.64 6 2
> CH 39 1 1 2 4.33 10 2
> CH 39 1 2 1 3.74 9 2
> CH 39 1 2 2 3.23 8 2
> SH 39 1 1 1 3.8 8 2
> SH 39 1 1 2 2.35 9 2
> SH 39 1 2 1 3.66 6 2
> SH 39 1 2 2 3.92 7 2
> CH 39 2 1 1 3.28 7 2
> CH 39 2 1 2 4.99 7 2
> CH 39 2 2 1 2.49 6 2
> CH 39 2 2 2 4.75 7 2
> SH 39 2 1 1 3.35 5 2
> SH 39 2 1 2 4.38 7 2
> SH 39 2 2 1 5.11 9 2
> SH 39 2 2 2 2.71 5 2
> 
> idata <- data.frame(Idade=factor(c(23,39)))
> a = read.table("clipboard", sep=" ", head=T)
> mod.ok <- lm(Diameter ~  Treatment*Hormone, data=a)
> av.ok <- Anova(mod.ok, idata=idata, idesign=~as.factor(day))
> summary(av.ok)
>      Sum Sq               Df           F value            Pr(>F)
>  Min.   : 0.02153   Min.   : 1.00   Min.   :0.02828   Min.   :0.5105
>  1st Qu.: 0.06169   1st Qu.: 1.00   1st Qu.:0.06346   1st Qu.:0.6331
>  Median : 0.20667   Median : 1.00   Median :0.09863   Median :0.7558
>  Mean   : 5.43711   Mean   : 7.75   Mean   :0.19043   Mean   :0.7113
>  3rd Qu.: 5.58208   3rd Qu.: 7.75   3rd Qu.:0.27150   3rd Qu.:0.8117
>  Max.   :21.31356   Max.   :28.00   Max.   :0.44437   Max.   :0.8677
>                                     NA's   :1.00000   NA's   :1.0000
> 
> This result is wrong, I believe.
> 
It's wrong because your use of Anova is inappropriate here.
mod.ok should be an object of class "mlm" for this use of Anova.

  class(mod.ok)

or

  str(mod.ok)

would be useful.
See below for further comments.

> Here, is a file format with repeated measures side-by-side:
> 
> Hormone Block Treatment Plant Diameter.23 Diameter.39 High.23 High.39
> N.Leaves.23 N.Leaves.39
> SH 1 1 1 3.19 2.64 25.3 6 2 2
> SH 1 1 2 3.42 4.33 5.5 10 1 2
> SH 1 2 1 2.19 3.74 5.2 9 2 2
> SH 1 2 2 2.17 3.23 7.6 8 2 2
> CH 1 1 1 3.64 3.8 6.5 8 2 2
> CH 1 1 2 2.8 2.35 3.7 9 2 2
> CH 1 2 1 3.28 3.66 4 6 2 2
> CH 1 2 2 2.82 3.92 5.2 7 2 2
> SH 2 1 1 2.87 3.28 6.4 7 2 2
> SH 2 1 2 2.8 4.99 6 7 2 2
> SH 2 2 1 2.02 2.49 4.5 6 2 2
> SH 2 2 2 3.15 4.75 5.5 7 2 2
> CH 2 1 1 3.22 3.35 2.3 5 2 2
> CH 2 1 2 2.45 4.38 3.8 7 2 2
> CH 2 2 1 1.85 5.11 3.5 9 2 2
> CH 2 2 2 3.13 2.71 4.4 5 2 2
> 
> idata <- data.frame(day=factor(c(23,39)))
> a = read.table("clipboard", sep=" ", head=T)
> mod.ok <- lm(cbind(Diameter.23,Diameter.39)  ~  Treatment*Hormone, data=a)
> av.ok <- Anova(mod.ok, idata=idata, idesign= ~ as.factor(day))
> summary(av.ok)
> 
> Type II Repeated Measures MANOVA Tests:
> 
> ------------------------------------------
> 
> Term: Treatment
> 
>  Response transformation matrix:
>             (Intercept)
> Diameter.23           1
> Diameter.39           1
> 
> Sum of squares and products for the hypothesis:
>             (Intercept)
> (Intercept)   0.6765062
> 
> Sum of squares and products for error:
>             (Intercept)
> (Intercept)    13.05917
> 
> Multivariate Tests: Treatment
>                  Df test stat  approx F num Df den Df  Pr(>F)
> Pillai            1 0.0492517 0.6216377      1     12 0.44574
> Wilks             1 0.9507483 0.6216377      1     12 0.44574
> Hotelling-Lawley  1 0.0518031 0.6216377      1     12 0.44574
> Roy               1 0.0518031 0.6216377      1     12 0.44574
> 
> ------------------------------------------
> 
> Term: Hormone
> 
>  Response transformation matrix:
>             (Intercept)
> Diameter.23           1
> Diameter.39           1
> 
> Sum of squares and products for the hypothesis:
>             (Intercept)
> (Intercept)  0.09150625
> 
> Sum of squares and products for error:
>             (Intercept)
> (Intercept)    13.05917
> 
> Multivariate Tests: Hormone
>                  Df test stat   approx F num Df den Df  Pr(>F)
> Pillai            1 0.0069583 0.08408456      1     12 0.77679
> Wilks             1 0.9930417 0.08408456      1     12 0.77679
> Hotelling-Lawley  1 0.0070070 0.08408456      1     12 0.77679
> Roy               1 0.0070070 0.08408456      1     12 0.77679
> 
> ------------------------------------------
> 
> Term: Treatment:Hormone
> 
>  Response transformation matrix:
>             (Intercept)
> Diameter.23           1
> Diameter.39           1
> 
> Sum of squares and products for the hypothesis:
>             (Intercept)
> (Intercept)    1.139556
> 
> Sum of squares and products for error:
>             (Intercept)
> (Intercept)    13.05917
> 
> Multivariate Tests: Treatment:Hormone
>                  Df test stat approx F num Df den Df  Pr(>F)
> Pillai            1 0.0802576 1.047132      1     12 0.32636
> Wilks             1 0.9197424 1.047132      1     12 0.32636
> Hotelling-Lawley  1 0.0872610 1.047132      1     12 0.32636
> Roy               1 0.0872610 1.047132      1     12 0.32636
> 
> ------------------------------------------
> 
> Term: as.factor(day)
> 
>  Response transformation matrix:
>             as.factor(day)1
> Diameter.23               1
> Diameter.39              -1
> 
> Sum of squares and products for the hypothesis:
>                 as.factor(day)1
> as.factor(day)1        11.78206
> 
> Sum of squares and products for error:
>                 as.factor(day)1
> as.factor(day)1        15.41527
> 
> Multivariate Tests: as.factor(day)
>                  Df test stat approx F num Df den Df   Pr(>F)
> Pillai            1 0.4332063 9.171726      1     12 0.010496 *
> Wilks             1 0.5667937 9.171726      1     12 0.010496 *
> Hotelling-Lawley  1 0.7643105 9.171726      1     12 0.010496 *
> Roy               1 0.7643105 9.171726      1     12 0.010496 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> ------------------------------------------
> 
> Term: Treatment:as.factor(day)
> 
>  Response transformation matrix:
>             as.factor(day)1
> Diameter.23               1
> Diameter.39              -1
> 
> Sum of squares and products for the hypothesis:
>                 as.factor(day)1
> as.factor(day)1        1.139556
> 
> Sum of squares and products for error:
>                 as.factor(day)1
> as.factor(day)1        15.41527
> 
> Multivariate Tests: Treatment:as.factor(day)
>                  Df test stat approx F num Df den Df  Pr(>F)
> Pillai            1 0.0688353 0.887086      1     12 0.36484
> Wilks             1 0.9311647 0.887086      1     12 0.36484
> Hotelling-Lawley  1 0.0739238 0.887086      1     12 0.36484
> Roy               1 0.0739238 0.887086      1     12 0.36484
> 
> ------------------------------------------
> 
> Term: Hormone:as.factor(day)
> 
>  Response transformation matrix:
>             as.factor(day)1
> Diameter.23               1
> Diameter.39              -1
> 
> Sum of squares and products for the hypothesis:
>                 as.factor(day)1
> as.factor(day)1       0.1501563
> 
> Sum of squares and products for error:
>                 as.factor(day)1
> as.factor(day)1        15.41527
> 
> Multivariate Tests: Hormone:as.factor(day)
>                  Df test stat  approx F num Df den Df  Pr(>F)
> Pillai            1 0.0096468 0.1168889      1     12 0.73835
> Wilks             1 0.9903532 0.1168889      1     12 0.73835
> Hotelling-Lawley  1 0.0097407 0.1168889      1     12 0.73835
> Roy               1 0.0097407 0.1168889      1     12 0.73835
> 
> ------------------------------------------
> 
> Term: Treatment:Hormone:as.factor(day)
> 
>  Response transformation matrix:
>             as.factor(day)1
> Diameter.23               1
> Diameter.39              -1
> 
> Sum of squares and products for the hypothesis:
>                 as.factor(day)1
> as.factor(day)1      0.04305625
> 
> Sum of squares and products for error:
>                 as.factor(day)1
> as.factor(day)1        15.41527
> 
> Multivariate Tests: Treatment:Hormone:as.factor(day)
>                  Df test stat   approx F num Df den Df Pr(>F)
> Pillai            1 0.0027853 0.03351708      1     12 0.8578
> Wilks             1 0.9972147 0.03351708      1     12 0.8578
> Hotelling-Lawley  1 0.0027931 0.03351708      1     12 0.8578
> Roy               1 0.0027931 0.03351708      1     12 0.8578
> 
> Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
> 
>                                      SS num Df Error SS den Df      F  Pr(>F)
> Treatment                        0.3383      1   6.5296     12 0.6216 0.44574
> Hormone                          0.0458      1   6.5296     12 0.0841 0.77679
> Treatment:Hormone                0.5698      1   6.5296     12 1.0471 0.32636
> as.factor(day)                   5.8910      1   7.7076     12 9.1717 0.01050 *
> Treatment:as.factor(day)         0.5698      1   7.7076     12 0.8871 0.36484
> Hormone:as.factor(day)           0.0751      1   7.7076     12 0.1169 0.73835
> Treatment:Hormone:as.factor(day) 0.0215      1   7.7076     12 0.0335 0.85779
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> 
This works because you now have mod.ok as an "mlm" object.

> 
> 
> How I could use Anova from the first file format? If not, could you
> suggest me a way to recode my data file in R?
> 
> I ask because I don't know how I can recode my data file on R. Is ti possible?

Let's call your first data.frame dat.long. Then you can use:

  dat.wide <- reshape(dat, timevar="day",
               idvar = c("Hormone", "Block", "Treatment", "Plant"),
               direction = "wide")

Note that the two data frames you give are not consistent.

You could also investigate the reshape package which makes a
lot of reshaping easier.

  -Peter Ehlers

> 
> Thank you very much!
>