[R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.

Henrik Singmann henrik.singmann at psychologie.uni-freiburg.de
Mon Jul 23 00:56:16 CEST 2012


Dear John,

indeed, you are quite right. Including the covariate as is doesn't make any sense. The only correct way would be to center it on its mean beforehand. So the examples in my first and second mails are bogus (I add a corrected example at the end), and the reported tests do not make much sense.
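
To illustrate why the centering matters, here is a minimal sketch on simulated data (not the OBrienKaiser data; the names d, group, age.c, fit.raw and fit.ctr are made up for this illustration). It uses plain lm() with R's default treatment contrasts rather than car::Anova, so it only shows the underlying point: with an interaction in the model, the group coefficients are differences at covariate value 0, so they move when the covariate is centered, while the slopes and the fitted values stay the same.

```r
set.seed(1)
n <- 30  # hypothetical sample size

# Simulated data: three groups and an age covariate far away from zero.
d <- data.frame(
  group = factor(rep(c("control", "A", "B"), each = n / 3)),
  age   = sample(18:35, n, replace = TRUE)
)
d$y     <- rnorm(n, mean = 5 + 0.1 * d$age)
d$age.c <- d$age - mean(d$age)  # mean-centered covariate

fit.raw <- lm(y ~ group * age,   data = d)  # group effects evaluated at age = 0
fit.ctr <- lm(y ~ group * age.c, data = d)  # group effects evaluated at mean age

# The group coefficients differ between the two parameterizations ...
cbind(raw = coef(fit.raw)[2:3], centered = coef(fit.ctr)[2:3])

# ... but the age slope and the interaction coefficients do not.
cbind(raw = coef(fit.raw)[4:6], centered = coef(fit.ctr)[4:6])

# The shift in a group coefficient is mean(age) times its interaction slope.
shifted.B <- coef(fit.raw)["groupB"] + mean(d$age) * coef(fit.raw)["groupB:age"]
```

Both fits describe the same regression surface; only the point at which the "main effect" of group is evaluated changes.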

Let me try to explain why I want to discard the interactions of the covariate with the within-factors. The reason I want to exclude them is that I want to stay within the ANCOVA framework. I looked at the three books on experimental design I have on my desk (Winer, 1971; Kirk, 1982; Maxwell & Delaney, 2003) and they unanimously define the ANCOVA as the ANOVA on the responses controlled for the covariate only (i.e., not controlled for the covariate and the interactions with the other effects).
However, as you say, adding or removing an interaction among the orthogonal within-subject factors does indeed not alter the remaining results (example at the end), so one could just use the output and discard the unwanted effects, although admittedly this seems questionable when the discarded effects are significant.
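
A small base-R sketch of why this holds, assuming the contrast codings that (as far as I know) Anova() uses by default for the within-subject design, i.e. icontrasts = c("contr.sum", "contr.poly"). The names ctr, X.full and X.add are only for illustration. The columns coding phase are the same whether or not phase:hour is in the intra-subject design, and in this balanced layout they are orthogonal to the interaction columns.

```r
# Within-subject design of the OBrienKaiser example: 3 phases x 5 hours.
phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
                levels = c("pretest", "posttest", "followup"))
hour  <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)

# Assumed contrast codings (sum-to-zero for factors, polynomial for ordered).
ctr <- list(phase = "contr.sum", hour = "contr.poly")

X.full <- model.matrix(~ phase * hour, idata, contrasts.arg = ctr)
X.add  <- model.matrix(~ phase + hour, idata, contrasts.arg = ctr)

phase.cols <- c("phase1", "phase2")
int.cols   <- grep(":", colnames(X.full), value = TRUE)

# The phase contrasts do not depend on whether the interaction is included.
same.phase <- isTRUE(all.equal(X.full[, phase.cols], X.add[, phase.cols]))

# All cross-products between phase and phase:hour columns are (numerically) zero.
max.cross <- max(abs(crossprod(X.full[, phase.cols], X.full[, int.cols])))
```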

Unfortunately, my involvement with this issue has led me to another question. Winer and Kirk both discuss a split-plot ANCOVA in which a covariate has been measured for each observation. That is, there is a second matrix with the same layout as the original data matrix, e.g., the body temperature of each person at each measurement for the OBrienKaiser dataset:

OBK.cov <- OBrienKaiser
OBK.cov[,-(1:2)] <- runif(16*15, 36, 41)

Would it be possible to fit the data with this temperature matrix as a covariate using car::Anova? (I thought about this but couldn't work out how to specify the imatrix.)
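
For comparison, a covariate measured at every occasion can be handled outside of car::Anova by stacking both matrices into long format and fitting a mixed model. The sketch below is only that, a sketch under stated assumptions: it uses simulated data with three occasions (not OBrienKaiser), a random-intercept model from the nlme package, and made-up names (resp, temp, long, fit). It swaps in a different technique and is not the multivariate test that Anova() reports.

```r
library(nlme)  # recommended package, normally installed together with R

set.seed(1)
n     <- 16                    # hypothetical number of subjects
times <- c("t1", "t2", "t3")   # hypothetical measurement occasions

id    <- factor(seq_len(n))
group <- factor(rep(c("control", "A"), each = n / 2))

# Wide response matrix (with a subject effect) and a matching covariate matrix.
resp <- matrix(rnorm(n * 3, mean = 5), n, 3, dimnames = list(NULL, times)) +
  rnorm(n)                     # recycled down the rows: one value per subject
temp <- matrix(runif(n * 3, 36, 41), n, 3, dimnames = list(NULL, times))

# Stack both matrices column by column so that each row is one observation.
long <- data.frame(
  id    = rep(id, times = 3),
  group = rep(group, times = 3),
  time  = factor(rep(times, each = n), levels = times),
  y     = as.vector(resp),
  temp  = as.vector(temp)
)

# Random intercept per subject; temp enters additively as a time-varying covariate.
fit <- lme(y ~ group * time + temp, random = ~ 1 | id, data = long)
anova(fit)
```

Whether a random-intercept structure is adequate depends on the data; the point here is only the data layout, with one covariate value per observation.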

Thanks a lot for the helpful responses,
Henrik


PS: Better examples:
# compare the treatment and the phase effect across models.
require(car)
set.seed(1)

# using scale for the covariate:
n.OBrienKaiser <- within(OBrienKaiser, age <- scale(sample(18:35, size = 16, replace = TRUE), scale = FALSE))

phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)), levels=c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)

# Full ANCOVA model:
mod.1 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.1 <- Anova(mod.1, idata=idata, idesign=~phase*hour, type = 3))

#                             Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)                  1     0.968    269.4      1      9 0.000000052 ***
# treatment                    2     0.443      3.6      2      9      0.0719 .
# gender                       1     0.305      3.9      1      9      0.0782 .
# age                          1     0.054      0.5      1      9      0.4902
# treatment:gender             2     0.222      1.3      2      9      0.3232
# phase                        1     0.811     17.2      2      8      0.0013 **
# ...

# removing the between-subject interaction does alter the lower order effects:
mod.2 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment + gender + age, data=n.OBrienKaiser)
(av.2 <- Anova(mod.2, idata=idata, idesign=~phase*hour, type = 3))

# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                      Df test stat approx F num Df den Df       Pr(>F)
# (Intercept)           1     0.959    254.5      1     11 0.0000000059 ***
# treatment             2     0.428      4.1      2     11      0.04644 *
# gender                1     0.271      4.1      1     11      0.06832 .
# age                   1     0.226      3.2      1     11      0.10030
# phase                 1     0.792     19.0      2     10      0.00039 ***
# ...

# removing the within-subject interaction does NOT alter the lower order effects:
mod.3 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.3 <- Anova(mod.3, idata=idata, idesign=~phase+hour, type = 3))
# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                        Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)             1     0.968    269.4      1      9 0.000000052 ***
# treatment               2     0.443      3.6      2      9      0.0719 .
# gender                  1     0.305      3.9      1      9      0.0782 .
# age                     1     0.054      0.5      1      9      0.4902
# treatment:gender        2     0.222      1.3      2      9      0.3232
# phase                   1     0.811     17.2      2      8      0.0013 **
# ...



On 22.07.2012 23:25, John Fox wrote:
> Dear Henrik,
>
> The within-subjects contrasts are constructed by Anova() to be orthogonal in the row-basis of the design, so you should be able to safely ignore the effects in which (for some reason that escapes me) you are uninterested. This would also be true (except for the estimated error) for the between-subjects design if you used "type-II" tests. It's true that the "type-III" between-subjects tests will be affected by the presence of an interaction, but for these tests to make sense at all, you have to formulate the model very carefully. For example, your type-III test for the "main effect" of treatment with the interaction in the model is for the treatment effect at age 0. Does that really make sense to you? Indeed, the type-III tests for the ANOVA (not ANCOVA) model only make sense because I was careful to use contrasts for the between-subjects factors that are orthogonal in the basis of the design:
>
> > contrasts(OBrienKaiser$treatment)
>          [,1] [,2]
> control   -2    0
> A          1   -1
> B          1    1
>
> > contrasts(OBrienKaiser$gender)
>    [,1]
> F    1
> M   -1
>
> Best,
>   John
>
> On Sun, 22 Jul 2012 22:06:58 +0200
>   Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
>> Dear John,
>>
>> thanks for your response. But if I simply ignore the unwanted effects, the estimates of the main effects for the within-subjects factors are distorted (see the rationale below). Or doesn't this hold for between-within interactions?
>>
>> Or put another way: Do you think this approach is the correct way of running an ANCOVA involving within-subject factors?
>>
>> As far as I understand ANCOVA, the covariate(s) should only be additive factors and do not interact with the factors of interest:
>> "Suppose that differences in [the mean of the covariate] are due to sources of variation related to [the mean of the dependent variable], but not directly related to the treatment effects." (Winer, 1971, p. 753; the parts in square brackets replace the mathematical symbols with their definitions).
>>
>> Best,
>> Henrik
>>
>> PS: Showing that adding the interaction term massively changes the main effect for a between-factor:
>>
>>> # The ANCOVA:
>>> Anova(lm(pre.1 ~ treatment + age, data = n.OBrienKaiser), type = 3)
>> Anova Table (Type III tests)
>>
>> Response: pre.1
>>               Sum Sq Df F value Pr(>F)
>> (Intercept)    0.0  1    0.01   0.90
>> treatment      0.3  2    0.06   0.94
>> age            4.5  1    1.54   0.24
>> Residuals     34.9 12
>>>
>>> # The ANOVA:
>>> Anova(lm(pre.1 ~ treatment, data = n.OBrienKaiser), type = 3)
>> Anova Table (Type III tests)
>>
>> Response: pre.1
>>               Sum Sq Df F value     Pr(>F)
>> (Intercept)  225.6  1   74.47 0.00000097 ***
>> treatment      1.1  2    0.17       0.84
>> Residuals     39.4 13
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>> # The model with interaction
>>> Anova(lm(pre.1 ~ treatment * age, data = n.OBrienKaiser), type = 3)
>> Anova Table (Type III tests)
>>
>> Response: pre.1
>>                 Sum Sq Df F value Pr(>F)
>> (Intercept)     3.01  1    1.40  0.264
>> treatment      13.71  2    3.18  0.085 .
>> age            11.56  1    5.37  0.043 *
>> treatment:age  13.37  2    3.11  0.089 .
>> Residuals      21.53 10
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>
>>
>> On 22.07.2012 16:59, John Fox wrote:
>>> Dear Henrik,
>>>
>>> As you discovered, entering the covariate age additively into the between-subject model doesn't prevent Anova() from reporting tests for the interactions between age and the within-subjects factors. I'm not sure why you would want to do so, but you could simply ignore these tests.
>>>
>>> I hope this helps,
>>>    John
>>>
>>> --------------------------------
>>> John Fox
>>> Senator William McMaster
>>>     Professor of Social Statistics
>>> Department of Sociology
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> http://socserv.mcmaster.ca/jfox
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Henrik Singmann
>>>> Sent: July-21-12 1:29 PM
>>>> To: r-help at stat.math.ethz.ch
>>>> Subject: [R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.
>>>>
>>>> Dear list,
>>>>
>>>> I would like to run an ANCOVA using car::Anova with repeated measures
>>>> factors, but I can't figure out how to do it. My (between-subjects)
>>>> covariate always interacts with my within-subject factors.
>>>> As far as I understand ANCOVA, covariates usually do not interact with
>>>> the effects of interest but are simply additive (or am I wrong here?).
>>>>
>>>> More specifically, I can add a covariate as a term to the between-subjects
>>>> part when fitting the MLM, and it behaves as expected (i.e., does not
>>>> interact with the other factors), but when calling Anova on the model, I
>>>> don't know how to specify the between-within design (i.e., which parts of
>>>> the model should interact with the repeated-measures factors).
>>>>
>>>> As far as I understand it, neither the idesign, icontrasts, or imatrix
>>>> arguments nor the linearHypothesis function can specify the within-between
>>>> design (as far as I can tell, they all specify the within- or
>>>> intra-subject design; see John Fox's slides from useR! 2011:
>>>> http://web.warwick.ac.uk/statsdept/useR-2011/TalkSlides/Contributed/17Aug_1705_FocusV_4-Multivariate_1-Fox.pdf).
>>>>
>>>> If this is not possible using car::Anova, is there another way to
>>>> achieve what I want, or is it plainly wrong?
>>>> I have the feeling that this could be possible using R's "New Functions
>>>> for Multivariate Analysis" (Dalgaard, 2007, R News), but some advice on
>>>> how would be greatly appreciated, as this does not seem to be the most
>>>> straightforward way.
>>>>
>>>> Below is an example using the car::OBrienKaiser dataset with an added age
>>>> covariate. The example is merely an adaptation of the one in ?Anova with
>>>> minimal changes and includes, e.g., age:phase:hour, which I don't want to have.
>>>>
>>>> Note that I posted this question to stackoverflow two days ago
>>>> (http://stackoverflow.com/q/11567446/289572) and did not receive any
>>>> responses. Please excuse my "crossposting", but I think R-help may be
>>>> the better place.
>>>>
>>>> Best,
>>>> Henrik
>>>>
>>>> PS: I know that the posting guide says "No questions about contributed
>>>> packages" but there are some questions about car on R-help, so I
>>>> thought this would be the correct place.
>>>>
>>>> ###### Example follows #####
>>>>
>>>> require(car)
>>>> set.seed(1)
>>>>
>>>> n.OBrienKaiser <- within(OBrienKaiser, age <- sample(18:35, size = 16, replace = TRUE))
>>>>
>>>> phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
>>>>                 levels=c("pretest", "posttest", "followup"))
>>>> hour <- ordered(rep(1:5, 3))
>>>> idata <- data.frame(phase, hour)
>>>>
>>>> mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
>>>>              fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
>>>> (av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour, type = 3))
>>>>
>>>> # Type II Repeated Measures MANOVA Tests: Pillai test statistic
>>>> #                             Df test stat approx F num Df den Df      Pr(>F)
>>>> # (Intercept)                  1     0.971    299.9      1      9 0.000000032 ***
>>>> # treatment                    2     0.492      4.4      2      9     0.04726 *
>>>> # gender                       1     0.193      2.1      1      9     0.17700
>>>> # age                          1     0.045      0.4      1      9     0.53351
>>>> # treatment:gender             2     0.389      2.9      2      9     0.10867
>>>> # phase                        1     0.855     23.6      2      8     0.00044 ***
>>>> # treatment:phase              2     0.696      2.4      4     18     0.08823 .
>>>> # gender:phase                 1     0.079      0.3      2      8     0.71944
>>>> # age:phase                    1     0.140      0.7      2      8     0.54603
>>>> # treatment:gender:phase       2     0.305      0.8      4     18     0.53450
>>>> # hour                         1     0.939     23.3      4      6     0.00085 ***
>>>> # treatment:hour               2     0.346      0.4      8     14     0.92192
>>>> # gender:hour                  1     0.286      0.6      4      6     0.67579
>>>> # age:hour                     1     0.262      0.5      4      6     0.71800
>>>> # treatment:gender:hour        2     0.539      0.6      8     14     0.72919
>>>> # phase:hour                   1     0.663      0.5      8      2     0.80707
>>>> # treatment:phase:hour         2     0.893      0.3     16      6     0.97400
>>>> # gender:phase:hour            1     0.700      0.6      8      2     0.76021
>>>> # age:phase:hour               1     0.813      1.1      8      2     0.56210
>>>> # treatment:gender:phase:hour  2     1.003      0.4     16      6     0.94434
>>>> # ---
>>>> # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>>
>>>>
>>>> --
>>>> Dipl. Psych. Henrik Singmann
>>>> PhD Student
>>>> Albert-Ludwigs-Universität Freiburg
>>>> http://www.psychologie.uni-freiburg.de/Members/singmann
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Dipl. Psych. Henrik Singmann
>> PhD Student
>> Albert-Ludwigs-Universität Freiburg
>> http://www.psychologie.uni-freiburg.de/Members/singmann
>

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann


