[R] Wrapper of linearHypothesis (car) for post-hoc of repeated measures ANOVA
Helios de Rosario
helios.derosario at ibv.upv.es
Thu Sep 22 18:40:55 CEST 2011
For some time I have been looking for a convenient way of performing
post-hoc analysis to Repeated Measures ANOVA, that would be acceptable
if sphericity is violated (i.e. leaving aside post-hoc to lme models).
The best solution I found was John Fox's proposal to similar requests
in R-help:
http://tolstoy.newcastle.edu.au/R/e2/help/07/09/26518.html
http://tolstoy.newcastle.edu.au/R/e10/help/10/04/1663.html
However, I think that using linearHypothesis() is not as
straightforward as I would desire for testing specific contrasts between
factor levels. The hypotheses must be defined as linear combinations of
the model coefficients (subject to response transformations according to
the intra-subjects design), which may need some involved calculations if
one is thinking on differences between "this and that" factor levels
(either between-subjects or intra-subjects), and the issue gets worse
for post-hoc tests on interaction effects.
For that reason, I have spent some time in writing a wrapper to
linearHypothesis() that might be helpful in those situations. I copy the
commented code at the end of this message, because although I have
successfully used it in some cases, I would like more knowledgeable
people to put it to test (and eventually help me create a worthwile
contribution for other people that could find it useful).
This function (which I have called "factorltest.mlm") needs the
multivariate linear model and the intrasubject-related arguments (idata,
idesign...) that would be passed on to Anova() (from car) for a repeated
measures analysis, or directly the Anova.mlm object returned by Anova()
instead of idata, idesign... (I have tried to explain it clearly in the
commentaries to the code.)
Moreover, it needs an argument "levelcomb": a list that represents the
level combinations of factors to be tested. There are different ways of
representing those combinations (through names of factor levels, or
coefficient vectors/matrices), and depending on the elements of that
list the test is made for main effects, simple effects, interaction
contrasts, etc.
For instance, let me use an example with the OBrienKaiser data set (as
in the help documentation for Anova() and linearHypothesis()).
The calculation of the multivariate linear model and Anova is copied
from those help files:
> phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5,
5)),
+ levels=c("pretest", "posttest", "followup"))
> hour <- ordered(rep(1:5, 3))
> idata <- data.frame(phase, hour)
> mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
+ post.1, post.2, post.3, post.4, post.5,
+ fup.1, fup.2, fup.3, fup.4, fup.5) ~
treatment*gender,
+ data=OBrienKaiser)
> av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour)
Then, let's suppose that I want to test pairwise comparisons for the
significant main effect "treatment" (whose levels are
c("control","A","B")). For the specific contrast between treatment "A"
and the "control" group I can define "levelcomb" in the following
(equivalent) ways:
> levelcomb <- list(treatment=c("A","control"))
> levelcomb <- list(treatment=c(A=1,control=-1))
> levelcomb <- list(treatment=c(-1,1,0))
Now, let's suppose that I am interested in the (marginally) significant
interaction between treatment and phase. First I test the simple main
effect of phase for different levels of treament (e.g. for the "control"
group). To do this, levelcomb must have one variable for each
interacting factor (treatment and phase): levelcomb$treatment will
specify the treatment that I want to fix for the simple main effects
test ("control"), and levelcomb$phase will have a NA value to represent
that I want to test all orthogonal contrasts within that factor:
> levelcomb <- list(treatment="control",phase=NA)
I could also use numeric vectors to define the levels of "treatment"
that I want to fix, as in the previous example, or if I want a more
complicated combination (e.g. if I want to test the phase effect for
pooled treatments "A" and "B"):
> levelcomb <- list(treatment=c(A=1,B=1),phase=NA)
The NA value can be replaced by the specific contrast matrix that I
would like to use. For instance, the previous statement is equivalent to
the following one:
> levelcomb <- list(treatment=c(0,1,1),phase=contr.sum(levels(phase)))
And then, let's see an example of interaction contrast: I want to see
if the difference between the "pre-test" phase and the following two,
itself differs between the "control" group and the two treatments. This
would be
> levelcomb <- list(treatment=c(2,-1,-1),phase=c(2,-1,-1))
or
> levelcomb <-
list(treatment=c(control=2,A=-1,B=-1),phase=c(pretest=2,posttest=-1,followup=-1))
etc.
Now, to perform the test I just use this "levelcomb" list object with
the model and ANOVA previously performed, in the following way:
> factorltest.mlm(mod.ok,levelcomb,av.ok)
Or if I want to use idata and idesign directly:
> factorltest.mlm(mod.ok,levelcomb,idata=idata,idesign=~phase*hour)
Of course, this function only performs one test at a time. To perform
multiple tests as suggested by John, the user should run it in a loop
(and then correct the p-values as he or she see fit).
If you find this useful, please let me know if you also find some
error, opportunity of improvement, or any commentary.
Thanks,
Helios
[Now follows the full commented code:]
# factorltest.mlm(modmlm,levelcomb,aovmlm,...)
#
# Hypothesis testing for linear contrasts of factor levels, for
multivariate
# linear models with response transformation, as in a repeated-measures
ANOVA.
#
# Arguments:
# modmlm: multivariate linear model (mlm object)
# levelcomb: list with the combinations of levels for each factor, in
form of
# a literal expression or numeric matrix (see Details).
# aovmlm: result from Anova to modmlm (i.e. Anova.mlm object)
# ...: additional arguments passed to Anova() --from the car package--
to
# perform an ANOVA with an intra-subject design (see Details).
#
# Details:
# An intra-subject design is required for the response transformation
of the
# linear model. The intra-subject design may be defined by the
arguments
# idata, idesgin (and optionally icontrasts) defined for the function
Anova()
# in the car package, or alternatively by an Anova.mlm object resulting
from
# that function (applied to modmlm). If both modes of defining the
intra-subjects
# design are used, the design implied by aovmlm prevales.
#
# The name of each element in levelcomb must coincide with a factor
(i.e. a main
# effect) analysed in the ANOVA (as defined in aovmlm or indirectly by
modmlm and
# the other arguments). The value of each element can be: (1) the name
of one level
# of that factor, (2) a vector with two level names of that factor --to
perform
# a paired contrast between those levels--, (3) a named vector of
coefficients
# whose names coincide with some levels of the factor, for more
specific linear
# combinations --unspecified coefficients are assumed to be zero--, (4)
an unnamed
# vector of coefficients whose length is equal to the number of levels
in the
# factor --the values of the vector are assigned to the factor levels
in the
# default order--, (5) a matrix with named or unnamed rows, in which
each column
# represents a combination of levels as in (3) or (4).
#
# This function depends on linearHypothesis() in the car package. The
combinations
# specified in levelcomb are tested against zero or a matrix of zeros.
factorltest.mlm <- function(modmlm,levelcomb,aovmlm,...){
## AUXILIARY FUNCTIONS
# checkfactors() is used to check consistency of level
combinations in the
# factors defined in levelcomb, and convert them to numeric
matrices
checkfactors <- function(fframe,fnames,levelcomb){
for (fname in fnames){
f <- fframe[[fname]]
fc <- levelcomb[[fname]]
# Literal expressions must represent one level
or a pair of levels
if ((is.character(fc)) & (!any(is.na(fc)))){
fcnames <- fc
if (length(fc)==1){
# If there is one level,
evaluate the values at that level
fc <- 1
names(fc) <- fcnames
}else if (length(fcnames)==2){
# If there is a pair of levels,
evaluate the contrast
fc <- c(1,-1)
names(fc) <- fcnames
}else{
stop("Incorrect number of
literal levels for factor \"",fname,"\" (must be 1 or 2).")
}
}
# Check the numeric coefficients of the factor
levels
if ((is.numeric(fc)) & (!any(is.na(fc)))){
# Make fc into a matrix (in case it was
a vector)
fc <- as.matrix(fc)
# Unnamed coefficient matrices must have
the same rows as levels in the factor
if (is.null(rownames(fc))){
if (nrow(fc) != nlevels(f)){
stop("Mismatch in the
number of unnamed factor levels for \"",fname,"\".")
}else{
rownames(fc) <-
levels(f)
}
}else{
# Named coefficients are
assigned to a matrix of factor levels, filled in with zeros
flevels <-
match(rownames(fc),levels(f))
fctmp <- fc
fc <-
matrix(0,nrow=nlevels(f),ncol=ncol(fctmp))
if (any(is.na(flevels))){
stop("Mismatch in the
names of factor levels for \"",fname,"\".")
}
fc[flevels,] <- fctmp
rownames(fc) <- levels(f)
}
# Convert NA into default contrast matrix
}else if (is.na(fc)){
fc <- contrasts(f)
}else{
stop("Invalid value assigned to factor
\"",fname,"\".")
}
levelcomb[[fname]] <- fc
}
return(levelcomb)
}
# makeT() creates a transformation matrix, used to define the
Linear Hypothesis
# matrix (for between-subjects factors) or the response
transformation matrix
# (for within-subjects factors), depending on the combinations
of factor levels.
# The transformation matrix is created progressively, by
"translating" the combinations
# of each factor into matrices that are sequentially
multiplied.
makeT <- function(fframe,fnames,levelcomb){
# First matrix, defined from the identic rows of the
between/within-subjects
# model data frame, when unspecified factors are
removed.
# A dummy column with ones is added to the model data
frame, to avoid
# problems when all columns be eventually removed.
# (The name of the dummy column is coerced to be
different from any other one.)
dummyname <- paste("z",max(fnames),sep=".")
fframe[[dummyname]] <- 1
# All factors are interacting, the transformation matrix
(Tm) in the first step is diagonal.
if (length(fnames)==ncol(fframe)-1){
m <- nrow(fframe)
Tm <- diag(m)
}else{
# Remove columns of unspecified factors
fframe <- fframe[,c(fnames,dummyname)]
# Vector with a different value for each
combination of factors
fframe_1 <- apply(fframe,1,paste,collapse="")
n <- nrow(fframe)
# Subset of unique elements
fframe <- unique(fframe)
fframe_1.m <- unique(fframe_1)
m <- nrow(fframe)
# Matrix with coefficients for averaging
identical combinations of factors
# Rows in Mo are the original (repeated)
combinations
To <- matrix(rep(fframe_1,each=m),nrow=m)
# Columns in Mb are the unique combinations
Tu <- matrix(rep(fframe_1.m,n),nrow=m)
Tm <- (To==Tu) * m/n
}
# Number of contrasts to calculate for the current
factor (initialized as 1)
nc <- 1
# Labels for the final transformation matrix (only
defined for multiple contrasts)
Tlabels <- NULL
# Progressive transformation of Tm, factor by factor
for (fname in fnames){
# f is the current factor vector
f <- fframe[[fname]]
# n is the factor vector length
n <- m*nc
# Remove the corresponding column from the model
data frame
fframe[[fname]] <- NULL
# nc is the number of contrasts for the current
factor (updated)
nc <- ncol(levelcomb[[fname]])
if (nc > 1){
# If there are multiple contrasts, the
rows of the model data frame are
# replicated (by the number of
contrasts), and a new column is added to
# assign a contrast to each group of
copied rows.
fframe[[ncol(fframe)+1]] <- 1
fframe <- fframe[rep(1:n,nc),]
fframe[[ncol(fframe)]] <-
rep(1:nc,each=n)
# Moreover, we create labels to identify
what contrast is represented
# by each row of the final
transformation matrix
newTlabels <-
paste(fname,as.character(1:nc),sep="")
if (is.null(Tlabels)){
Tlabels <- newTlabels
}else{
# (In case there is more than
one factor with multiple contrasts)
Tlabels <-
paste(rep(Tlabels,nc), rep(newTlabels,each=length(Tlabels)), sep=":")
}
}
# The same routine as for the first Tm: vector
with combined values
# of the (transformed) model data frame...
fframe_1 <- apply(fframe,1,paste,collapse="")
# ... subset to unique rows
fframe <- unique(fframe)
fframe_1.m <- unique(fframe_1)
m <- nrow(fframe)/nc
# And create transformation matrix, depending on
the combination of factor
# levels defined in levelcomb
kf <- t(levelcomb[[fname]])
kf <- matrix(rep(kf[,f],each=m),ncol=n)
To <-
matrix(rep(matrix(fframe_1,ncol=n,byrow=TRUE),each=m),ncol=n)
Tu <- matrix(rep(fframe_1.m,n),ncol=n)
Tm <- ((To==Tu) * kf) %*% Tm
}
# When the loop is finished, assign labels to final Tm,
and return it
rownames(Tm) <- Tlabels
return(Tm)
}
## END OF AUXILIARY FUNCTIONS
## MAIN PROCEDURE
# 1. Check class of modmlm and intra-subject design from input
arguments
if (missing(aovmlm)){
aovmlm <- Anova(modmlm,...)
}
# 2. Define model data frames:
# Between-subjects model data frame, with contrasts copied from
linear model
bframe <- expand.grid(modmlm$xlevels)
bfactors <- names(bframe)
for (bf in bfactors){
contrasts(bframe[[bf]]) <- modmlm$contrasts[[bf]]
}
# Within-subjects model data frame, with contrasts copied from
intra-subject design
wframe <- aovmlm$idata
wfactors <- names(wframe)
for (wf in wfactors){
if (is.null(attr(wframe[[wf]], "contrasts"))){
contrasts(wframe[[wf]]) <- if
(is.ordered(wframe[[wf]])) aovmlm$icontrasts[2] else
aovmlm$icontrasts[1]
}
}
# 3. Check that interacting factors in levelcomb are included in
both the
# between-subjects and within-subject designs
ifactors <- names(levelcomb)
iterm <- paste(ifactors,collapse=":")
itermsort <- paste(sort(ifactors),collapse=":")
anovaterms <-
lapply(strsplit(aovmlm$terms,":"),function(x){paste(sort(x),collapse=":")})
if (all(is.na(match(anovaterms,itermsort)))){
warning("The term \"", iterm, "\" is not in the model
design.")
}
# 4. Search factors of levelcomb in bframe and wframe, and
convert the level
# combinations into numeric matrix format
bfactor.interact <- match(ifactors,bfactors)
bfactors <-
bfactors[bfactor.interact[!is.na(bfactor.interact)]]
levelcomb <- checkfactors(bframe,bfactors,levelcomb)
wfactor.interact <- match(ifactors,wfactors)
wfactors <-
wfactors[wfactor.interact[!is.na(wfactor.interact)]]
levelcomb <- checkfactors(wframe,wfactors,levelcomb)
# 5. Preliminary definition of the Linear Hypothesis matrix (L)
rhf <- paste(as.character(formula(modmlm))[c(1,3)],collapse="
")
L <- model.matrix(as.formula(rhf), data=bframe)
# 6. Transformed Linear Hypothesis (L) and response
transformation (P) matrices
if (length(bfactors)){
L <- makeT(bframe,bfactors,levelcomb) %*% L
}else{
L <- colSums(L)/nrow(L)
}
if (length(wfactors)){
P <- t(makeT(wframe,wfactors,levelcomb))
}else{
P <- matrix(rep(1/nrow(wframe),nrow(wframe)))
}
# 7. Result, consisting in:
# levelcomb: numeric matrix values of levelcomb
# lsm: table of least square means for the tested
interactions
# test: test value, from LinearHypothesis
result <- list(levelcomb=levelcomb,lsm=NULL,test=NULL)
result$lsm <- L %*% modmlm$coefficients %*% P
result$test <- linearHypothesis(modmlm,L,P=P)
return(result)
}
INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org
Antes de imprimir este e-mail piense bien si es necesario hacerlo.
En cumplimiento de la Ley Orgánica 15/1999 reguladora de la Protección
de Datos de Carácter Personal, le informamos de que el presente mensaje
contiene información confidencial, siendo para uso exclusivo del
destinatario arriba indicado. En caso de no ser usted el destinatario
del mismo le informamos que su recepción no le autoriza a su divulgación
o reproducción por cualquier medio, debiendo destruirlo de inmediato,
rogándole lo notifique al remitente.
More information about the R-help
mailing list