# [R] Analyzing an unbalanced AB/BA cross-over design

John Fox jfox at mcmaster.ca
Wed Jan 29 13:53:05 CET 2003

```Dear Martin,

As Peter Dalgaard has pointed out, the model you fit has redundant
parameters, as you can see from its summary (or from the coefficients):

> summary(example.lm)

. . .

Coefficients: (13 not defined because of singularities)

. . .

> coef(example.lm)
(Intercept)          treatmentS             period2
sequence2  sequence1:subject2
305.35714           -46.60714            15.89286
-135.00000                  NA
sequence2:subject2  sequence1:subject3  sequence2:subject3
sequence1:subject4  sequence2:subject4
222.50000                  NA           200.00000
-5.00000                  NA
sequence1:subject5  sequence2:subject5  sequence1:subject6
sequence2:subject6  sequence1:subject7
NA           240.00000            45.00000
NA           110.00000
sequence2:subject7  sequence1:subject9  sequence2:subject9
sequence1:subject10 sequence2:subject10
NA                  NA           150.00000
-60.00000                  NA
sequence1:subject11 sequence2:subject11 sequence1:subject12
sequence2:subject12 sequence1:subject13
75.00000                  NA                  NA
145.00000                  NA
sequence2:subject13 sequence1:subject14 sequence2:subject14
NA            57.50000                  NA

The computational procedure used by Anova (in the car package) assumes a
full-rank parametrization of the model. The difference between the Anova
methods for lm and glm is that the former uses the coefficient estimates
and their covariance matrix to compute sums of squares while the latter
refits the model. You got a result from SAS because it works with a
deficient-rank parametrization. Finally (and more generally), if you really
want "type-III" sums of squares in R, be careful with the kind of
contrast-coding that you use. The default "treatment" contrasts won't give
you what you want.

I'm sorry that you encountered this problem.
John

At 11:00 AM 1/29/2003 +0100, martin wolfsegger wrote:
>I am looking for help to analyze an unbalanced AB/BA cross-over design by
>requesting the type III SS !
>
># Example 3.1 from S. Senn (1993). Cross-over Trials in Clinical
>Research
>outcome<-c(310,310,370,410,250,380,330,270,260,300,390,210,350,365,370,310,380,290,260,90,385,400,410,320,340,220)
>subject<-as.factor(c(1,4,6,7,10,11,14,1,4,6,7,10,11,14,2,3,5,9,12,13,2,3,5,9,12,13))
>period<-as.factor(c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2))
>treatment<-as.factor(c('F','F','F','F','F','F','F','S','S','S','S','S','S','S','S','S','S','S','S','S','F','F','F','F','F','F'))
>sequence<-as.factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2))
>example<-data.frame(outcome,subject,period,treatment,sequence)
>
>The recommended SAS code equals
>
>PROC GLM DATA=example;
>CLASS subject period treatment sequence;
>MODEL outcome = treatment sequence period subject(sequence);
>RANDOM subject(sequence);
>RUN;
>
>For PROC GLM, the random effects are treated in a post hoc fashion after the
>complete fixed effect model is fit. This distinction affects other features
>in the GLM procedure, such as the results of the LSMEANS and ESTIMATE
>statements. Looking only on treatment, period, sequence and subject
>effects, the
>random statement can be omitted.
>
>The R code for type I SS equals
>
>example.lm<-lm(outcome~treatment+period+sequence+subject%in%sequence,
>data=example)
>anova(example.lm)
>
>Response: outcome
>                  Df Sum Sq Mean Sq F value    Pr(>F)
>treatment         1  13388   13388 17.8416  0.001427 **
>period            1   1632    1632  2.1749  0.168314
>sequence          1    335     335  0.4467  0.517697
>sequence:subject 11 114878   10443 13.9171 6.495e-05 ***
>Residuals        11   8254     750
>
>According to the unbalanced design, I requested the type III SS which
>resulted in an error statement
>
>library(car)
>Anova(example.lm, type="III")
>
>Error in linear.hypothesis.lm(mod, hyp.matrix, summary.model = sumry,  :
>         One or more terms aliased in model.
>
>
>by using glm I got results with 0 df for the sequence effect !!!!
>
>example.glm<-glm(outcome~treatment+period+sequence+subject%in%sequence,
>data=example, family=gaussian)
>library(car)
>Anova(example.glm,type="III",test.statistic="F")
>
>Anova Table (Type III tests)
>
>Response: outcome
>                      SS Df       F    Pr(>F)
>treatment         14036  1 18.7044  0.001205 **
>period             1632  1  2.1749  0.168314
>sequence              0  0
>sequence:subject 114878 11 13.9171 6.495e-05 ***
>Residuals          8254 11
>
>
>The questions based on this output are
>
>1) Why was there an error statement requesting type III SS based on lm ?
>2) Why I got a result by using glm with 0 df for the period effect ?
>3) How can I get the estimate, the StdError for the constrast (-1,1) of the
>treatment effect ?
>
>
>Thanks
>
>--
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>http://www.stat.math.ethz.ch/mailman/listinfo/r-help

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University