[R] question about linear models.
(Ted Harding)
Ted.Harding at nessie.mcc.ac.uk
Mon Apr 19 21:28:31 CEST 2004
This would make a good exam question!
First, look at the distribution of levels:
       B=0   B=1   B=2   B=3
A=0     6    --    --    --
A=1    --     4     3     2
A=2    --     2     3     4
And then look at the mean values within combinations of levels:
       B=0    B=1    B=2    B=3
A=0   1.15    --     --     --   | 1.15
A=1    --    1.81   1.85   1.52  | 1.76
A=2    --    2.31   2.52   2.13  | 2.30
---------------------------------+------
      1.15   1.98   2.18   1.92  | 1.81
(Residual SE after fitting A+B = 0.38)
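
For anyone wanting to reproduce the two tables and the residual SE, here is a minimal sketch. It assumes the data are entered as in the quoted message below and that A and B are stored as factors (the original post does not show how S1 was built). It uses lm() rather than aov(), which gives the same fit.

```r
# Data as posted in the quoted message; A and B are ASSUMED to be factors.
S1 <- data.frame(
  samples = c(1.3398553, 0.8455924, 1.0290893, 1.2720512, 1.2071754, 1.1859539,
              2.7399659, 1.2476911, 2.6389479, 1.6914068, 2.2260561, 1.2955187,
              1.6526140, 2.3159151, 2.3905009, 2.9520105, 1.9478868, 1.9936118,
              1.3775338, 1.9638190, 1.4697860, 2.2028858, 2.4024771, 1.9935864),
  A = factor(c(0,0,0,0,0,0, 2,2,2,1,2,1, 1,2,1,2,1,1, 1,2,1,2,2,1)),
  B = factor(c(0,0,0,0,0,0, 3,3,2,2,1,1, 3,3,2,2,1,1, 3,2,2,3,1,1))
)

# Number of observations in each (A, B) cell
counts <- with(S1, table(A, B))
print(counts)

# Mean response in each cell (NA where a combination does not occur)
cell_means <- with(S1, tapply(samples, list(A = A, B = B), mean))
print(round(cell_means, 2))

# Marginal means by A and by B
mean_A <- with(S1, tapply(samples, A, mean))
mean_B <- with(S1, tapply(samples, B, mean))
print(round(mean_A, 2))
print(round(mean_B, 2))

# Residual standard error after fitting A + B
fit1 <- lm(samples ~ A + B, data = S1)
rse <- sqrt(deviance(fit1) / df.residual(fit1))
print(round(rse, 2))
```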
First, it is clear that (A=0) vs (A>0) is exactly associated
with (B=0) vs (B>0). Therefore any difference between means
for (A=0) vs (A>0) is fully confounded with (B=0) vs (B>0).
Clearly (from table of means) there *is* a difference here
(significant as it turns out), so fitting A alone will give
a significant result as will fitting B alone.
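
The confounding can be checked directly. The sketch below again assumes the posted data with A and B as factors; the names zero_tab and same_split are mine, introduced only for illustration.

```r
# Data as posted in the quoted message; A and B are ASSUMED to be factors.
S1 <- data.frame(
  samples = c(1.3398553, 0.8455924, 1.0290893, 1.2720512, 1.2071754, 1.1859539,
              2.7399659, 1.2476911, 2.6389479, 1.6914068, 2.2260561, 1.2955187,
              1.6526140, 2.3159151, 2.3905009, 2.9520105, 1.9478868, 1.9936118,
              1.3775338, 1.9638190, 1.4697860, 2.2028858, 2.4024771, 1.9935864),
  A = factor(c(0,0,0,0,0,0, 2,2,2,1,2,1, 1,2,1,2,1,1, 1,2,1,2,2,1)),
  B = factor(c(0,0,0,0,0,0, 3,3,2,2,1,1, 3,3,2,2,1,1, 3,2,2,3,1,1))
)

# Cross-tabulate "level is zero" for the two factors:
# the off-diagonal cells are empty, so the two splits coincide.
zero_tab <- with(S1, table(A_is_0 = A == "0", B_is_0 = B == "0"))
print(zero_tab)
same_split <- all((S1$A == "0") == (S1$B == "0"))
print(same_split)

# Each factor fitted on its own
tab_A <- anova(lm(samples ~ A, data = S1))
tab_B <- anova(lm(samples ~ B, data = S1))
print(tab_A)
print(tab_B)
```

Both single-factor tables should agree with anova(fit2) and anova(fit3) in the quoted message.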
Further (table of means), the response increases almost linearly
with A (about 0.6/level), while it does not change much across
(B=1/2/3). So almost all of the variation with respect to B
is accounted for by the difference between (B=0) and (B>0),
which is totally confounded with A. Therefore, once you have
fitted A, fitting B as an additional variate will not change
the fit significantly.
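
A sketch of the "A first, then B" fit, under the same assumptions as before (posted data, A and B as factors, lm() in place of aov()). Because one degree of freedom of B is aliased with A, B should contribute only 2 df here and one coefficient should come out as NA.

```r
# Data as posted in the quoted message; A and B are ASSUMED to be factors.
S1 <- data.frame(
  samples = c(1.3398553, 0.8455924, 1.0290893, 1.2720512, 1.2071754, 1.1859539,
              2.7399659, 1.2476911, 2.6389479, 1.6914068, 2.2260561, 1.2955187,
              1.6526140, 2.3159151, 2.3905009, 2.9520105, 1.9478868, 1.9936118,
              1.3775338, 1.9638190, 1.4697860, 2.2028858, 2.4024771, 1.9935864),
  A = factor(c(0,0,0,0,0,0, 2,2,2,1,2,1, 1,2,1,2,1,1, 1,2,1,2,2,1)),
  B = factor(c(0,0,0,0,0,0, 3,3,2,2,1,1, 3,3,2,2,1,1, 3,2,2,3,1,1))
)

# Sequential (Type I) analysis: A enters first, B is tested after A
fit_AB <- lm(samples ~ A + B, data = S1)
tab_AB <- anova(fit_AB)
print(tab_AB)

# The design is rank-deficient: count the aliased (NA) coefficients
n_aliased <- sum(is.na(coef(fit_AB)))
print(n_aliased)
```

The B line of this table corresponds to the anova(fit1,fit2) comparison in the quoted message.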
However, if you fit B first and then add A, the B fit first
takes out the difference between (B=0) and (B>0), which is
equivalent to (A=0) vs (A>0). From inspection of the table of
means, while there is little difference between
(B=1)/(B=2)/(B=3), there is nevertheless a systematic difference
at each level of B between (A=1) and (A=2): 0.5, 0.67
and 0.61 respectively. This shows up as an effect of A after
fitting B.
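
The reverse order can be sketched the same way (same assumptions: posted data, A and B as factors, lm() in place of aov()). With B entered first, only 1 df is left for A, namely the (A=1) vs (A=2) contrast within the levels B=1/2/3. The name a_diff is mine.

```r
# Data as posted in the quoted message; A and B are ASSUMED to be factors.
S1 <- data.frame(
  samples = c(1.3398553, 0.8455924, 1.0290893, 1.2720512, 1.2071754, 1.1859539,
              2.7399659, 1.2476911, 2.6389479, 1.6914068, 2.2260561, 1.2955187,
              1.6526140, 2.3159151, 2.3905009, 2.9520105, 1.9478868, 1.9936118,
              1.3775338, 1.9638190, 1.4697860, 2.2028858, 2.4024771, 1.9935864),
  A = factor(c(0,0,0,0,0,0, 2,2,2,1,2,1, 1,2,1,2,1,1, 1,2,1,2,2,1)),
  B = factor(c(0,0,0,0,0,0, 3,3,2,2,1,1, 3,3,2,2,1,1, 3,2,2,3,1,1))
)

# Sequential (Type I) analysis: B enters first, A is tested after B
fit_BA <- lm(samples ~ B + A, data = S1)
tab_BA <- anova(fit_BA)
print(tab_BA)

# Difference in cell means, (A=2) minus (A=1), at each of B = 1, 2, 3
m <- with(S1, tapply(samples, list(A, B), mean))
a_diff <- m["2", c("1", "2", "3")] - m["1", c("1", "2", "3")]
print(round(a_diff, 2))
```

The A line of this table corresponds to the anova(fit1,fit3) comparison in the quoted message.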
So, in summary: there is a significant effect of A alone (due
to the roughly constant increase per increment in level), and
there is a significant effect of B alone (due to the contrast
between (B=0) and (B>0), equivalent to the contrast between (A=0)
and (A>0)). However, once the effect of A has been allowed
for, you only have the contrast between levels (B=1)/(B=2)/(B=3)
of B, which do not differ enough to be significant. On the other
hand, fitting B first still leaves a roughly constant effect of A
at each of the levels of B, which shows up as significant for
A after fitting B. You do not have enough data to detect as
significant differences of the size seen between levels B=1/2/3.
Best wishes,
Ted.
==================================================================
On 19-Apr-04 ivan.borozan at utoronto.ca wrote:
> i have the following table with two factors A, B each respectively
> with 3 and 4 levels (unbalanced design)
>
> > S1
>      samples A B
> 1  1.3398553 0 0
> 2  0.8455924 0 0
> 3  1.0290893 0 0
> 4  1.2720512 0 0
> 5  1.2071754 0 0
> 6  1.1859539 0 0
> 7  2.7399659 2 3
> 8  1.2476911 2 3
> 9  2.6389479 2 2
> 10 1.6914068 1 2
> 11 2.2260561 2 1
> 12 1.2955187 1 1
> 13 1.6526140 1 3
> 14 2.3159151 2 3
> 15 2.3905009 1 2
> 16 2.9520105 2 2
> 17 1.9478868 1 1
> 18 1.9936118 1 1
> 19 1.3775338 1 3
> 20 1.9638190 2 2
> 21 1.4697860 1 2
> 22 2.2028858 2 3
> 23 2.4024771 2 1
> 24 1.9935864 1 1
>
>
> i fit two different models
>
> fit1<-aov(samples~A + B,data=S1,contrasts = list(A = contr.treatment,
>            B = contr.treatment))
> fit2<-aov(samples~A,data=S1,contrasts = list(A = contr.treatment))
> fit3<-aov(samples~B,data=S1,contrasts = list(B = contr.treatment))
>
>
> and using
>
>>anova(fit1,fit2)
> Analysis of Variance Table
>
> Model 1: samples ~ A + B
> Model 2: samples ~ A
>   Res.Df     RSS Df Sum of Sq      F Pr(>F)
> 1     19 2.74820
> 2     21 3.14667 -2  -0.39847 1.3774 0.2763
>
> i get B as not significant and
>
>
>>anova(fit1,fit3)
>
> Analysis of Variance Table
>
> Model 1: samples ~ A + B
> Model 2: samples ~ B
>   Res.Df    RSS Df Sum of Sq      F   Pr(>F)
> 1     19 2.7482
> 2     20 4.2391 -1   -1.4909 10.308 0.004604 **
>
> A as significant.
>
>
>
> however if i do
>
>>anova(fit3)
>
> Analysis of Variance Table
>
> Response: samples
>           Df Sum Sq Mean Sq F value   Pr(>F)
> B          3 3.7241  1.2414  5.8567 0.004854 **
> Residuals 20 4.2391  0.2120
>
>
> i get B as significant and
>
>>anova(fit2)
>
> Analysis of Variance Table
>
> Response: samples
>           Df Sum Sq Mean Sq F value    Pr(>F)
> A          2 4.8165  2.4083  16.072 5.835e-05 ***
> Residuals 21 3.1467  0.1498
>
> A as significant.
>
>
>
>
> Should i conclude that A is significant and B is not or rather that
> both factors
> are significant ?
>
>
> all the best
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 19-Apr-04 Time: 20:28:31
------------------------------ XFMail ------------------------------