[R] test the significances of two regression lines
(Ted Harding)
ted.harding at nessie.mcc.ac.uk
Mon Aug 6 13:16:09 CEST 2007
On 06-Aug-07 10:32:50, Luis Ridao Cruz wrote:
> R-help,
>
> I'm trying to test the significance of two regression lines,
> i.e. the significance of the slopes from two samples
> originating from the same population.
>
> Is it correct if I fit a linear model for each sample and
> then test the slope significance with 'anova'? Something like this:
>
> lm1 <- lm(Y~ a1 + b1*X) # sample 1
> lm2 <- lm(Y~ a2 + b2*X) # sample 2
>
> anova(lm1, lm2)
No, this will not work. From "?anova":
Warning:
The comparison between two or more models will only be valid if
they are fitted to the same dataset.
which is not the case in your example. One way to proceed is to
merge the two datasets, and introduce a factor which identifies
the dataset. For example:
x1<-rnorm(100) ; x2<-rnorm(100)
y1 <- 0.2 + 0.1*x1 + 0.05*rnorm(100)
y2 <- 0.2 + 0.12*x2 + 0.05*rnorm(100)
x <- c(x1,x2)
y <- c(y1,y2)
S <- factor(c(rep(0,100),rep(1,100)))
lm12 <- lm(y ~ x*S)
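In case the formula notation is unfamiliar: y ~ x*S is shorthand for y ~ x + S + x:S, i.e. a separate intercept and a separate slope for each dataset. A small sketch with made-up toy values (not the simulated data above) shows the columns R builds for such a formula:

```r
# Toy values, hypothetical and for illustration only
x <- c(1, 2, 3, 4)
S <- factor(c(0, 0, 1, 1))

# Design matrix implied by the right-hand side x*S
mm <- model.matrix(~ x * S)
colnames(mm)
mm
```

The "x:S1" column equals x for rows from the second dataset and 0 otherwise, so its coefficient is the amount by which the second slope differs from the first.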
First look at the fit of y1~x1:
summary(lm(y1~x1))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.206042   0.004647   44.34   <2e-16 ***
x1          0.091382   0.004768   19.16   <2e-16 ***
Then the fit of y2~x2:
summary(lm(y2~x2))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.208216   0.005171   40.26   <2e-16 ***
x2          0.118840   0.005009   23.73   <2e-16 ***
so the estimated slopes differ by 0.118840 - 0.091382 = 0.027458.
But what is the "significance" of this difference?
Now:
summary(lm12)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.206042   0.004923  41.852  < 2e-16 ***
x           0.091382   0.005052  18.088  < 2e-16 ***
S1          0.002174   0.006953   0.313 0.754926
x:S1        0.027457   0.006939   3.957 0.000106 ***
so the "x:S1" value is the same (up to rounding) as the difference
in slopes estimated from the two separate fits; but now we have a
standard error and a P-value for it. You can also use anova now:
anova(lm12)
Response: y
           Df  Sum Sq Mean Sq  F value    Pr(>F)
x           1 2.26537 2.26537 946.2702 < 2.2e-16 ***
S           1 0.00015 0.00015   0.0614 0.8045253
x:S         1 0.03749 0.03749  15.6599 0.0001060 ***
Residuals 196 0.46922 0.00239
so you get the same P-value, though with anova() you do not see
the actual estimate of the difference between the slopes.
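As a rough sketch of how these pieces fit together: the slope difference and its standard error can be pulled out of the coefficient table, and the same test can be obtained by comparing two nested models fitted to the same merged data, which is the situation the two-model form of anova() is meant for. The seed and the names dat, fit_common and fit_separate are my own choices for illustration, so the numbers will not match the output shown above.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility

x1 <- rnorm(100); x2 <- rnorm(100)
y1 <- 0.2 + 0.10 * x1 + 0.05 * rnorm(100)
y2 <- 0.2 + 0.12 * x2 + 0.05 * rnorm(100)

# Merged data with a factor identifying the dataset
dat <- data.frame(x = c(x1, x2), y = c(y1, y2),
                  S = factor(rep(c(0, 1), each = 100)))

fit_common   <- lm(y ~ x + S, data = dat)  # one slope for both datasets
fit_separate <- lm(y ~ x * S, data = dat)  # a slope for each dataset

# Estimate, standard error, t value and P-value for the slope difference
slope_diff <- coef(summary(fit_separate))["x:S1", ]
print(slope_diff)

# Both models are fitted to the same data, so this comparison is valid
cmp <- anova(fit_common, fit_separate)
print(cmp)

# With one degree of freedom, F is the square of the t value
all.equal(unname(slope_diff["t value"])^2, cmp$F[2])
```

This is also why the P-values from summary() and anova() agree in the output above: 3.957 squared is about 15.66, the F value for x:S.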
Hoping this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Aug-07 Time: 12:16:06
------------------------------ XFMail ------------------------------