[R] ANCOVA for linear regressions without intercept

Steven McKinney smckinney at bccrc.ca
Tue Apr 5 04:38:08 CEST 2011


Hi Yusuke,

Does the following get what you are after?

### Make some test data.
> set.seed(123)
> edf <- data.frame(sex = c(rep("Male", 10), rep("Female", 10), rep("Unknown", 10)),
+                   head_length = c(1.2 * c(170:179 + rnorm(10)), 0.8 * c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))/10,
+                   body_length = c(c(170:179 + rnorm(10)), c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))
+                   )
> edf$sex <- factor(as.character(edf$sex))
> plot(edf$head_length, edf$body_length, pch = as.numeric(edf$sex), col = as.numeric(edf$sex), xlim = c(0, 25), ylim = c(0, 190))
> lmf <- lm(body_length ~ head_length * sex, data = edf)


### The full model - do keep an eye on those intercepts and try to ensure they are not far from 0.
> summary(lmf)

Call:
lm(formula = body_length ~ head_length * sex, data = edf)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.73783 -0.68133  0.02147  0.50858  2.38931 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -3.578     25.425  -0.141   0.8893    
head_length              12.772      2.054   6.218    2e-06 ***
sexMale                  15.122     37.464   0.404   0.6901    
sexUnknown               40.308     33.137   1.216   0.2357    
head_length:sexMale      -4.977      2.438  -2.042   0.0523 .  
head_length:sexUnknown   -4.971      2.428  -2.047   0.0517 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.384 on 24 degrees of freedom
Multiple R-squared: 0.9802,	Adjusted R-squared: 0.9761 
F-statistic: 237.7 on 5 and 24 DF,  p-value: < 2.2e-16 

### Now suppress intercepts.  head_length:sex should give interactions (slopes) only.
> lmrf <- lm(body_length ~ -1 + head_length : sex, data = edf)
> summary(lmrf)

Call:
lm(formula = body_length ~ -1 + head_length:sex, data = edf)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.02782 -0.61861 -0.01079  0.68785  2.57544 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
head_length:sexFemale  12.48253    0.03549   351.8   <2e-16 ***
head_length:sexMale     8.34500    0.02097   398.0   <2e-16 ***
head_length:sexUnknown 10.03844    0.02677   375.0   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.389 on 27 degrees of freedom
Multiple R-squared: 0.9999,	Adjusted R-squared: 0.9999 
F-statistic: 1.409e+05 on 3 and 27 DF,  p-value: < 2.2e-16 

### Check the numeric coding of the factor
> with(edf, table(sex, as.numeric(sex)))
         
sex        1  2  3
  Female  10  0  0
  Male     0 10  0
  Unknown  0  0 10

> abline(a = 0, b = coef(lmrf)[1], col = 1) ## Females = Black
> abline(a = 0, b = coef(lmrf)[2], col = 2) ## Males = Red
> abline(a = 0, b = coef(lmrf)[3], col = 3) ## Unknown = Green

### If no diff between males and females, then males and females can be combined into one group.
> edf$MvF <- as.character(edf$sex)
> edf$MvF[edf$MvF != "Unknown"] <- "MorF"
> edf$MvF <- factor(edf$MvF)
> with(edf, table(MvF, sex))
         sex
MvF       Female Male Unknown
  MorF        10   10       0
  Unknown      0    0      10
> lmr1f  <- lm(body_length ~ -1 + head_length : MvF, data = edf)
> summary(lmr1f)

Call:
lm(formula = body_length ~ -1 + head_length:MvF, data = edf)

Residuals:
    Min      1Q  Median      3Q     Max 
-23.976 -21.656   0.077  35.899  39.839 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
head_length:MvFMorF      9.4156     0.3429   27.46   <2e-16 ***
head_length:MvFUnknown  10.0384     0.5085   19.74   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 26.39 on 28 degrees of freedom
Multiple R-squared: 0.9761,	Adjusted R-squared: 0.9744 
F-statistic: 571.9 on 2 and 28 DF,  p-value: < 2.2e-16 

### Test the hypothesis that male and female heights are equivalent
> anova(lmr1f, lmrf)
Analysis of Variance Table

Model 1: body_length ~ -1 + head_length:MvF
Model 2: body_length ~ -1 + head_length:sex
  Res.Df     RSS Df Sum of Sq     F    Pr(>F)    
1     28 19496.1                                 
2     27    52.1  1     19444 10077 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

### Plot the reduced model regression lines
> abline(a = 0, b = coef(lmr1f)[1], col = "blue", lty = 2)
> abline(a = 0, b = coef(lmr1f)[2], col = "orange", lty = 2, lwd = 4)
> 

The other two tests can be set up and run similarly.  Don't
forget to adjust for multiple comparisons...

HTH

Steve


Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre


________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Yusuke Fukuda [Yusuke.Fukuda at nt.gov.au]
Sent: April 4, 2011 5:45 PM
To: 'Peter Ehlers'
Cc: r-help at r-project.org; 'Bert Gunter'
Subject: Re: [R] ANCOVA for linear regressions without intercept

Thank you for your suggestions, stats experts. Much appreciated.

I still haven't got what I wanted but someone suggested looking into contrasts and this is looking worth trying http://finzi.psych.upenn.edu/R/library/gmodels/html/fit.contrast.html

Regards,

Yusuke

-----Original Message-----
From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
Sent: Saturday, 2 April 2011 1:35 AM
To: Yusuke Fukuda
Cc: 'Bert Gunter'; r-help at r-project.org
Subject: Re: [R] ANCOVA for linear regressions without intercept

See inline.

On 2011-03-31 22:22, Yusuke Fukuda wrote:
> Thanks Bert.
>
> I have read "?formula" again and again, and I'm still struggling;
>
>> lm(body_length ~ head_length-1)
>
> This removes intercept from each individual regression (for male, female, unknown).
>
> When they are taken together,
>
>> lm(body_length ~ sex*head_length)
>
> This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts).
>
> If I put
>
>> lm(body_length ~ sex:head_length-1)
>
> This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions.

You probably want:

  lm(body_length ~ head_length + sex:head_length-1)

or, in short form:

  lm(body_length ~ head_length/sex-1)

You might then compare the model 'without intercepts'
(i.e. with intercepts forced to zero) with a model that
includes intercepts. If the intercepts turn out to
be significantly nonzero, what will you do?

Peter Ehlers

>
> I also tried
>
>> lm(body_length ~ sex*head_length-1)
>> lm(body_length ~ sex*head_length-sex-1)
>
> But none of them worked.
>
> Would anyone be able to help me? All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not.
>
> Thanks for your help.
>
>
>
> ________________________________________
> From: Bert Gunter [mailto:gunter.berton at gene.com]
> Sent: Friday, 1 April 2011 12:56 AM
> To: Yusuke Fukuda
> Cc: r-help at r-project.org
> Subject: Re: [R] ANCOVA for linear regressions without intercept
>
> If you haven't already received an answer, a careful reading of
>
> ?formula
>
> will provide it.
>
> -- Bert
> On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda<Yusuke.Fukuda at nt.gov.au>  wrote:
>
> Hello R experts
>
> I have two linear regressions for sexes (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept).
>
> Is it possible to do ANCOVA for these regressions without intercepts? When I do
>
> summary(lm(body length ~ sex*head length))
>
> this will include the intercepts as below
>
>
> Coefficients:
>                        Estimate Std. Error t value Pr(>|t|)
> (Intercept)            -6.49697    1.68497  -3.856 0.000118 ***
> sexMale                -9.39340    1.97760  -4.750 2.14e-06 ***
> sexUnknown             -1.33791    2.35453  -0.568 0.569927
> head_length             7.12307    0.05503 129.443<  2e-16 ***
> sexMale:head_length     0.31631    0.06246   5.064 4.37e-07 ***
> sexUnknown:head_length  0.19937    0.07022   2.839 0.004556 **
> ---
>
> Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account?
>
> Thanks for help in advance.
>
> Yusuke Fukuda
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list