[R] How to fit a linear model without intercept
John Sorkin
jsorkin at grecc.umaryland.edu
Thu Aug 23 15:28:50 CEST 2007
Michael,
Assuming you want a model with an intercept of zero, I think we need to ask why you want that. When a "normal" regression indicates a non-zero intercept, forcing the regression line to have a zero intercept changes the meaning of the regression coefficients: the slope is no longer the quantity you would get from the ordinary fit.
If for some reason you want a zero intercept but do not want to change the meaning of the regression coefficients, i.e. you still want the ordinary least-squares fit (the BLUE, or Best Linear Unbiased Estimator, under the usual assumptions), you can center your dependent and independent variables and re-run the regression. Centering means subtracting the mean of each variable from that variable before performing the regression. When you do this, the intercept term will be zero (or, more likely, a tiny number that differs from zero only because of the limited precision of computer arithmetic) and the slope term will be the same as the one you obtained from the "normal" regression. What you are actually doing is shifting your data so that it is centered on x=0, y=0, i.e. the means of the x and y values are both zero.
I am not sure this is what you want to do, but I am pasting below some R code that lets you see the effect that forcing the intercept to be zero has on the slope, and how centering the data yields a zero intercept without changing the slope.
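For readers who want the algebra behind this, here is a short sketch using the standard least-squares formulas. The notation (hats for the ordinary fit, a tilde for the zero-intercept fit, a superscript c for centered variables) is introduced here for illustration and does not appear in the R code below.

```latex
% Ordinary least squares with an intercept:
\hat\beta_1 = \frac{\sum_i (x_i-\bar x)(y_i-\bar y)}{\sum_i (x_i-\bar x)^2},
\qquad
\hat\beta_0 = \bar y - \hat\beta_1 \bar x .

% Centered variables x_i^c = x_i - \bar x and y_i^c = y_i - \bar y
% both have mean zero, so the same formulas give
\hat\beta_1^{c} = \frac{\sum_i x_i^c\, y_i^c}{\sum_i (x_i^c)^2} = \hat\beta_1,
\qquad
\hat\beta_0^{c} = 0 - \hat\beta_1^{c}\cdot 0 = 0 .

% Forcing a zero intercept on the uncentered data instead gives
\tilde\beta_1 = \frac{\sum_i x_i y_i}{\sum_i x_i^2},
% which equals \hat\beta_1 only when \bar x = 0 or \hat\beta_0 = 0.
```

So centering leaves the slope untouched and makes the intercept vanish, whereas dropping the intercept on the raw data generally changes the slope.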
John
oldpar<-par(ask=TRUE)
# Set up x and y values. Note as defined the slope of the
# regression should be close to one (save for the "noise"
# added to the y values) and the intercept should be close to
# four.
x<-0:10
y<-x+4+rnorm(11,0,1)
plot(x,y)
title("Original data")
# Fit a "normal" regression line to the data and display
# the regression line on the scatter plot
fitNormalReg<-lm(y~x)
abline(fitNormalReg)
# Fit a regression line in which the intercept has been
# forced to be zero and display the line on the scatter
# plot.
fitZeroInt<-lm(y~-1+x)
abline(fitZeroInt,lty=2)
# Compare fits.
summary(fitNormalReg)
summary(fitZeroInt)
# Compare the two models formally. With these data the
# model with an intercept, the "normal" regression, is
# expected to be the significantly better fit.
anova(fitZeroInt,fitNormalReg)
# Center y and x by subtracting their means.
yCentered<-y-mean(y)
xCentered<-x-mean(x)
# Regress the centered y values on the centered x values. This
# will give us a model with an intercept that is very, very
# small. It would be zero save for the precision limits
# inherent in using a computer. Plot the line. Notice the
# slope of the centered regression is the same as that
# obtained from the normal regression.
fitCentered<-lm(yCentered~xCentered)
abline(fitCentered,lty=10)
# Compare the three regressions. Note the slope from the
# "normal" regression and centered regressions are the same.
# The intercept from the centered regression is very, very small
# and would be zero save for the limits of computer mathematics.
summary(fitNormalReg)
summary(fitZeroInt)
summary(fitCentered)
# Plot the centered data and show that the line goes through zero.
plot(xCentered,yCentered)
abline(fitCentered)
title("Centered data")
# Restore the original graphics settings.
par(oldpar)
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu
>>> "David Barron" <mothsailor at googlemail.com> 08/23/07 5:38 AM >>>
A number of alternatives, such as:
lm(y ~ 0 + x)
lm(y ~ x -1)
See ?formula
On 8/23/07, Michal Kneifl <xkneifl at mendelu.cz> wrote:
> Please could anyone help me?
> How can I fit a linear model where an intercept has no sense?
> Thanks in advance..
>
> Michael
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
=================================
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP