[R] lm without intercept
Joshua Wiley
jwiley.psych at gmail.com
Sun Jul 29 06:52:24 CEST 2012
Hi,
R actually uses a different formula for calculating R-squared
depending on whether or not the model includes an intercept.
You may also find this discussion helpful:
http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm/
If you conceptualize R^2 as the squared correlation between the
observed and fitted values, it is easy to get:
summary(m0 <- lm(mpg ~ 0 + disp, data = mtcars))  # no intercept
summary(m1 <- lm(mpg ~ disp, data = mtcars))      # with intercept
cor(mtcars$mpg, fitted(m0))^2  # squared correlation, no-intercept fit
cor(mtcars$mpg, fitted(m1))^2  # squared correlation, with-intercept fit
but that is not how R calculates R^2.
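What summary.lm does instead (per its documentation) is compute R^2 = 1 - RSS/TSS, where the total sum of squares is centered around the mean of y when the model has an intercept, and uncentered (sum of y^2) when it does not. A short sketch reproducing R's reported values from that formula, using the same mtcars models as above:

```r
# Reproduce summary.lm's R^2 by hand: centered vs. uncentered
# total sum of squares, depending on the intercept.
m0 <- lm(mpg ~ 0 + disp, data = mtcars)  # no intercept
m1 <- lm(mpg ~ disp, data = mtcars)      # with intercept
y  <- mtcars$mpg

rss0 <- sum(residuals(m0)^2)
rss1 <- sum(residuals(m1)^2)

r2_no_int <- 1 - rss0 / sum(y^2)             # uncentered TSS
r2_int    <- 1 - rss1 / sum((y - mean(y))^2) # centered TSS

all.equal(r2_no_int, summary(m0)$r.squared)
all.equal(r2_int,    summary(m1)$r.squared)
```

Because the uncentered total sum of squares is much larger than the centered one whenever mean(y) is far from zero, the no-intercept R^2 can look spectacular even when the fit is visibly worse, which is exactly what the posters below observed.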
Cheers,
Josh
On Sat, Jul 28, 2012 at 10:40 AM, citynorman <citynorman at hotmail.com> wrote:
> I've just picked up R (been using Matlab, Eviews etc) and I'm having the same
> issue. Running reg=lm(ticker1~ticker2) gives R^2=50% while running
> reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is
> worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with
> intercept=0. How come R calculates a totally different value?!
>
> Call:
> lm(formula = ticker1 ~ ticker2)
>
> Residuals:
> Min 1Q Median 3Q Max
> -0.22441 -0.03380 0.01099 0.04891 0.16688
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 1.57062 0.08187 19.18 <2e-16 ***
> ticker2 0.61722 0.02699 22.87 <2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.07754 on 530 degrees of freedom
> Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958
> F-statistic: 523.1 on 1 and 530 DF, p-value: < 2.2e-16
>
> Call:
> lm(formula = ticker1 ~ 0 + ticker2)
>
> Residuals:
> Min 1Q Median 3Q Max
> -0.270785 -0.069280 -0.007945 0.087340 0.268786
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> ticker2 1.134508 0.001441 787.2 <2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.1008 on 531 degrees of freedom
> Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991
> F-statistic: 6.197e+05 on 1 and 531 DF, p-value: < 2.2e-16
>
>
> Jan private wrote
>>
>> Hi,
>>
>> thanks for your help. I'm beginning to understand things better.
>>
>>> If you plotted your data, you would realize that whether you fit the
>>> 'best' least squares model or one with a zero intercept, the fit is
>>> not going to be very good
>>> Do the data cluster tightly around the dashed line?
>> No, and that is why I asked the question. The plotted fit doesn't look
>> any better with or without intercept, so I was surprised that the
>> R-value etc. indicated an excellent regression (which I now understood
>> is the wrong interpretation).
>>
>> One of the references you googled suggests that intercepts should never
>> be omitted. Is this true even if I know that the physical reality behind
>> the numbers suggests an intercept of zero?
>>
>> Thanks,
>> Jan
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/