[R] R vs. Excel (R-squared)

Petr Pikal petr.pikal at precheza.cz
Wed Jan 25 07:50:44 CET 2006


Hi

In a model without an intercept, the reported R-squared tends to be high.

See e.g. Julian J. Faraway - Practical regression ....

Warning: R^2 as defined here doesn’t make any sense if you do not have
an intercept in your model. This is because the denominator in the
definition of R^2 has a null model with an intercept in mind when the
sum of squares is calculated. Alternative definitions of R^2 are
possible when there is no intercept but the same graphical intuition
is not available and the R^2’s obtained should not be compared to
those for models with an intercept. ***Beware of high R^2’s reported
from models without an intercept***.
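
To make the point concrete, here is a minimal sketch with simulated data (the vectors x and y and their parameters are made up for illustration; they are not the data from the original post). It computes R-squared for a no-intercept fit in two ways: against a null model of y = 0, which is what summary.lm reports when the intercept is dropped, and against an intercept-only null model. The second is possibly closer to what Excel's trendline shows, but that is an assumption on my part.

```r
# Simulated data: an assumption for illustration only
set.seed(1)
x <- rnorm(100, mean = 10000, sd = 500)   # plays the role of "predicted"
y <- x + rnorm(100, sd = 1400)            # plays the role of "experimental"

fit0 <- lm(y ~ x - 1)                     # regression through the origin
rss  <- sum(resid(fit0)^2)                # residual sum of squares

# Null model y = 0: total sum of squares is NOT centered on mean(y)
r2_uncentered <- 1 - rss / sum(y^2)

# Null model y = mean(y): the usual centered total sum of squares
r2_centered <- 1 - rss / sum((y - mean(y))^2)

# The uncentered value matches what summary() reports for this fit
print(all.equal(r2_uncentered, summary(fit0)$r.squared))
print(c(uncentered = r2_uncentered, centered = r2_centered))
```

Since sum(y^2) is never smaller than sum((y - mean(y))^2), the uncentered value is always at least as large as the centered one, and the centered one can go negative when the line through the origin fits worse than a horizontal line at mean(y).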

HTH
Petr




On 24 Jan 2006 at 11:50, Lance Westerhoff wrote:

To:             	r-help at stat.math.ethz.ch
From:           	Lance Westerhoff <lance at quantumbioinc.com>
Date sent:      	Tue, 24 Jan 2006 11:50:43 -0500
Subject:        	[R] R vs. Excel (R-squared)

> Hello All-
> 
> I found an inconsistency between the R-squared reported in Excel vs. 
> that in R, and I am wondering which (if any) may be correct and if 
> this is a known issue.  While it certainly wouldn't surprise me if 
> Excel is just flat out wrong, I just want to make sure since the
> R-squared reported in R seems surprisingly high.  Please let me know if 
> this is the wrong list.  Thanks!
> 
> To begin, I have a set of data points in which the y is the  
> experimental number and x is the predicted value.  The Excel- 
> generated graph (complete with R^2 and trend line) is provided at 
> this link if you want to take a look:
> 
> http://www.quantumbioinc.com/downloads/public/excel.png
> 
> As you can see, the R-squared that is reported by Excel is -0.1005.  
> Now when I bring the same data into R, I get an R-square of +0.9331 
> (see below).  Being that I am new to R and semi-new to stats, is 
> there a difference between "multiple R-squared" and R-squared that 
> perhaps I am simply interpreting this wrong, or is this a known 
> inconsistency between the two applications?  If so, which is  correct?
>  Any insight would be greatly appreciated!
> 
> 
> ======================
> 
>  > # note: a is experimental and c is predicted
>  > summary(lm(a~c-1))
> 
> Call:
> lm(formula = a ~ c - 1)
> 
> Residuals:
>      Min      1Q  Median      3Q     Max
> -2987.6 -1126.6  -181.7   855.3  5602.8
> 
> Coefficients:
>    Estimate Std. Error t value Pr(>|t|)
> c  0.99999    0.01402   71.33   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Residual standard error: 1423 on 365 degrees of freedom
> Multiple R-Squared: 0.9331,	Adjusted R-squared: 0.9329
> F-statistic:  5088 on 1 and 365 DF,  p-value: < 2.2e-16
> 
>  > version
>           _
> platform powerpc-apple-darwin7.9.0
> arch     powerpc
> os       darwin7.9.0
> system   powerpc, darwin7.9.0
> status
> major    2
> minor    2.1
> year     2005
> month    12
> day      20
> svn rev  36812
> language R
> 
> ======================
> 
> 
> Thank you very much for your time!
> 
> -Lance
> ____________________
> Lance M. Westerhoff, Ph.D.
> General Manager
> QuantumBio Inc.
> 
> WWW:    http://www.quantumbioinc.com
> Email:    lance at quantumbioinc.com
> 
> 
> "Safety is not the most important thing. I know this sounds like
> heresy, but it is a truth that must be embraced in order to do
> exploration. The most important thing is to actually go."  ~ James
> Cameron
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Petr Pikal
petr.pikal at precheza.cz



