[R] outlier

Spencer Graves spencer.graves at pdf.com
Tue Jun 17 19:08:39 CEST 2003


	  It is also wise to make scatterplots, as shown by Anscombe's famous 
example of 4 scatterplots with the same R^2, where the first shows the 
standard elliptical pattern implied by the assumptions while the other 
three indicate very clearly that the assumptions are incorrect.  See 
Anscombe (1973) "Graphs in Statistical Analysis", The American 
Statistician, 27: 17-21, reproduced in, e.g., du Toit, Steyn and Stumpf 
(1986) Graphical Exploratory Data Analysis (Springer).
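
For anyone who wants to see this directly, here is a minimal sketch. It 
assumes the built-in data frame "anscombe" (columns x1..x4, y1..y4) from 
R's datasets package; the helper name r2 is just for illustration.

```r
# Anscombe's quartet ships with R as the data frame `anscombe`
data(anscombe)

# R^2 of the simple linear regression of y_i on x_i, for each of the 4 sets
r2 <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  summary(lm(y ~ x))$r.squared
})
print(round(r2, 2))   # all four agree to two decimals

# The four scatterplots, each with its least-squares line
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, main = paste("Set", i))
  abline(lm(y ~ x))
}
par(op)
```

The R^2 values are essentially identical, so only the plots reveal which 
of the four data sets a straight-line model actually suits.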

hth.  spencer graves

Prof Brian Ripley wrote:
> On Tue, 17 Jun 2003, kan Liu wrote:
> 
> 
>> I want to calculate the R-squared between two variables. Can you advise
>> me how to identify and remove the outliers before performing the R-squared
>> calculation?
> 
> 
> Easy: you don't.  It makes no sense to consider R^2 after arbitrary outlier 
> removal: if I remove all but two points I get R^2 = 1!
> 
> R^2 is normally used to measure the success of a multiple regression, but 
> as you mention two variables, did you just mean the Pearson 
> product-moment correlation?  It makes more sense to use a robust measure 
> of correlation, as in cov.rob (package lqs) or even Spearman or Kendall 
> measures (cor.test in package ctest).
> 
> If you intended to do this for a multiple regression, you need to do some 
> sort of robust regression and use a robust measure of fit.
>
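
A rough illustration of both points above, on made-up data (the seed, 
sample size and outlier are arbitrary assumptions, not from the original 
question).  Note that in current R, cor.test lives in package stats and 
cov.rob in package MASS, so the sketch calls them from there.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
x <- rnorm(30)
y <- x + rnorm(30, sd = 0.3)         # made-up data with a strong linear relation

# 1. Keep only two points and the squared correlation is 1
r2_two <- cor(x[1:2], y[1:2])^2

# 2. Contaminate the last point and compare measures of correlation
x[30] <- 10; y[30] <- -10            # one gross outlier (hypothetical)
pearson  <- cor(x, y)
spearman <- cor(x, y, method = "spearman")
kendall  <- cor(x, y, method = "kendall")
print(c(two_points = r2_two, pearson = pearson,
        spearman = spearman, kendall = kendall))

# cor.test() adds a test of zero association for the rank-based measures
print(cor.test(x, y, method = "kendall"))

# Robust correlation from cov.rob() (MVE estimator by default);
# skipped if MASS is not installed
if (requireNamespace("MASS", quietly = TRUE)) {
  rob <- MASS::cov.rob(cbind(x, y), cor = TRUE)
  print(rob$cor)
}
```

The single outlier is enough to drag the Pearson correlation below zero, 
while the rank-based measures still reflect the positive association in 
the other 29 points, and no points had to be deleted by hand.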



