[R] outlier
Spencer Graves
spencer.graves at pdf.com
Tue Jun 17 19:08:39 CEST 2003
It is also wise to make scatterplots, as shown by Anscombe's famous
example of four scatterplots with the same R^2, where the first shows the
standard elliptical pattern implied by the assumptions while the other
three indicate very clearly that the assumptions are incorrect. See
Anscombe (1973) "Graphs in Statistical Analysis", The American
Statistician, 27: 17-21, reproduced in, e.g., du Toit, Steyn and Stumpf
(1986) Graphical Exploratory Data Analysis (Springer).
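
For anyone who wants to see this directly, here is a minimal sketch. It assumes the built-in `anscombe` data frame from the standard datasets package (columns x1..x4 and y1..y4); the variable name `r2` is just for illustration.

```r
# Anscombe's four data sets ship with R as the data frame `anscombe`.
data(anscombe)

# Squared Pearson correlation for each (x_i, y_i) pair.
r2 <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])^2
})
print(round(r2, 2))

# Four scatterplots, each with its least-squares line.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, main = paste("Anscombe data set", i))
  abline(lm(y ~ x))
}
par(op)
```

The four R^2 values agree to two decimal places, yet only the first plot looks like what the usual assumptions imply.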
hth. spencer graves
Prof Brian Ripley wrote:
> On Tue, 17 Jun 2003, kan Liu wrote:
>
>
>> I want to calculate the R-squared between two variables. Can you advise
>> me how to identify and remove the outliers before performing the R-squared
>> calculation?
>
>
> Easy: you don't. It makes no sense to consider R^2 after arbitrary outlier
> removal: if I remove all but two points I get R^2 = 1!
>
> R^2 is normally used to measure the success of a multiple regression, but
> as you mention two variables, did you just mean the Pearson
> product-moment correlation? It makes more sense to use a robust measure
> of correlation, as in cov.rob (package lqs) or even Spearman or Kendall
> measures (cor.test in package ctest).
>
> If you intended to do this for a multiple regression, you need to do some
> sort of robust regression and use a robust measure of fit.
>
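
A rough sketch of the points quoted above, on made-up data (20 points on a noisy line plus one gross outlier; all variable names are illustrative). It assumes a current R installation, where cor.test lives in package stats and cov.rob in package MASS rather than in ctest and lqs; the cov.rob step is skipped if MASS is not installed.

```r
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)
y[20] <- -40                      # one gross outlier (invented for illustration)

# Discarding all but two points always gives a "perfect" fit.
two <- c(1, 20)
r2_two <- cor(x[two], y[two])^2

# Pearson correlation is dragged down by the single outlier;
# the rank-based measures are less affected.
pearson  <- unname(cor.test(x, y, method = "pearson")$estimate)
spearman <- unname(cor.test(x, y, method = "spearman")$estimate)
kendall  <- unname(cor.test(x, y, method = "kendall")$estimate)
print(c(two_point_R2 = r2_two, pearson = pearson,
        spearman = spearman, kendall = kendall))

# Robust correlation via cov.rob, if MASS is available.
if (requireNamespace("MASS", quietly = TRUE)) {
  robust <- MASS::cov.rob(cbind(x, y), cor = TRUE)$cor[1, 2]
  print(c(cov.rob = robust))
}
```

For the multiple regression case, functions such as rlm or lqs in MASS would be the analogous starting point, though which robust measure of fit to report is a separate choice.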