[R] selecting outliers

Christian Hennig chrish at stats.ucl.ac.uk
Mon Aug 8 15:11:19 CEST 2005


Hi,

If Søren is right, why not take a look at the help page for identify() (?identify)?
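
A minimal sketch of the workflow Søren describes, in case it helps. The data and the helper split_rows() are made up for illustration; only plot() and identify() are standard R. How a selection is ended depends on the graphics device, so check ?identify for yours.

```r
# Hypothetical helper: split a data frame into the picked rows and the rest.
split_rows <- function(dat, picked) {
  picked <- sort(unique(picked))
  list(
    suspicious = dat[picked, , drop = FALSE],
    good = if (length(picked)) dat[-picked, , drop = FALSE] else dat
  )
}

# Made-up example data with one point moved far away from the rest.
set.seed(1)
dat <- data.frame(x = rnorm(20), y = rnorm(20))
dat[5, ] <- c(6, -6)

if (interactive()) {
  plot(dat$x, dat$y)
  # Click near the suspicious points; identify() returns their row indices.
  picked <- identify(dat$x, dat$y, labels = rownames(dat))
} else {
  picked <- 5L  # stand-in for a mouse click when run non-interactively
}

parts <- split_rows(dat, picked)
parts$suspicious                                     # rows for further inspection
if (interactive()) plot(parts$good$x, parts$good$y)  # redraw the good points only
```

Repeating the last few lines on parts$good gives the iterative procedure, with any recalculation done on the good points only.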

Christian

On Mon, 8 Aug 2005, Søren Højsgaard wrote:

> Perhaps what Alessandro is after is simpler than that: making a plot of data in a data frame, being able to click on 'suspicious points', and getting the corresponding rows of the data out into a new data frame (for further inspection) while keeping the 'good points' in the plot (and perhaps redoing some calculations on the basis of the good points only...). This could then go on in an iterative way. That would be a perfectly sensible thing to do. How difficult it is technically I don't know, but it seems that it would require a call-back mechanism from a plot window to R (and a more 'advanced' one than that provided by 'locator()').
>
> Best regards
> Søren
>
> -----Original message-----
> From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On behalf of Christian Hennig
> Sent: 8 August 2005 14:45
> To: alessandro carletti
> Cc: rHELP
> Subject: Re: [R] selecting outliers
>
> Hi Alessandro,
>
> On Mon, 8 Aug 2005, alessandro carletti wrote:
>
> > Hi everybody,
> > I'd like to know if there's an easy way of extracting outlier records
> > from a dataset, in order to perform further analysis on them.
>
> The answer is "no". The reasons are not technical. There are some quite easy outlier detection approaches around (e.g., compute robust Mahalanobis distances with cov.mcd and mahalanobis, and call the points whose distances are too large "outliers").
> But the main problem is that the term "outlier" has no objective, unique meaning. It depends crucially on your aims and on the assumptions you want to make about the non-outliers in the dataset (for the Mahalanobis approach, they should be elliptically distributed and homogeneous, close to a multivariate normal distribution).
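
For reference, a rough sketch of the robust Mahalanobis approach mentioned above. The data are made up, and the 97.5% chi-squared cutoff is only one common convention, assumed here for illustration; it is exactly the kind of choice that depends on your aims. cov.mcd() comes from the MASS package.

```r
library(MASS)  # provides cov.mcd()

set.seed(42)
# Made-up data: 100 bivariate normal points plus 3 planted far-away points
# at (8, 8), (9, 9) and (10, 10), stored in rows 101 to 103.
x <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(c(8, 9, 10, 8, 9, 10), ncol = 2))
dat <- data.frame(x1 = x[, 1], x2 = x[, 2])

rob <- cov.mcd(dat)                          # robust centre and covariance (MCD)
d2 <- mahalanobis(dat, rob$center, rob$cov)  # squared robust distances

# Assumed cutoff: 97.5% quantile of the chi-squared distribution with
# p = ncol(dat) degrees of freedom (a convention, not a rule).
cutoff <- qchisq(0.975, df = ncol(dat))

outliers <- dat[d2 > cutoff, , drop = FALSE]   # rows flagged for inspection
rest <- dat[d2 <= cutoff, , drop = FALSE]      # remaining rows
```

Note that mahalanobis() returns squared distances, which is why they are compared with a chi-squared quantile directly.
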
>
> Best,
> Christian
>
> *** NEW ADDRESS! ***
> Christian Hennig
> University College London, Department of Statistical Science
> Gower St., London WC1E 6BT, phone +44 207 679 1698
> chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
