[R] selecting outliers

Christian Hennig chrish at stats.ucl.ac.uk
Mon Aug 8 14:45:25 CEST 2005


Hi Alessandro,

On Mon, 8 Aug 2005, alessandro carletti wrote:

> Hi everybody,
> I'd like to know if there's an easy way of extracting
> outlier records from a dataset, in order to perform
> further analysis on them.

The answer is "no", and the reasons are not technical. There are some
quite easy outlier detection approaches around (e.g., compute robust
Mahalanobis distances with cov.mcd/mahalanobis and call the points whose
distances are too large "outliers").
The main problem is that the term "outlier" has no objective, unique
meaning. It depends crucially on your aims and on the assumptions you want
to make about the non-outliers in the dataset (for the Mahalanobis
approach, these should be elliptically distributed and homogeneously close
to a multivariate normal distribution).
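
For illustration only, here is a minimal sketch of the Mahalanobis
approach mentioned above. It assumes that cov.mcd comes from the MASS
package, that the data are a numeric matrix, and that the 0.975 quantile
of a chi-squared distribution is an acceptable cutoff. The simulated data
and the cutoff are my assumptions for the example, not a recommendation;
a different choice may flag different points.

```r
library(MASS)  # assumed source of cov.mcd

set.seed(1)
# Hypothetical data: 100 bivariate standard normal points (rows 1-100)
# plus 5 points shifted to mean 6 in both coordinates (rows 101-105).
x <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(rnorm(10, mean = 6), ncol = 2))

# Robust estimates of location and scatter (minimum covariance determinant).
mcd <- cov.mcd(x)

# mahalanobis() returns SQUARED distances from the robust center.
d2 <- mahalanobis(x, center = mcd$center, cov = mcd$cov)

# Assumed cutoff: 0.975 chi-squared quantile, df = number of variables.
# This is only sensible if the non-outliers are roughly multivariate normal.
cutoff <- qchisq(0.975, df = ncol(x))

flagged  <- which(d2 > cutoff)            # row indices of flagged points
outliers <- x[flagged, , drop = FALSE]    # the flagged records themselves
```

The rows in "outliers" can then be analysed further, but whether they
deserve that name still depends on the aims and assumptions above.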

Best,
Christian

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



