[R] how to identify the outliers

Christian Hennig hennig at stat.math.ethz.ch
Tue Nov 26 17:48:58 CET 2002


Dear Rado,

I do not know what your data look like, but in general you can use robust
Mahalanobis distances. That is, compute a robust mean and covariance matrix
with cov.rob (method="mcd") in package lqs, and pass these as center and cov
to the function mahalanobis. As a cutoff value you can take a large
quantile (say 0.999) of the chi^2 distribution with p (the number of your
variables) degrees of freedom. Note that mahalanobis returns squared
distances, so they can be compared with that quantile directly. Details are
in Rousseeuw & van Driessen; see the help page for cov.rob.
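
The recipe above could be sketched roughly as follows. This is only a
minimal illustration under some assumptions: the data matrix x is simulated
(100 clean points plus 5 planted outliers), the names rob, d2, cutoff,
outlier_idx and x_clean are made up for the example, and cov.rob is loaded
from MASS, which is where current R installations provide it.

```r
library(MASS)  # assumption: cov.rob() is taken from MASS in current R

set.seed(1)
# Hypothetical example data: 100 bivariate normal points plus 5 planted outliers
x <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(rnorm(10, mean = 8), ncol = 2))

# Robust center and covariance via the minimum covariance determinant estimator
rob <- cov.rob(x, method = "mcd")

# Squared robust Mahalanobis distance of every row from the robust center
d2 <- mahalanobis(x, center = rob$center, cov = rob$cov)

# Cutoff: 0.999 quantile of the chi^2 distribution with p = ncol(x) df
cutoff <- qchisq(0.999, df = ncol(x))

# Row positions of the flagged observations, and the data without them
outlier_idx <- which(d2 > cutoff)
x_clean <- x[d2 <= cutoff, , drop = FALSE]
```

Here outlier_idx gives the positions of the suspected outliers, and x_clean
is what would go into the further analysis. The 0.999 level is just the
value suggested above; a smaller quantile flags more points.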

Christian 

On Tue, 26 Nov 2002, Rado Bonk wrote:

> Hello R-users,
> 
> Is there a more sophisticated way to identify outliers in a dataset
> than spotting them in a boxplot? I want to exclude them from
> further analysis, and I am interested in their positions in my data
> vector.
> 
> Rado
> 
> 

-- 
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de

