[R] how to identify the outliers
Christian Hennig
hennig at stat.math.ethz.ch
Tue Nov 26 17:48:58 CET 2002
Dear Rado,
I do not know what your data look like, but in general you can use robust
Mahalanobis distances. That is, compute a robust mean and covariance matrix
with cov.rob (method="mcd") in package lqs, and pass these as center and cov
to the function mahalanobis. As a cutoff, flag the points whose squared
distance (which is what mahalanobis returns) exceeds a high quantile
(say 0.999) of the chi^2 distribution with p degrees of freedom, where p is
the number of your variables. Details are in Rousseeuw & van Driessen; see
the help page for cov.rob.
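
A minimal sketch of the recipe above, under some assumptions: the data
matrix x is simulated purely for illustration (100 clean points plus 5
planted outliers), and cov.rob is loaded from MASS, which is where it is
found in current R versions rather than in lqs. The names d2, cutoff,
outliers and x.clean are made up for this example.

```r
library(MASS)  # assumption: cov.rob is provided by MASS in current R

set.seed(1)
# Hypothetical data: 100 bivariate standard normal points, then 5 planted
# outliers centred far away at (8, 8) in rows 101 to 105
x <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(rnorm(10, mean = 8), ncol = 2))

# Robust location and scatter via the minimum covariance determinant
rob <- cov.rob(x, method = "mcd")

# mahalanobis() returns SQUARED distances, which is what the chi^2
# quantile should be compared against
d2 <- mahalanobis(x, center = rob$center, cov = rob$cov)

# Cutoff: 0.999 quantile of chi^2 with p = number of variables
cutoff <- qchisq(0.999, df = ncol(x))

# Row positions of the flagged observations
outliers <- which(d2 > cutoff)

# Data with flagged rows removed; logical indexing also behaves
# correctly when nothing is flagged
x.clean <- x[d2 <= cutoff, , drop = FALSE]
```

which() gives the positions of the flagged observations, so they can be
inspected before anything is dropped. For a single variable, pass the
vector as a one-column matrix, e.g. as.matrix(v), and p is 1.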
Christian
On Tue, 26 Nov 2002, Rado Bonk wrote:
> Hello R-users,
>
> Is there a more sophisticated way to identify the outliers in a
> dataset than spotting them in a boxplot? I want to exclude them from
> further analysis, and I am interested in their positions in my data
> vector.
>
> Rado
>
>
--
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de