[R] Peirce's criterion

Greg Snow 538280 at gmail.com
Thu Apr 19 22:12:43 CEST 2012


Deciding what counts as an outlier is complicated regardless of the
tools used (this is a philosophical issue rather than an R issue).  You
need to make some assumptions and definitions based on the science that
produces the data, rather than on the data itself, before even
approaching the question of outliers.  A value that would be an outlier
under a normal distribution may be reasonable under a gamma
distribution and completely expected under a Cauchy distribution.
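
To put rough numbers on that last point, here is a minimal sketch.  It
assumes the usual boxplot rule (flag anything more than 1.5 IQR beyond
the quartiles) as the outlier definition, and a gamma with shape = 1;
both are choices made only for illustration, and flag_prob is a
made-up helper name.  It computes the chance that a single legitimate
draw gets flagged under each distribution.

```r
# Probability that one draw falls outside the boxplot fences
# (Q1 - 1.5*IQR, Q3 + 1.5*IQR), computed from the quantile (q*) and
# distribution (p*) functions rather than by simulation.
flag_prob <- function(qfun, pfun) {
  q <- qfun(c(0.25, 0.75))        # first and third quartiles
  iqr <- diff(q)
  lower <- q[1] - 1.5 * iqr
  upper <- q[2] + 1.5 * iqr
  pfun(lower) + (1 - pfun(upper)) # mass in both tails beyond the fences
}

p_normal <- flag_prob(qnorm, pnorm)
# shape = 1 is an arbitrary illustrative choice, not from the question
p_gamma  <- flag_prob(function(p) qgamma(p, shape = 1),
                      function(x) pgamma(x, shape = 1))
p_cauchy <- flag_prob(qcauchy, pcauchy)

round(c(normal = p_normal, gamma = p_gamma, cauchy = p_cauchy), 4)
```

Under these assumptions the same rule flags well under 1% of normal
draws, roughly 5% of the gamma draws, and roughly 15% of Cauchy draws,
even though none of those values is contaminated.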

See the 'outliers' dataset in the TeachingDemos package, and more
importantly the examples in the help page for it, for a demonstration
of the perils of automatic outlier deletion.
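
If you do not have TeachingDemos installed, the following sketch shows
the same kind of peril on simulated data.  It is not the example from
that help page: the t distribution with 3 degrees of freedom, the seed,
and the trim_outliers helper are all assumptions made here for
illustration.  The data contain no contamination at all, yet repeated
automatic deletion keeps finding "outliers" to remove.

```r
set.seed(42)
x <- rt(200, df = 3)  # heavy-tailed sample, every point is legitimate

# Repeatedly drop whatever the boxplot rule flags until nothing is
# flagged, counting how many passes that takes.
trim_outliers <- function(x) {
  passes <- 0
  repeat {
    out <- boxplot.stats(x)$out   # points beyond the 1.5*IQR fences
    if (length(out) == 0) break
    x <- x[!x %in% out]
    passes <- passes + 1
  }
  list(kept = x, passes = passes)
}

res <- trim_outliers(x)
c(original = length(x), kept = length(res$kept), passes = res$passes)
```

Each pass shrinks the IQR, which moves the fences inward and can expose
a new batch of points to delete, so the procedure tends to eat into
perfectly good data.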

On Wed, Apr 18, 2012 at 4:11 PM, Ryan Murphy <rmurphy4 at u.rochester.edu> wrote:
> Hello all,
>
> I would like to rigorously test whether observations in my dataset are
> outliers.  I guess all the main tests in R (Grubbs) impose the assumption
> of normality.  My data is surely not normal, so I would like to use
> something else.  As far as I can tell from Wikipedia, Peirce's criterion is
> just that.
>
> The data I am interested in testing are: 1) Continuous on the unit interval
> 2) Discrete 3) Ordinal on 0 to 6.  If you need more specifics, (1) refers to
> the Gini index of inequality, (2) refers to measures for the number of
> assassinations, strikes, etc. in a country, (3) refers to ranking data of how
> politically free a country is.
>
> Does R do this test?
>
> Thanks a lot, and PS: I, unlike many economists, prefer R over Stata.
> R >>>>> Stata!
>
> Sincerely,
> Ryan Murphy
>
> --
> Ryan Murphy
> 2012
> B.A. Economics and Mathematics
> 339-223-4181



-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com


