[R] R code to check for outliers

Duncan Murdoch murdoch.duncan at gmail.com
Wed Jul 18 18:49:55 CEST 2012


On 18/07/2012 10:14 AM, Bert Gunter wrote:
> checkforoutliers <- function(series)NULL
>
> Cheers,
> Bert
>
> *Explanation: There is no such thing as a statistical outlier -- or,
> rather, "outlier" is a fraudulent statistical concept, defined arbitrarily
> and without scientific legitimacy. The typical unstated purpose of such
> identification is to remove contaminating or irrelevant data, but such a
> judgment can only be made by a subject matter expert with knowledge of the
> context and, usually, the specific cause for the unusual data. Do not be
> misled by the large body of statistical literature on this topic into
> believing that statistical analysis alone can provide objective criteria to
> do this. That is a path to scientific purgatory.
>
> For the record:
> 1. I am a statistician
> 2. Lots of highly knowledgeable, smart statisticians will condemn what I
> have just said as stupid ranting.
>
> The perils of a mailing list.

I think you are assuming that Sajeeka will handle the outliers
incorrectly.  It happens often enough, but I don't think it's polite to
make that assumption.

My answer to the question would have been to ask, "How do you define
outliers?"  Certainly it's possible to define outliers in the context
of a model, and their presence is an indication of problems with the
model.  The correct response might be to weaken the assumptions of
your model and use a robust procedure as Michael suggested (which might
mean throwing away the outliers), or it might be to change the model in
some other way.  Your advice to consult a subject matter expert is good,
but in my experience, they often put more faith in their models than
they should, so as a statistician, I think you should point out
discrepancies like outliers.  That means it's good to have a function
to detect them.
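
As a rough illustration of what "outliers in the context of a model"
could mean in practice, here is a minimal sketch using only base R.  The
helper name flag_model_outliers and the cutoff of 3 are my own
illustrative assumptions, not a standard rule, and the data are
simulated.  It only flags points that disagree with the fitted model;
what to do about them still depends on the model and the subject matter.

```r
# Hypothetical helper: flag observations whose externally studentized
# residual from a fitted linear model exceeds a cutoff in absolute value.
# The default cutoff of 3 is an arbitrary illustrative choice.
flag_model_outliers <- function(fit, cutoff = 3) {
  r <- rstudent(fit)            # studentized (leave-one-out) residuals
  which(abs(r) > cutoff)        # indices of observations to look at
}

# Simulated example: a straight-line relationship with one shifted point.
set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, sd = 0.5)
y[10] <- y[10] + 8              # contaminate observation 10

fit <- lm(y ~ x)
flagged <- flag_model_outliers(fit)
print(flagged)
```

Here the flagged set should include the contaminated observation 10.
A flagged point is a discrepancy between the data and the model, so it
is a prompt to re-examine both rather than an automatic reason to
delete data.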

Duncan Murdoch

>
> -- Bert
>
> On Wed, Jul 18, 2012 at 6:27 AM, Sajeeka Nanayakkara <nsajeeka at yahoo.com> wrote:
>
> >
> >  What is the R code to check whether data series have outliers or not?
> >
> > Thanks,
> >
> > Sajeeka Nanayakkara
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>


