[R] Outlier statistics question

Tue Nov 30 22:21:23 CET 2010

(Apologies to all. I am weak and could not resist)

On Tue, Nov 30, 2010 at 12:15 PM, Jahan <jahan.mohiuddin at gmail.com> wrote:
> I have a statistical question.
> The data sets I am working with are right-skewed so I have been
> plotting the log transformations of my data.  I am using a Grubbs Test
> to detect outliers in the data, but I get different outcomes depending
> on whether I run the test on the original data or the log(data).

Of course!
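
The Grubbs statistic is the largest absolute deviation from the sample
mean divided by the sample standard deviation, and that ratio is not
invariant under a nonlinear transformation such as log. A minimal
base-R sketch using the numbers quoted below; grubbs_stat and
grubbs_crit are hypothetical helper names, not functions from the
'outliers' package, and the critical value assumes the usual one-sided
t-based formula at the 5% level:

```r
# Data as quoted in this thread (raw values and their reported log10 values)
raw_values <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)
log_values <- c(0.194, 0.335, 0.403, 0.436, 0.388, 0.703)

# Hypothetical helper: Grubbs statistic, max |x - mean| / sd
grubbs_stat <- function(x) max(abs(x - mean(x))) / sd(x)

# Hypothetical helper: one-sided critical value from the t distribution
# (assumed formula; not taken from the 'outliers' package source)
grubbs_crit <- function(n, alpha = 0.05) {
  t_val <- qt(alpha / n, df = n - 2, lower.tail = FALSE)
  (n - 1) / sqrt(n) * sqrt(t_val^2 / (n - 2 + t_val^2))
}

g_raw  <- grubbs_stat(raw_values)
g_log  <- grubbs_stat(log_values)
g_crit <- grubbs_crit(length(raw_values))

# Compare each statistic with the same critical value
print(c(raw = g_raw, log = g_log, critical = g_crit))
```

The log transform pulls the largest value toward the rest, so the same
point sits fewer standard deviations from the mean on the log scale and
the statistic lands on the other side of the cutoff.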

> Here is one of the problematic sets:
>
> fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
> stripchart(fgf2p50,vertical=TRUE)
> #This next step requires you have the 'outliers' package
> library(outliers)
> grubbs.test(fgf2p50)
> #the output says p<0.05 so 5.047 is an outlier
> #Next, I run the test on the log(data)
> log10=c(0.194,0.335,0.403,0.436,0.388,0.703)
> grubbs.test(log10)
> #output is that p>0.05 so we fail to reject the hypothesis of no outlier.
>
> The question is, which outlier test do I accept?

Neither.

(IMHO) Outlier tests are one of statistics' _bad ideas_. The Grubbs
test is ca. 1970. There are many better approaches these days --
consult your local statistician -- all of which will depend on
answering the question, "What is the question you are trying to
answer?"

-- Bert

-- 
Bert Gunter
Genentech Nonclinical Biostatistics