[R] Outlier statistics question
rvaradhan at jhmi.edu
Wed Dec 1 04:53:43 CET 2010
It is, perhaps, more apt to call tests of outliers "tests of outright liars".
"Lies, damned lies, and tests of outliers"
Ravi Varadhan, Ph.D.
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
----- Original Message -----
From: Bert Gunter <gunter.berton at gene.com>
Date: Tuesday, November 30, 2010 4:22 pm
Subject: Re: [R] Outlier statistics question
To: Jahan <jahan.mohiuddin at gmail.com>
Cc: r-help at r-project.org
> (Apologies to all. I am weak and could not resist)
> On Tue, Nov 30, 2010 at 12:15 PM, Jahan <jahan.mohiuddin at gmail.com> wrote:
> > I have a statistical question.
> > The data sets I am working with are right-skewed so I have been
> > plotting the log transformations of my data. I am using a Grubbs Test
> > to detect outliers in the data, but I get different outcomes depending
> > on whether I run the test on the original data or the log(data).
> Of course!
> > Here is one of the problematic sets:
> > fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
> > stripchart(fgf2p50,vertical=TRUE)
> > #This next step requires you have the 'outliers' package
> > library(outliers)
> > grubbs.test(fgf2p50)
> > #the output says p<0.05 so 5.047 is an outlier
> > #Next, I run the test on the log(data)
> > log_fgf2p50=c(0.194,0.335,0.403,0.436,0.388,0.703)
> > grubbs.test(log_fgf2p50)
> > #output is that p>0.05 so we cannot conclude that there is an outlier.
> > The question is, which outlier test do I accept?
> (IMHO) Outlier tests are one of statistics's _bad ideas._ The Grubbs
> test is ca. 1970. There are many better approaches these days --
> consult your local statistician -- all of which will depend on
> answering the question, "What is the question you are trying to answer?"
> -- Bert
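
To illustrate why the two runs in the quoted question disagree, here is a minimal base-R sketch that computes the Grubbs statistic by hand on both scales. The helper names grubbs_G and grubbs_crit are hypothetical, not part of the 'outliers' package, and the sketch assumes the one-sided test for the largest value at the 5% level, using the usual t-based critical-value formula. It is meant only to show that the statistic depends on the scale of the data, not to reproduce grubbs.test exactly.

```r
# Hypothetical helper: Grubbs statistic for the largest observation,
# G = (max(x) - mean(x)) / sd(x)
grubbs_G <- function(x) (max(x) - mean(x)) / sd(x)

# Hypothetical helper: one-sided critical value for sample size n at level
# alpha, from the standard t-distribution formula (an assumption here)
grubbs_crit <- function(n, alpha = 0.05) {
  t2 <- qt(alpha / n, df = n - 2, lower.tail = FALSE)^2
  (n - 1) / sqrt(n) * sqrt(t2 / (n - 2 + t2))
}

# Data from the original question
fgf2p50 <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)
logdata <- log10(fgf2p50)

G_raw <- grubbs_G(fgf2p50)            # statistic on the original scale
G_log <- grubbs_G(logdata)            # statistic on the log10 scale
crit  <- grubbs_crit(length(fgf2p50)) # same cutoff for both, since n is the same

print(c(G_raw = G_raw, G_log = G_log, crit = crit))
```

On the original scale the statistic comes out at roughly 1.92, above the cutoff of about 1.82; on the log scale it drops to roughly 1.76, below the same cutoff. The log transformation pulls in the upper tail, so 5.047 sits fewer standard deviations from the mean. Neither answer is "the" answer: the test checks departure from a normal model, and the verdict changes with the scale on which normality is assumed.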
> > ______________________________________________
> > R-help at r-project.org mailing list
> > PLEASE do read the posting guide
> > and provide commented, minimal, self-contained, reproducible code.
> Bert Gunter
> Genentech Nonclinical Biostatistics