[R] implementing Grubbs outlier test on a large dataframe
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sun Feb 15 01:23:52 CET 2009
John Malone wrote:
> Hi!
>
> I'm trying to implement an outlier test once/row in a large dataframe.
> Ideally, I'd do this then add the Pvalue results and the number flagged as
> an outlier as two new separate columns to the dataframe. Grubbs outlier
> test requires a vector and I'm confused how to make each row of my dataframe
> a vector, followed by doing a Grubbs test for each row containing the vector
> of numbers I want to perform the outlier test on.
>
> I'm new to R and no doubt this is a simple problem. Any help you might
> provide would be greatly appreciated.
>
> Many thanks in advance!!
>
John - you would be making a strong normality assumption. You might
reject H0 using Grubbs' test just because of non-normality, or you might
fail to reject it just because of non-normality. Is it really this
straightforward to declare something an outlier? What does outlier really
mean?
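
On the mechanics of the question, and to make the normality concern
concrete, here is a minimal sketch in base R. It assumes every column
of the data frame is numeric. It uses a hand-written two-sided Grubbs
statistic with a Bonferroni-type p-value bound rather than any
particular package's function. The names grubbs_row, add_grubbs,
grubbs_p and grubbs_value are illustrative and do not come from the
original post. The second half applies the same test to simulated
log-normal rows, which contain no contamination at all, so you can
compare how often the test rejects there with how often it rejects on
normal rows.

```r
# Two-sided Grubbs' test for a single outlier in one numeric vector.
# Returns the p-value and the value lying farthest from the mean.
grubbs_row <- function(x) {
  x <- as.numeric(x)
  x <- x[!is.na(x)]
  n <- length(x)
  s <- if (n >= 3) sd(x) else NA_real_
  # Too few points, or no spread at all: the test is undefined.
  if (is.na(s) || s == 0) return(c(p.value = NA_real_, value = NA_real_))
  dev <- abs(x - mean(x))
  g <- max(dev) / s
  # Convert G to a t statistic on n - 2 df; guard against round-off
  # when G sits at its maximum possible value (n - 1) / sqrt(n).
  den <- max((n - 1)^2 - n * g^2, 0)
  t_stat <- sqrt(n * (n - 2) * g^2 / den)
  # Bonferroni upper bound 2 * n * P(T > t), capped at 1.
  p <- min(1, 2 * n * pt(-t_stat, df = n - 2))
  c(p.value = p, value = x[which.max(dev)])
}

# Apply the test once per row and append the two result columns.
add_grubbs <- function(df) {
  res <- t(apply(as.matrix(df), 1, grubbs_row))
  df$grubbs_p <- res[, "p.value"]
  df$grubbs_value <- res[, "value"]
  df
}

set.seed(1)
n_rows <- 2000
n_cols <- 8

# Rows drawn from a normal distribution (the test's own assumption).
normal_df <- add_grubbs(as.data.frame(matrix(rnorm(n_rows * n_cols),
                                             ncol = n_cols)))

# Rows drawn from a skewed (log-normal) distribution, no contamination.
skewed_df <- add_grubbs(as.data.frame(matrix(rlnorm(n_rows * n_cols),
                                             ncol = n_cols)))

# Share of rows in which the test "finds" an outlier at the 0.05 level.
reject_normal <- mean(normal_df$grubbs_p < 0.05)
reject_skewed <- mean(skewed_df$grubbs_p < 0.05)
print(c(normal = reject_normal, skewed = reject_skewed))
```

With your own data you would call add_grubbs() on the numeric columns
only, e.g. add_grubbs(mydata[, measurement_cols]), where mydata and
measurement_cols are placeholders for your own objects. A small p-value
here says only that the row looks unlike a normal sample, not that the
flagged value is wrong.
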
The following is must reading.
@Article{fin06cal,
  author  = {Finney, David J.},
  title   = {Calibration guidelines challenge outlier practices},
  journal = {The American Statistician},
  year    = 2006,
  volume  = 60,
  pages   = {309-313},
  annote  = {anticoagulant therapy; bias; causation; ethics; objectivity; outliers; guidelines for treatment of outliers; overview of types of outliers; letter to the editor and reply 61:187 May 2007}
}
--
Frank E Harrell Jr, Professor and Chair
Department of Biostatistics, School of Medicine
Vanderbilt University