[R] Identifying outliers in non-normally distributed data

Brian G. Peterson brian at braverock.com
Sun Dec 27 17:20:59 CET 2009


John wrote:
> Hello,
> 
> I've been searching for a method for identifying outliers for quite some
> time now. The complication is that I cannot assume that my data is
> normally distributed nor symmetrical (i.e. some distributions might
> have one longer tail) so I have not been able to find any good tests.
> Walsh's Test (http://www.statistics4u.info/
> fundsta...liertest.html#), as I understand it, assumes that the data is
> symmetrical, for example.
> 
> Also, while I've found some interesting articles:
> http://tinyurl.com/yc7w4oq ("Missing Values, Outliers, Robust
> Statistics & Non-parametric Methods")
> I don't really know what to use.
> 
> Any ideas? Any R packages available for this? Thanks!
> 
> PS. My data has 1000's of observations..

Take a look at the package 'robustbase'; it provides most of the standard
robust measures and calculations.
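
As a rough sketch of one way to use it on skewed data: robustbase has a
medcouple estimator, mc(), and skewness-adjusted boxplot statistics,
adjboxStats(), which shift the boxplot fences according to the medcouple so
that a long tail is not flagged wholesale. The simulated data below is purely
hypothetical, and whether the adjusted boxplot is the right rule for your data
is an assumption you would need to check.

```r
# Assumes the 'robustbase' package is installed:
#   install.packages("robustbase")
library(robustbase)

set.seed(42)  # hypothetical data, seeded only for reproducibility

# Right-skewed sample with three gross outliers appended
x <- c(rlnorm(2000, meanlog = 0, sdlog = 0.5), 25, 30, 40)

# Medcouple: a robust skewness measure in [-1, 1];
# positive values indicate a longer right tail
skew <- mc(x)

# Adjusted boxplot statistics: the fences depend on the medcouple,
# so they are asymmetric for skewed data
stats <- adjboxStats(x)
stats$fence   # lower and upper fence
stats$out     # observations falling outside the fences

# For comparison, the classical (symmetric) boxplot rule from base R
classical_out <- boxplot.stats(x)$out
length(classical_out)
length(stats$out)
```

adjbox(x) draws the corresponding plot. The coef, a and b arguments of
adjboxStats() control how wide the fences are, so the number of flagged points
depends on those choices.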

While you didn't say what kind of data you're trying to identify outliers in, 
if it is time series data the function Return.clean in PerformanceAnalytics may 
be useful.
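
A minimal sketch of calling it, assuming your series are asset returns and
using the 'managers' example data that ships with PerformanceAnalytics (the
column selection and alpha value here are only illustrative). Note that the
"boudt" method does not delete observations; it shrinks the ones it considers
extreme, so comparing the cleaned series with the original shows which
points were treated as outliers.

```r
# Assumes 'PerformanceAnalytics' (and 'robustbase', which the "boudt"
# method relies on) are installed
library(PerformanceAnalytics)

data(managers)        # example monthly return series from the package
R <- managers[, 1:4]  # illustrative choice: the first four columns

# Clean the returns; alpha sets how extreme an observation must be
# before it is adjusted
R.clean <- Return.clean(R, method = "boudt", alpha = 0.01)

# Observations that were changed are the ones treated as outliers
changed <- as.matrix(R) != as.matrix(R.clean)
colSums(changed, na.rm = TRUE)
```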

Regards,

   - Brian


-- 
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock



