[R] Identifying outliers in non-normally distributed data
Brian G. Peterson
brian at braverock.com
Sun Dec 27 17:20:59 CET 2009
John wrote:
> Hello,
>
> I've been searching for a method to identify outliers for quite some
> time now. The complication is that I cannot assume that my data is
> normally distributed or symmetrical (i.e. some distributions might
> have one longer tail), so I have not been able to find any good tests.
> Walsh's Test (http://www.statistics4u.info/
> fundsta...liertest.html#), for example, assumes, as I understand it,
> that the data is symmetrical.
>
> Also, while I've found some interesting articles:
> http://tinyurl.com/yc7w4oq ("Missing Values, Outliers, Robust
> Statistics & Non-parametric Methods")
> I don't really know what to use.
>
> Any ideas? Any R packages available for this? Thanks!
>
> PS. My data has thousands of observations.
Take a look at the 'robustbase' package; it provides most of the standard
robust measures and calculations.
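As one possible starting point for skewed data, here is a minimal sketch using
the skewness-adjusted boxplot rule in 'robustbase' (adjboxStats(), built on the
medcouple mc()). The simulated log-normal sample, the two planted extreme
values, and the variable names are assumptions made up for illustration; this
is one option among the robust tools in the package, not the only one.

```r
# Sketch only: skewness-adjusted boxplot rule from 'robustbase'.
library(robustbase)

set.seed(42)
# Hypothetical data: 1000 right-skewed values plus two planted extremes.
x <- c(rlnorm(1000), 60, 75)

mc(x)                      # medcouple, a robust measure of skewness
adj <- adjboxStats(x)      # boxplot statistics with skew-adjusted fences
adj$out                    # observations falling outside the adjusted fences

# For comparison: the classical 1.5 * IQR boxplot rule, which ignores skewness.
std_out <- boxplot.stats(x)$out
c(adjusted = length(adj$out), classical = length(std_out))
```

Comparing the two counts gives a feel for how much of the long tail the
classical rule would flag simply because the distribution is asymmetric.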
You didn't say what kind of data you're trying to identify outliers in, but
if it is time series data, the function Return.clean in PerformanceAnalytics
may be useful.
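A rough sketch of how that might look, assuming your series can be treated as
returns. The 'managers' data set ships with PerformanceAnalytics; the choice of
column, method = "boudt", and alpha = 0.01 are illustrative assumptions, so
check ?Return.clean before relying on them.

```r
# Sketch only: robust cleaning of a return series with PerformanceAnalytics.
library(PerformanceAnalytics)

data(managers)                      # example monthly returns from the package
R <- managers[, 1, drop = FALSE]    # a single return series (first column)

# method = "boudt" requests the robust cleaning method;
# alpha controls how much of the data may be treated as extreme.
R_clean <- Return.clean(R, method = "boudt", alpha = 0.01)

# Which observations were altered by the cleaning step?
changed <- which(coredata(R) != coredata(R_clean))
index(R)[changed]
```

As I understand it, this adjusts the most extreme returns rather than deleting
them, so the cleaned series keeps the same length and dates as the input.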
Regards,
- Brian
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock