[R] How to remove multiple outliers

R. Michael Weylandt michael.weylandt at gmail.com
Thu Oct 20 16:21:38 CEST 2011


Did you read the documentation for ?outlier? It clearly states that the
function finds the single (possibly repeated) value with the largest
distance from the mean. That is only 10099 here. You could apply the
function more than once, or write your own outlier removal script using
whatever criterion you want to define outliers, but the function is
doing exactly what it claims to do.
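
As a rough sketch of the "write your own script" route: the helper below
drops everything outside Tukey's fences (1.5 times the interquartile
range beyond the quartiles). The name remove_iqr_outliers, the IQR rule,
and the default k = 1.5 are my own assumptions for illustration, not
something from the outliers package, and the helper assumes the input
has no missing values.

```r
x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17,
        10099, 10099, 10098)

# Hypothetical helper (not part of the outliers package): keep only the
# values inside [Q1 - k * IQR, Q3 + k * IQR]. Assumes x contains no NAs.
remove_iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
  spread <- q[2] - q[1]                           # interquartile range
  keep <- x >= q[1] - k * spread & x <= q[2] + k * spread
  x[keep]                                         # direct logical subsetting
}

x1_clean <- remove_iqr_outliers(x1)
```

With this criterion all three large values are dropped in one pass, since
none of them falls inside the fences. Whether the IQR rule is appropriate
depends on your data; any other rule that yields a logical vector can be
slotted in the same way.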

On another note, why complicate things? Just use the rm.outlier()
function of the same package rather than doing it (inefficiently) the
way you are currently. Note that outlier(x1, logical = TRUE) returns a
logical vector which can be used for direct subsetting, that there's no
need to test booleans against TRUE (since that's an identity transform
on the set of booleans), and that the arr.ind = TRUE argument isn't
needed here. None of those make much of a difference for this problem,
but they are points of good practice.
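
To make the subsetting point concrete, here is a minimal sketch. It uses
outliers::outlier() when the package is installed and otherwise falls
back to a base R stand-in that flags the value(s) farthest from the
mean; the fallback and the variable names flag and x1_trimmed are my own
illustration, not package code.

```r
x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17,
        10099, 10099, 10098)

# Logical vector marking the value(s) with the largest distance from the mean.
flag <- if (requireNamespace("outliers", quietly = TRUE)) {
  outliers::outlier(x1, logical = TRUE)
} else {
  # Base R stand-in (an assumption for illustration only)
  dev <- abs(x1 - mean(x1))
  dev == max(dev)
}

# Subset directly with the negated logical vector:
# no which(), no == TRUE, no arr.ind needed.
x1_trimmed <- x1[!flag]
```

Only the two 10099 values are flagged, so 10098 is still present in
x1_trimmed, which is the behaviour described in the original question.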

Michael

On Thu, Oct 20, 2011 at 8:11 AM, aajit75 <aajit75 at yahoo.co.in> wrote:
> Hi All,
>
> I am working on a dataset in which some of the variables have more than
> one observation that is an outlier.
>
> I am using the sample script below:
>
> library(outliers)
> x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17,
> 10099, 10099, 10098)
> outlier_tf1 = outlier(x1,logical=TRUE)
> find_outlier1 = which(outlier_tf1==TRUE, arr.ind=TRUE)
> beh_input_ro1 = x1[-find_outlier1]
>
> It removes only the most extreme outliers and not all of them. In this
> example it removes only 10099, 10099 and not 10098.
>
> Thanks for the help in advance.
> -Ajit


