[R] Winsorisation function

David L Carlson dcarlson at tamu.edu
Wed Nov 14 23:37:38 CET 2012


Why do you think the mean should not change when you change more of the data
values? At any rate, your function agrees with winsor.mean() in package
psych:

> library(psych)
> winsor.mean(a, trim=.05)
[1] 95.8525
> winsor.mean(a, trim=.06)
[1] 95.9064
> winsor.mean(a, trim=.07)
[1] 95.91
> winsor.mean(a, trim=.08)
[1] 95.7628

You do have a typo in this line:

x <- x[!i.na] # Should be x <- x[!is.na]

and you have a bug when k=0. It should give the same result as mean(a), but
it does not because your function adds an additional value to the end of the
data equal to the maximum.

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of dms at riseup.net
> Sent: Wednesday, November 14, 2012 3:05 PM
> To: r-help at r-project.org
> Subject: [R] Winsorisation function
> 
> 
> Dear all,
> someone  can find what I doing wrong with the following function. It is
> for winsorisation mean. At my eyes it is ok, but for reason I sometimes
> it
> is changing the results when I change the k value.
> 
> wmean <-
> function (x, na.rm = FALSE, k = 1) {
>         if (any(i.na <- is.na(x))) {
>             if (na.rm)
>                 x <- x[!i.na]
>             else return(NA)
>         }
> 		n <- length(x)
>         if (!(k %in% (0:n)))
>             stop("Invalid argument for 'k'.")
> 		x <- sort(x)
> 		x[1:k] <- x[k+1] # Here I solve the lower values
> 		x[(n-k+1):n] <- x[n-k] #Then I go over the higher ones
> 		return(mean(x))
> 	}
> 
> 
> set.seed(51)
> a<- sample(200,100)
> 
> > wmean(a, k=5)
> [1] 95.85
> > wmean(a, k=6)
> [1] 95.91
> > wmean(a, k=7)
> [1] 95.91
> > wmean(a, k=8)
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.




More information about the R-help mailing list