[R-SIG-Finance] how to winsorize data

David M Smith david at revolution-computing.com
Wed Sep 16 19:04:25 CEST 2009


On Wed, Sep 16, 2009 at 9:33 AM, Breno Neri <breno.neri at nyu.edu> wrote:
> x <- x[ x>quantile(x, .05) & x<quantile(x, .95) ]

That will delete the extreme values from x, but if I understand the
process of winsorization correctly, the extreme values should be
*replaced* by the corresponding quantiles, no?

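winsorize <- function(x, q=0.05) {
 # lower and upper cutoffs: the q and 1-q sample quantiles of x
 extrema <- quantile(x, c(q, 1-q))
 # replace values beyond each cutoff with the cutoff itself
 x[x<extrema[1]] <- extrema[1]
 x[x>extrema[2]] <- extrema[2]
 x
}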

> summary(winsorize(rnorm(100),0.05))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.55200 -0.54590 -0.03203 -0.01133  0.54230  1.46300
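
To make the difference between deleting and replacing concrete, here is a
small sketch on a fixed vector rather than random draws. The name winsorize2
is hypothetical (it is not in any package that I know of), and the na.rm=TRUE
handling and the argument check are my own assumptions, not part of the
function above. It clamps with pmin()/pmax() instead of indexed assignment,
which should give the same result when x has no NAs. Note also that
mean(x, trim=...) drops the tails before averaging, so it generally differs
from the mean of the winsorized data.

```r
# Hypothetical variant of winsorize(); NA handling is an added assumption.
winsorize2 <- function(x, q = 0.05) {
  stopifnot(is.numeric(x), q >= 0, q <= 0.5)
  lim <- quantile(x, c(q, 1 - q), na.rm = TRUE, names = FALSE)
  pmax(pmin(x, lim[2]), lim[1])   # clamp into [lim[1], lim[2]]; NAs stay NA
}

x <- c(1:19, 1000)                # one gross outlier

# Trimming, as in the quoted one-liner: extreme values are dropped
trimmed <- x[x > quantile(x, 0.05) & x < quantile(x, 0.95)]

# Winsorizing: extreme values are replaced by the cutoffs
wins <- winsorize2(x, 0.05)

length(trimmed)        # shorter than x
length(wins)           # same length as x
range(wins)            # bounded by the 5% and 95% quantiles of x
mean(x, trim = 0.05)   # trimmed mean: tails dropped, then averaged
mean(wins)             # mean of winsorized data: tails clamped, then averaged
```

If you need to keep the vector aligned with other columns (dates, tickers),
the replacing version is the one that preserves length and order.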

> On Wed, Sep 16, 2009 at 12:24 PM, Geoffrey Smith <gps at asu.edu> wrote:
>
> > Hello, is there any kind of function that can winsorize a vector of numeric
> > data?  I realize that the function mean(x, trim=...) will calculate
> > winsorized means, but I would like to winsorize the actual data.  That is, I
> > would like to actually change the extreme values to the 95% and 5%
> > percentile values.  Thank you.
> >
> > --
> > Geoffrey Smith
> > Visiting Assistant Professor
> > Department of Finance
> > W. P. Carey School of Business
> > Arizona State University
> >

--
David M Smith <david at revolution-computing.com>
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)