[R-SIG-Finance] how to winsorize data
David M Smith
david at revolution-computing.com
Wed Sep 16 19:04:25 CEST 2009
On Wed, Sep 16, 2009 at 9:33 AM, Breno Neri <breno.neri at nyu.edu> wrote:
> x <- x[ x>quantile(x, .05) & x<quantile(x, .95) ]
That will delete the extreme values from x, but if I understand the
process of winsorization correctly, the extreme values should be
*replaced* by the corresponding quantiles, no?
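winsorize <- function(x, q=0.05) {
  # lower and upper cutoffs: the q and 1-q sample quantiles of x
  extrema <- quantile(x, c(q, 1-q))
  # replace (rather than drop) everything beyond each cutoff
  x[x<extrema[1]] <- extrema[1]
  x[x>extrema[2]] <- extrema[2]
  x
}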
> summary(winsorize(rnorm(100),0.05))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.55200 -0.54590 -0.03203 -0.01133  0.54230  1.46300
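Since rnorm() gives different numbers on every run, here is a small deterministic sketch of the difference between the two approaches. The vector x is made up for illustration (it is not from the thread), and winsorize() is repeated so the snippet runs on its own:

```r
# winsorize() as posted above, repeated so this snippet is self-contained
winsorize <- function(x, q = 0.05) {
  extrema <- quantile(x, c(q, 1 - q))
  x[x < extrema[1]] <- extrema[1]
  x[x > extrema[2]] <- extrema[2]
  x
}

# illustrative data (an assumption): 1..100 plus one outlier at each end
x <- c(-500, 1:100, 500)

# trimming, as in the quoted one-liner: the tail observations are dropped
trimmed <- x[x > quantile(x, .05) & x < quantile(x, .95)]

# winsorizing: every observation is kept, the tails are clamped to the cutoffs
wins <- winsorize(x, 0.05)

length(x)        # 102
length(trimmed)  # fewer than 102, because the tails were removed
length(wins)     # 102, same as x
range(wins)      # the 5% and 95% sample quantiles of x
```

The point is only that the winsorized vector keeps its original length, which matters if x is a column that has to stay aligned with other data.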
> On Wed, Sep 16, 2009 at 12:24 PM, Geoffrey Smith <gps at asu.edu> wrote:
>
> > Hello, is there any kind of function that can winsorize a vector of numeric
> > data? I realize that the function mean(x, trim=...) will calculate
> > winsorized means, but I would like to winsorize the actual data. That is, I
> > would like to actually change the extreme values to the 95% and 5%
> > percentile values. Thank you.
> >
> > --
> > Geoffrey Smith
> > Visiting Assistant Professor
> > Department of Finance
> > W. P. Carey School of Business
> > Arizona State University
> >
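A side note on mean(x, trim=...) from the question: going by ?mean, the trim argument drops that fraction of observations from each end before averaging, so it gives a trimmed mean rather than a winsorized one. A rough sketch of the difference, using a made-up vector y (an assumption for illustration) and a local copy of the winsorize() function from this message:

```r
# local copy of winsorize() from this message, so the snippet runs on its own
winsorize <- function(x, q = 0.05) {
  extrema <- quantile(x, c(q, 1 - q))
  x[x < extrema[1]] <- extrema[1]
  x[x > extrema[2]] <- extrema[2]
  x
}

# illustrative data (an assumption): 19 small values and one large outlier
y <- c(1:19, 1000)

plain_mean      <- mean(y)                   # pulled up by the outlier
trimmed_mean    <- mean(y, trim = 0.05)      # drops 5% from each end first
winsorized_mean <- mean(winsorize(y, 0.05))  # clamps the tails, keeps all 20 values

c(plain = plain_mean, trimmed = trimmed_mean, winsorized = winsorized_mean)
```

The trimmed and winsorized means generally differ, because the clamped tail values still enter the winsorized average.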
--
David M Smith <david at revolution-computing.com>
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)