[R-SIG-Finance] Winsorization

Rory.WINSTON at rbs.com
Thu Sep 18 14:38:03 CEST 2008

Hi Patrick

This is interesting - by inferior, do you mean that robust methods make assumptions about the distribution shape or variance that are violated by the type of distributions seen in financial returns, for instance?


Rory Winston
RBS Global Banking & Markets
Office: +44 20 7085 4476

-----Original Message-----
From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Patrick Burns
Sent: 18 September 2008 11:01
To: Ajay Shah
Cc: r-sig-finance at stat.math.ethz.ch; ??????
Subject: Re: [R-SIG-Finance] Winsorization

I disagree with Ajay about the value of Winsorization.
Yes, it is ad hoc but it is simple to understand and often results in reasonable answers.

It certainly depends on the context but if we are talking about financial returns, then I haven't had positive experience with traditional statistical robustness.
(Given that my thesis was on robustness, I don't say this lightly.)  Robustness often gives inferior answers in finance (in my experience) even when it is obvious that it "should" be the proper thing to do.  This is a phenomenon that I don't understand.

The code that Ajay gives always alters some fraction of the data in each tail, whether or not those points are extreme.  Often Winsorization is instead thought of as pulling in only the data that are too far from the center.  A simple version of this is:

function(x, winsorize=5)
{
    # clamp values lying more than 'winsorize' MADs from the median
    s <- mad(x) * winsorize
    top <- median(x) +  s
    bot <- median(x) -  s
    x[x > top] <- top
    x[x < bot] <- bot
    x
}
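
As a rough illustration of the difference described above (not part of the original post), the sketch below compares the two approaches on simulated data. The names winsor_mad and winsor_quantile are hypothetical, the simulated data are an assumption made purely for demonstration, and winsor_quantile is a simplified stand-in for the quantile-based code quoted further down rather than a copy of it.

```r
# Hypothetical name: MAD-based clamping, as in the function above.
winsor_mad <- function(x, k = 5) {
    s <- mad(x) * k
    top <- median(x) + s
    bot <- median(x) - s
    x[x > top] <- top
    x[x < bot] <- bot
    x
}

# Hypothetical name: clamp at fixed tail quantiles, so a fixed
# fraction of the data is altered no matter how well behaved it is.
winsor_quantile <- function(x, cutoff = 0.01) {
    lim <- quantile(x, probs = c(cutoff, 1 - cutoff), names = FALSE)
    pmin(pmax(x, lim[1]), lim[2])
}

set.seed(1)
clean <- rnorm(1000)          # simulated data without gross outliers
dirty <- c(clean, 25, -30)    # the same data plus two planted outliers

# Number of observations each method changes
n_mad_clean      <- sum(winsor_mad(clean) != clean)
n_quantile_clean <- sum(winsor_quantile(clean) != clean)
n_mad_dirty      <- sum(winsor_mad(dirty) != dirty)
print(c(mad_clean = n_mad_clean,
        quantile_clean = n_quantile_clean,
        mad_dirty = n_mad_dirty))
```

On well-behaved data the quantile version should still move points in both tails, while the MAD version should move points only when they sit far from the median.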

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
(home of S Poetry and "A Guide for the Unwilling S User")

Ajay Shah wrote:
> On Thu, Sep 18, 2008 at 11:29:19AM +0800, ?????? wrote:
>> Dear all,
>>        I am dealing with a data set with many outlying values. It
>> is said that a technique named winsorization (or winsorising) can
>> reduce the influence of those extreme values. Has anyone used this
>> technique before? And how can it be done in S+ or R? Thank you.
> Winsorisation is not a great idea. It is an ad hoc procedure. Your test
> statistics are all suspect if you have preprocessed the data in this
> fashion.
> If you can do robust regressions (e.g. use the R package `robust')
> that is far better. Get on the r-sig-robust mailing list and start
> learning! (At least, that's what I'm doing).
> If you must do it, here's some code:
> winsorise <- function(x, cutoff=0.01) {
>   stopifnot(length(x)>0, cutoff>0)
>   osd <-  sd(x)
>   values <- quantile(x, p=c(cutoff,1-cutoff), na.rm=TRUE)
>   winsorised.left <- x<values[1]
>   winsorised.right <- x>values[2]       # From here on, I start writing into x
>   x[winsorised.left] <- values[1]
>   x[winsorised.right] <- values[2]
>   list(winsorised=x,
>        values=values,
>        osd=osd, nsd=sd(x),
>        winsorised.left=winsorised.left,
>        winsorised.right=winsorised.right)
> }

