[R-SIG-Finance] Winsorization

Patrick Burns patrick at burns-stat.com
Thu Sep 18 18:12:24 CEST 2008


No, quite the reverse.  Returns are unabashedly
long-tailed, which makes robustness the "right"
thing to do.  However, as Brian has said elsewhere
in this thread, a robust technique can be disastrously
worse than non-robust or silly ad hoc procedures.

Asking two questions is a good thing to do:

1) What do I want to do?
(It is surprising how often this seemingly obvious start
is slighted.)

2) Does the technique I'm using work well OUT OF
SAMPLE for THIS task?
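
One way to put question 2 into practice is a small holdout experiment. The sketch below is only an illustration under stated assumptions: simulated Student-t draws with 3 degrees of freedom stand in for real returns, the task is taken to be predicting the mean of the held-out half, and the names clamp_mad and oos_error are hypothetical. Nothing about real markets should be concluded from simulated data; the point is the shape of the check.

```r
# Hypothetical helper: clamp values lying more than k scaled MADs from the median.
clamp_mad <- function(x, k = 5) {
  s <- mad(x) * k
  pmin(pmax(x, median(x) - s), median(x) + s)
}

# Hypothetical experiment: estimate location on the first half of each
# simulated series and score it against the mean of the second half.
oos_error <- function(n_rep = 200, n = 500, df = 3) {
  estimators <- list(
    mean       = function(x) mean(x),
    median     = function(x) median(x),
    winsorized = function(x) mean(clamp_mad(x))
  )
  err <- replicate(n_rep, {
    r     <- rt(n, df = df) / 100      # simulated long-tailed "returns" (assumption)
    train <- r[seq_len(n / 2)]
    test  <- r[(n / 2 + 1):n]
    sapply(estimators, function(f) (f(train) - mean(test))^2)
  })
  rowMeans(err)                        # mean squared out-of-sample error per estimator
}

set.seed(1)
oos_error()
```

Swapping in real data and the loss that matters for the task at hand (question 1) is what makes the comparison meaningful.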

Finance is a wild and wonderful place, and seems to have
it in for theoreticians.
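
For anyone wanting to try the MAD-based Winsorization quoted further down, here is a minimal usage sketch. The name winsorize_mad and the toy data are assumptions made for illustration; the body follows the quoted function.

```r
# Same logic as the function quoted below, given a (hypothetical) name so it can be called.
winsorize_mad <- function(x, winsorize = 5) {
  s   <- mad(x) * winsorize     # mad() is scaled by 1.4826 by default
  top <- median(x) + s
  bot <- median(x) - s
  x[x > top] <- top
  x[x < bot] <- bot
  x
}

# Toy data (made up): five ordinary values and one wild one.
x <- c(-0.02, -0.01, 0.00, 0.01, 0.02, 0.50)
w <- winsorize_mad(x)

which(w != x)   # index of the values that were pulled in
range(w)
```

Only the wild sixth value is moved; the points within 5 MADs of the median are left alone, whereas a quantile-based cutoff always alters some fraction of each tail.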


Rory.WINSTON at rbs.com wrote:
> Hi Patrick
> This is interesting - by inferior, do you mean that robust methods make assumptions about the distribution shape or variance that are violated by the type of distributions seen in financial returns, for instance?
> Rory
> Rory Winston
> RBS Global Banking & Markets
> Office: +44 20 7085 4476
> -----Original Message-----
> From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Patrick Burns
> Sent: 18 September 2008 11:01
> To: Ajay Shah
> Cc: r-sig-finance at stat.math.ethz.ch; ??????
> Subject: Re: [R-SIG-Finance] Winsorization
> I disagree with Ajay about the value of Winsorization.
> Yes, it is ad hoc but it is simple to understand and often results in reasonable answers.
> It certainly depends on the context but if we are talking about financial returns, then I haven't had positive experience with traditional statistical robustness.
> (Given that my thesis was on robustness, I don't say this lightly.)  Robustness often gives inferior answers in finance (in my experience) even when it is obvious that it "should" be the proper thing to do.  This is a phenomenon that I don't understand.
> The code that Ajay gives always truncates some fraction of data in each tail.  Often Winsorization is thought of as truncating only data that are too far from the center.  A simple version of this is:
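> function(x, winsorize=5)
> {
>     # limits are median +/- 'winsorize' MADs; mad() is scaled by 1.4826 by default
>     s <- mad(x) * winsorize
>     top <- median(x) +  s
>     bot <- median(x) -  s
>     # pull in only the points that lie beyond the limits
>     x[x > top] <- top
>     x[x < bot] <- bot
>     x
> }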
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
> Ajay Shah wrote:
>> On Thu, Sep 18, 2008 at 11:29:19AM +0800, ?????? wrote:
>>> Dear all,
>>>        I am dealing with a data set with many outlying values. It
>>> is said that a technique named winsorization or winsorising can
>>> reduce the influence of those extreme values. Has anyone used this
>>> technique before? And how can it be done in S+ or R? Thank you.
>> Winsorisation is not a great idea. It is an ad hoc procedure. Your test
>> statistics are all suspect if you have preprocessed the data in this
>> fashion.
>> If you can do robust regressions (e.g. use the R package `robust')
>> that is far better. Get on the r-sig-robust mailing list and start
>> learning! (At least, that's what I'm doing).
>> If you must do it, here's some code:
>> winsorise <- function(x, cutoff=0.01) {
>>   stopifnot(length(x)>0, cutoff>0, cutoff<0.5)
>>   osd <- sd(x, na.rm=TRUE)
>>   values <- quantile(x, probs=c(cutoff,1-cutoff), na.rm=TRUE)
>>   winsorised.left <- x<values[1]
>>   winsorised.right <- x>values[2]       # From here on, I start writing into x
>>   x[winsorised.left] <- values[1]
>>   x[winsorised.right] <- values[2]
>>   list(winsorised=x,
>>        values=values,
>>        osd=osd, nsd=sd(x, na.rm=TRUE),
>>        winsorised.left=winsorised.left,
>>        winsorised.right=winsorised.right)
>> }