[R-SIG-Finance] Winsorization
Patrick Burns
patrick at burns-stat.com
Thu Sep 18 18:12:24 CEST 2008
Rory,
No, quite the reverse. Returns are unabashedly
long-tailed, which makes robustness the "right"
thing to do. However, as Brian has said elsewhere
in this thread, a robust technique can be disastrously
worse than non-robust or silly ad hoc procedures.
Asking two questions is a good thing to do:
1) What do I want to do?
(It is surprising how often this seemingly obvious start
is slighted.)
2) Does the technique I'm using work well OUT OF
SAMPLE for THIS task?
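
Question 2 can be put to a direct test. The following is only a minimal sketch under stated assumptions: the "task" is taken to be predicting the next window's returns under squared-error loss, Student-t draws (df = 3) stand in for real returns, and every name in it (winsorize_mad, estimators, oos_mse) is illustrative, not from any package.

```r
# Sketch: score location estimators OUT OF SAMPLE on long-tailed data.
# Assumptions: simulated t(3) "returns", squared-error loss, made-up names.
set.seed(42)

# Clip points more than k scaled MADs from the median.
winsorize_mad <- function(x, k = 5) {
  s <- mad(x) * k
  pmin(pmax(x, median(x) - s), median(x) + s)
}

estimators <- list(
  mean       = function(x) mean(x),
  median     = function(x) median(x),
  winsorized = function(x) mean(winsorize_mad(x))
)

n_rep <- 200   # number of simulated histories
n_in  <- 60    # in-sample window length
n_out <- 60    # out-of-sample window length

# One row per simulated history, so every estimator sees the same data.
sims <- matrix(rt(n_rep * (n_in + n_out), df = 3) * 0.01, nrow = n_rep)

# Fit on the first n_in points, score against the following n_out points.
oos_mse <- sapply(estimators, function(est) {
  mean(apply(sims, 1, function(x) {
    fit <- est(x[1:n_in])
    mean((x[(n_in + 1):(n_in + n_out)] - fit)^2)
  }))
})
print(oos_mse)
```

The ranking this prints says nothing about real returns or about a different task (portfolio optimization, say); the point is only that the comparison is made on data the estimator has not seen.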
Finance is a wild and wonderful place, and seems to have
it in for theoreticians.
Pat
Rory.WINSTON at rbs.com wrote:
> Hi Patrick
>
> This is interesting - by inferior, do you mean that robust methods make assumptions about the distribution shape or variance that are violated by the type of distributions seen in financial returns, for instance?
>
> Rory
>
>
> Rory Winston
> RBS Global Banking & Markets
> Office: +44 20 7085 4476
>
> -----Original Message-----
> From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Patrick Burns
> Sent: 18 September 2008 11:01
> To: Ajay Shah
> Cc: r-sig-finance at stat.math.ethz.ch; ??????
> Subject: Re: [R-SIG-Finance] Winsorization
>
> I disagree with Ajay about the value of Winsorization.
> Yes, it is ad hoc but it is simple to understand and often results in reasonable answers.
>
> It certainly depends on the context but if we are talking about financial returns, then I haven't had positive experience with traditional statistical robustness.
> (Given that my thesis was on robustness, I don't say this lightly.) Robustness often gives inferior answers in finance (in my experience) even when it is obvious that it "should" be the proper thing to do. This is a phenomenon that I don't understand.
>
> The code that Ajay gives always truncates some fraction of data in each tail. Often Winsorization is thought of as truncating only data that are too far from the center. A simple version of this is:
>
>
> function(x, winsorize=5)
> {
>     # clip points lying more than `winsorize` scaled MADs from the median
>     s <- mad(x) * winsorize
>     top <- median(x) + s
>     bot <- median(x) - s
>     x[x > top] <- top
>     x[x < bot] <- bot
>     x
> }
>
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>
> Ajay Shah wrote:
>
>> On Thu, Sep 18, 2008 at 11:29:19AM +0800, ?????? wrote:
>>
>>
>>> Dear all,
>>> I am dealing with a data set that has many outlying values. It
>>> is said that a technique called winsorization (or winsorising) can
>>> reduce the influence of those extreme values. Has anyone used this
>>> technique before? And how can it be done in S+ or R? Thank you.
>>>
>>>
>> Winsorisation is not a great idea. It is an ad hoc procedure. Your test
>> statistics are all suspect if you have preprocessed the data in this
>> fashion.
>>
>> If you can do robust regressions (e.g. use the R package `robust')
>> that is far better. Get on the r-sig-robust mailing list and start
>> learning! (At least, that's what I'm doing).
>>
>> If you must do it, here's some code:
>>
>> winsorise <- function(x, cutoff=0.01) {
>>   stopifnot(length(x)>0, cutoff>0, cutoff<0.5)
>>   osd <- sd(x, na.rm=TRUE)   # sd before winsorising
>>   values <- quantile(x, probs=c(cutoff,1-cutoff), na.rm=TRUE)
>>   winsorised.left <- x<values[1]
>>   winsorised.right <- x>values[2] # From here on, I start writing into x
>>   x[winsorised.left] <- values[1]
>>   x[winsorised.right] <- values[2]
>>   list(winsorised=x,
>>        values=values,
>>        osd=osd, nsd=sd(x, na.rm=TRUE),   # sd before and after
>>        winsorised.left=winsorised.left,
>>        winsorised.right=winsorised.right)
>> }
>>
>>
>>
>
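
The two functions quoted in this thread behave differently on data with no outliers: the quantile version always alters a fixed fraction in each tail, while the MAD version alters only points far from the center. The sketch below illustrates that difference on made-up data. The names winsorise_quantile, winsorize_mad and n_changed are hypothetical, and both functions are condensed restatements (using pmin/pmax) rather than the posters' exact code.

```r
# Quantile-based: always clips the lowest and highest `cutoff` fraction.
winsorise_quantile <- function(x, cutoff = 0.01) {
  q <- quantile(x, probs = c(cutoff, 1 - cutoff), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

# MAD-based: clips only points more than k scaled MADs from the median.
winsorize_mad <- function(x, k = 5) {
  m <- median(x, na.rm = TRUE)
  s <- mad(x, na.rm = TRUE) * k
  pmin(pmax(x, m - s), m + s)
}

clean <- seq(-1, 1, length.out = 101)  # evenly spread, no outliers
dirty <- c(clean, 50)                  # same data plus one gross outlier

# Count how many observations a procedure modified.
n_changed <- function(before, after) sum(before != after)

changed <- c(
  clean_quantile = n_changed(clean, winsorise_quantile(clean, cutoff = 0.05)),
  clean_mad      = n_changed(clean, winsorize_mad(clean)),
  dirty_quantile = n_changed(dirty, winsorise_quantile(dirty, cutoff = 0.05)),
  dirty_mad      = n_changed(dirty, winsorize_mad(dirty))
)
print(changed)
```

On the clean series the MAD version leaves every point alone, whereas the quantile version still moves points in both tails. With the outlier added, the MAD version moves only that one point.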