[RsR] Outliers in the market model that's used to estimate `beta' of a stock
Eva Cantoni
Ev@@C@nton| @end|ng |rom un|ge@ch
Thu Sep 18 22:48:01 CEST 2008
As a complement to this discussion, I would like to bring to your
attention a paper by Marc Genton and Elvezio Ronchetti entitled "Robust
Prediction of Beta", available from M. Genton's webpage:
http://www.unige.ch/ses/metri/genton/publications.html
Abstract:
The estimation of "beta" plays a basic role in the evaluation of
expected return and market risk. Typically this is performed by ordinary
least squares (OLS). To cope with the high sensitivity of OLS to
outlying observations and to deviations from the normality assumptions,
several methods suggest to use robust estimators. It is argued that,
from a predictive point of view, the simple use of either OLS or
robust estimators is not sufficient but that some shrinking of the
robust estimators toward OLS is necessary to reduce the mean squared
error. The performance of the proposed shrinkage robust estimator is
shown by means of a small simulation study and on a real data set.
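To give a feel for what "shrinking toward OLS" can look like, here is a minimal sketch. It is not the estimator from the paper. It only uses the well-known fact that a Huber M-estimate approaches the OLS fit as its tuning constant k grows. The data are simulated, and the names sim, ks and beta.huber are made up for this illustration.

```r
library(MASS)  # rlm() and psi.huber

# Simulated market model (assumption: true beta = 0.8, heavy-tailed noise)
set.seed(1)
n   <- 500
rM  <- rnorm(n, sd = 1.5)
rj  <- 0.8 * rM + rt(n, df = 3)
sim <- data.frame(rj = rj, rM = rM)

# OLS slope
beta.ols <- coef(lm(rj ~ rM, data = sim))["rM"]

# Huber M-estimates for increasing tuning constant k;
# k = 1.345 is the psi.huber default, large k downweights almost nothing
ks <- c(1.345, 3, 10, 1000)
beta.huber <- sapply(ks, function(k)
  coef(rlm(rj ~ rM, data = sim, psi = psi.huber, k = k))["rM"])

# Compare the slopes side by side
print(data.frame(k = ks, beta.huber = unname(beta.huber),
                 beta.ols = unname(beta.ols)))
```

In such a scheme the tuning constant acts as the shrinkage knob: small k gives the usual robust fit, very large k reproduces OLS. How to choose it for prediction is what the paper addresses.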
Best regards,
Eva
Ajay Shah wrote:
> In continuation of the discussion on `Winsorisation' that has taken
> place on r-sig-finance today, I thought I'd present all of you with an
> interesting dataset and a question.
>
> This data is the daily stock returns of the large Indian software firm
> `Infosys'. (This is the symbol `INFY' on NASDAQ). It is a large number
> of observations of daily returns (i.e. percentage changes of the
> adjusted stock price).
>
> Load the data in --
>
> print(load(url("http://www.mayin.org/ajayshah/tmp/infosys_mm.rda")))
> str(x)
> summary(x)
> sapply(x, sd)  # sd() on a whole data frame fails in current R; assumes x is a data frame
>
> The name `rj' is used for returns on Infosys, and `rM' is used for
> returns on the stock market index (Nifty). There are three really
> weird observations in this.
>
> weird.rj <- c(1896,2395)
> weird.rM <- 2672
> x[weird.rj,]
> x[weird.rM,]
>
> As you can see, these observations are quite remarkable given the
> small standard deviations that we saw above. There is absolutely no
> measurement error here. These things actually happened.
>
> Now consider a typical application: using this to estimate a market
> model. The goal here is to estimate the coefficient of a regression of
> rj on rM.
>
> # A regression with all obs
> summary(lm(rj ~ rM, data=x))
>
> # Drop the weird rj --
> summary(lm(rj ~ rM, data=x[-weird.rj,]))
>
> # Drop the weird rM --
> summary(lm(rj ~ rM, data=x[-weird.rM,]))
>
> # Drop both kinds of weird observations --
> summary(lm(rj ~ rM, data=x[-c(weird.rM,weird.rj),]))
>
> # Robust regressions
> library(MASS)
> summary(rlm(rj ~ rM, data=x))
> summary(rlm(rj ~ rM, method="MM", data=x))
> library(robust)
> summary(lmRob(rj ~ rM, data=x))
> library(quantreg)
> summary(rq(rj ~ rM, tau=0.5, data=x))
>
> So you see, we have a variety of different estimates for the slope
> (which is termed `beta' in finance). What value would you trust the
> most?
>
> And, would winsorisation using either my code
> (https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002921.html) or
> Patrick Burns' code
> (https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002923.html) be a
> good idea here?
>
> I'm instinctively unhappy with any scheme based on discarding
> observations that I'm absolutely sure have no measurement error. We
> have to model the weirdness of this data generating process, not
> ignore it.
>
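For readers who want to experiment with the winsorisation question without the original file, here is a minimal sketch on simulated data. The winsorise() helper is hypothetical and is neither Ajay Shah's nor Patrick Burns' code from the links above. It simply clips each series at its 1% and 99% sample quantiles. The planted extreme values and the true beta of 0.8 are assumptions of the simulation.

```r
# Hypothetical helper: clip values outside the [p, 1 - p] sample quantiles
winsorise <- function(v, p = 0.01) {
  q <- quantile(v, probs = c(p, 1 - p), na.rm = TRUE, names = FALSE)
  pmin(pmax(v, q[1]), q[2])
}

# Simulated returns with a few extreme but genuine observations
set.seed(42)
n  <- 1000
rM <- rnorm(n, sd = 1.5)
rj <- 0.8 * rM + rnorm(n, sd = 2)
rj[c(100, 200)] <- c(40, -35)   # two extreme stock returns
rM[300]         <- -12          # one extreme market return

sim   <- data.frame(rj = rj, rM = rM)
sim.w <- data.frame(rj = winsorise(sim$rj), rM = winsorise(sim$rM))

# Slope on the raw data versus slope after winsorising both series
beta.raw <- coef(lm(rj ~ rM, data = sim))["rM"]
beta.win <- coef(lm(rj ~ rM, data = sim.w))["rM"]
print(c(raw = unname(beta.raw), winsorised = unname(beta.win)))
```

Note that this clips every observation in the tails, not only the three unusual ones, so it changes more of the data than dropping the flagged rows does.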