[R-SIG-Finance] Outliers in the market model that's used to estimate `beta' of a stock
markleeds at verizon.net
Thu Sep 18 18:58:02 CEST 2008
Hi: I don't know if you've read "Fooled by Randomness" by Nassim Taleb,
but he essentially argues, using very non-statistical but nevertheless
strong arguments (it's not a stat or quant finance book), that outliers
in finance cannot be modelled, and that you shouldn't claim you can
model them because you'd be lying. In fact, he would say that a model
works until it doesn't.
Anyway, it's an interesting book that indirectly talks about your
comment below (for a little too long, actually: you can get what he's
saying in the first 50 pages, and it's about 200 pages), so I figured I
would mention it in case you were interested.
On Thu, Sep 18, 2008 at 11:36 AM, Ajay Shah wrote:
> In continuation of the discussion on `Winsorisation' that has taken
> place on r-sig-finance today, I thought I'd present all of you with an
> interesting dataset and a question.
>
> This data is the daily stock returns of the large Indian software firm
> `Infosys'. (This is the symbol `INFY' on NASDAQ). It is a large number
> of observations of daily returns (i.e. percentage changes of the
> adjusted stock price).
>
> Load the data in --
>
>
> print(load(url("http://www.mayin.org/ajayshah/tmp/infosys_mm.rda")))
> str(x)
> summary(x)
> sapply(x[c("rj", "rM")], sd)  # sd() on a whole data frame fails in current R
>
> The name `rj' is used for returns on Infosys, and `rM' is used for
> returns on the stock market index (Nifty). There are three really
> weird observations in this.
>
> weird.rj <- c(1896,2395)
> weird.rM <- 2672
> x[weird.rj,]
> x[weird.rM,]
>
> As you can see, these observations are quite remarkable given the
> small standard deviations that we saw above. There is absolutely no
> measurement error here. These things actually happened.
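
One rough way to put a number on "remarkable" is a robust z-score: the distance from the median in units of the MAD, so that the extreme points don't inflate their own yardstick. The sketch below runs on simulated data, not the Infosys file. The names rj and rM and the three row numbers follow the post, while every value, the cutoff of 6, and the helper robust.z are assumptions made up for illustration.

```r
# Simulated stand-in for the real data: column names follow the post,
# all numbers are invented for illustration.
set.seed(1)
n   <- 3000
sim <- data.frame(rM = rnorm(n, sd = 1.5))
sim$rj <- 0.9 * sim$rM + rnorm(n, sd = 2)
sim$rj[c(1896, 2395)] <- c(-25, 30)   # planted extreme stock returns
sim$rM[2672]          <- -12          # planted extreme index return

# Hypothetical helper: distance from the median in units of the MAD
robust.z <- function(v) (v - median(v)) / mad(v)
z <- sapply(sim[c("rj", "rM")], robust.z)

# Rows where either series sits more than 6 robust SDs from its median
# (the cutoff of 6 is arbitrary)
flagged <- which(apply(abs(z) > 6, 1, any))
print(flagged)
print(round(z[flagged, , drop = FALSE], 1))
```

On the real data the same two lines (robust.z and the sapply call) could be applied to x, assuming x is a data frame with numeric columns rj and rM.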
>
> Now consider a typical application: using this to estimate a market
> model. The goal here is to estimate the coefficient of a regression of
> rj on rM.
>
> # A regression with all obs
> summary(lm(rj ~ rM, data=x))
>
> # Drop the weird rj --
> summary(lm(rj ~ rM, data=x[-weird.rj,]))
>
> # Drop the weird rM --
> summary(lm(rj ~ rM, data=x[-weird.rM,]))
>
> # Drop both kinds of weird observations --
> summary(lm(rj ~ rM, data=x[-c(weird.rM,weird.rj),]))
>
> # Robust regressions
> library(MASS)
> summary(rlm(rj ~ rM, data=x))
> summary(rlm(rj ~ rM, method="MM", data=x))
> library(robust)
> summary(lmRob(rj ~ rM, data=x))
> library(quantreg)
> summary(rq(rj ~ rM, tau=0.5, data=x))
>
> So you see, we have a variety of different estimates for the slope
> (which is termed `beta' in finance). What value would you trust the
> most?
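
To compare the candidates side by side, it may help to collect just the slopes into one vector rather than reading several summary() printouts. This is a sketch on simulated data with a known slope of 0.9 and planted outliers at the row numbers from the post; the data, the object names sim, weird and slopes, and the choice of estimators shown are assumptions, and simulated data need not reproduce how sensitive the real estimates are.

```r
library(MASS)   # rlm(); MASS ships with standard R installations

# Simulated stand-in: true slope is 0.9 by construction
set.seed(2)
n   <- 3000
sim <- data.frame(rM = rnorm(n, sd = 1.5))
sim$rj <- 0.9 * sim$rM + rnorm(n, sd = 2)
sim$rj[c(1896, 2395)] <- c(-25, 30)   # planted outliers in rj
sim$rM[2672]          <- -12          # planted outlier in rM
weird <- c(1896, 2395, 2672)

# Slope on rM from each estimator, gathered into one named vector
slopes <- c(
  ols.all     = coef(lm(rj ~ rM, data = sim))[["rM"]],
  ols.dropped = coef(lm(rj ~ rM, data = sim[-weird, ]))[["rM"]],
  huber       = coef(rlm(rj ~ rM, data = sim))[["rM"]],
  mm          = coef(rlm(rj ~ rM, data = sim, method = "MM"))[["rM"]]
)
print(round(slopes, 3))
```

Because the true slope is known here, the spread of the four numbers around 0.9 gives a feel for how much each estimator is moved by a handful of extreme rows; on the real data there is no such benchmark, which is the difficulty the question points at.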
>
> And, would winsorisation using either my code
> (https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002921.html) or
> Patrick Burns' code
> (https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002923.html) be a
> good idea here?
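
For readers who don't want to follow the links, here is a minimal sketch of what winsorisation does to the regression. The winsorise function below is a hypothetical helper written for this note, not the code from either linked post; it clamps values outside the 1% and 99% sample quantiles, and both the cutoff and the simulated data are assumptions.

```r
# Hypothetical helper, NOT the code from either linked post:
# clamp values outside the [p, 1 - p] sample quantiles to those quantiles.
winsorise <- function(v, p = 0.01) {
  lim <- quantile(v, c(p, 1 - p), names = FALSE)
  pmin(pmax(v, lim[1]), lim[2])
}

# Simulated stand-in for the real data, with planted outliers
set.seed(4)
n   <- 3000
sim <- data.frame(rM = rnorm(n, sd = 1.5))
sim$rj <- 0.9 * sim$rM + rnorm(n, sd = 2)
sim$rj[c(1896, 2395)] <- c(-25, 30)
sim$rM[2672]          <- -12

# Winsorise each series separately; no rows are dropped
w <- data.frame(rj = winsorise(sim$rj), rM = winsorise(sim$rM))

betas <- c(raw        = coef(lm(rj ~ rM, data = sim))[["rM"]],
           winsorised = coef(lm(rj ~ rM, data = w))[["rM"]])
print(round(betas, 3))
```

Note that this keeps every observation but rewrites the extreme ones, so it is still a way of altering data that had no measurement error.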
>
> I'm instinctively unhappy with any scheme based on discarding
> observations that I'm absolutely sure have no measurement error. We
> have to model the weirdness of this data generating process, not
> ignore it.
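
One simple way to "model the weirdness" rather than drop it is to keep the linear market model but let the errors be fat-tailed, for example Student t, and estimate everything by maximum likelihood. The sketch below is only one possible reading of that idea, on simulated data: the t-error assumption, the starting values, and the names negll, start and t.est are all made up for illustration.

```r
# Simulated data with fat-tailed noise; true slope is 0.9 by construction
set.seed(3)
n  <- 2000
rM <- rnorm(n, sd = 1.5)
rj <- 0.9 * rM + 1.5 * rt(n, df = 3)

# Negative log-likelihood of rj = a + b*rM + s*e, with e ~ Student t (nu d.f.).
# s and nu are optimised on the log scale so they stay positive.
negll <- function(p) {
  a <- p[1]; b <- p[2]; s <- exp(p[3]); nu <- exp(p[4])
  -sum(dt((rj - a - b * rM) / s, df = nu, log = TRUE) - log(s))
}

# Start from the OLS fit and an arbitrary guess of 5 degrees of freedom
ols   <- lm(rj ~ rM)
start <- c(unname(coef(ols)), log(sd(resid(ols))), log(5))
fit   <- optim(start, negll, method = "BFGS")

t.est <- c(alpha = fit$par[1], beta = fit$par[2],
           scale = exp(fit$par[3]), df = exp(fit$par[4]))
print(round(t.est, 3))
```

A small estimated df is the model's way of saying that extreme days are part of the process; large residuals are then downweighted automatically instead of being discarded by hand.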
>
> --
> Ajay Shah
> http://www.mayin.org/ajayshah ajayshah at mayin.org
> http://ajayshahblog.blogspot.com
> <*(:-? - wizard who doesn't know the answer.
>
> _______________________________________________
> R-SIG-Finance at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only.
> -- If you want to post, subscribe first.