[RsR] Robust location estimator - an interesting application in finance
Ajay Shah
ajayshah using mayin.org
Thu Sep 24 21:57:21 CEST 2009
One interesting application of a robust location estimator is in
computing reference rates on OTC markets. Traders on an OTC market
know the ruling price but others do not. So an information agency asks
a bunch of dealers what the price is.
Dealers typically have positions on the market and have an incentive
to lie. Hence, it's useful to have a robust location estimator. The
British Bankers' Association has used a `fixed trimmed mean' where the
four most extreme observations are thrown away and the average of the
remainder is used as the `reference rate' of the market. This is the
method underlying LIBOR.
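As a minimal sketch of the fixed trimmed mean, assuming the trimming is symmetric (k quotes dropped from each end, so k = 2 discards four quotes in total). The function name fixed_trimmed_mean() and the quotes are made up for illustration; this is not the BBA's actual procedure.

```r
# Hypothetical helper: drop the k lowest and k highest quotes,
# then average what is left. Symmetric trimming is an assumption.
fixed_trimmed_mean <- function(x, k = 2) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  if (n <= 2 * k) stop("need more than 2*k quotes")
  mean(x[(k + 1):(n - k)])
}

# Made-up dealer quotes, one of which is wildly off
quotes <- c(5.10, 5.12, 5.11, 5.13, 5.12, 5.11, 9.00, 5.10)
fixed_trimmed_mean(quotes, k = 2)   # unaffected by the 9.00 quote
mean(quotes)                        # pulled upwards by the 9.00 quote
```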
A while ago, Donald Lien and John Cita suggested that it would make
more sense to experiment with a few different levels of trimming, and
pick the one where the standard deviation of the trimmed mean
(obtained through the bootstrap) is the lowest. They termed this the
`adaptive trimmed mean' or the ATM.
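The idea can be sketched in a few lines of base R. This is only an illustration of the scheme as described above, not the referencerate() code: the function name adaptive_trimmed_mean(), the candidate trimming levels, the number of bootstrap replications and the quotes are all assumptions.

```r
# Hypothetical sketch of an adaptive trimmed mean (ATM).
# trims: candidate fractions trimmed from EACH end (assumed grid).
# B:     number of bootstrap resamples (assumed value).
adaptive_trimmed_mean <- function(x, trims = c(0, 0.1, 0.2, 0.3, 0.4), B = 1000) {
  x <- x[!is.na(x)]
  n <- length(x)
  # Bootstrap standard deviation of the trimmed mean at each trimming level
  boot_sd <- sapply(trims, function(tr) {
    sd(replicate(B, mean(x[sample.int(n, replace = TRUE)], trim = tr)))
  })
  best <- trims[which.min(boot_sd)]
  list(atm = mean(x, trim = best), trim = best, boot_sd = boot_sd)
}

set.seed(1)
quotes <- c(5.10, 5.12, 5.11, 5.13, 5.12, 5.11, 9.00, 5.10)  # made-up quotes
res <- adaptive_trimmed_mean(quotes)
res$trim   # trimming level with the lowest bootstrap standard deviation
res$atm    # trimmed mean at that level
```

Note that mean(x, trim = tr) in base R removes a fraction tr from each end, so with few dealers neighbouring levels can trim the same number of quotes.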
One advantage of the above two ideas is that they are simple to
explain to regulators and traders.
My question is: How far can contemporary knowledge in robust
statistics improve upon this scheme? If one uses robustbase::lmrob(x ~
1) and gets a location estimator, would it be much better?
Here is some data for experimentation:
load(url("http://www.mayin.org/ajayshah/tmp/all.rda"))
This gives you an object "all" which has 44 columns of data. Each of
these columns is one set of values obtained from a bunch of dealers.
I did:
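library(robustbase)   # provides lmrob()
library(refrate)      # provides referencerate()
# fileslist is not defined in this snippet; ncol(all) is used instead,
# assuming one file per column of "all"
results <- matrix(NA, nrow=ncol(all), ncol=4)
colnames(results) <- c("lmrob","median","atm","mean")
for (i in seq_len(ncol(all))) {
  tmp <- na.omit(all[,i])
  a <- try(lmrob(tmp ~ 1)$coefficients)
  result <- NA                      # stays NA if lmrob() fails
  if (!inherits(a, "try-error")) {result <- a}
  results[i,] <- c(result,
                   median(tmp),
                   referencerate(tmp)["atm"],
                   mean(tmp))
}
cor(results, use="pairwise.complete.obs")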
where the function referencerate() implements the Lien/Cita scheme
described above. (I can email you this code if there is interest). I
have two findings:
(a) lmrob() often breaks. It shouldn't. I have sent in one bug report.
(b) The correlation matrix shows very high correlations:
          lmrob     median    atm       mean
lmrob     1.0000000
median    0.9998192 1.0000000
atm       0.9999741 0.9998113 1.0000000
mean      0.9993983 0.9994536 0.9996133 1.0000000
The correlations with the ATM are: lmrob > median > mean. So lmrob()
and the ATM seem to agree a lot.
Looking deeper, an important requirement in this (financial) application
is that the location estimator should not let a small cartel of dealers
produce a large distortion in the price, so that their gains from
forming a cartel are low. Would lmrob() be much different from
the ATM in this?
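One rough way to explore this question is to take a set of honest quotes, replace m of them with an extreme value, and see how far each estimator moves. The sketch below uses base R only; the quotes, the cartel's bid and the 25% trimming level are all made up, and lmrob() or the ATM could be added to the list of estimators in the same way.

```r
# Hypothetical experiment: how far can a cartel of m dealers move each estimator?
set.seed(42)
honest <- rnorm(16, mean = 5.0, sd = 0.02)   # made-up quotes from 16 dealers

estimators <- list(
  mean    = function(x) mean(x),
  median  = function(x) median(x),
  trimmed = function(x) mean(x, trim = 0.25) # a quarter off each end (assumed level)
)

# Shift in each estimate when the first m quotes are replaced by `bid`
cartel_shift <- function(m, bid = 5.5) {
  x <- honest
  if (m > 0) x[seq_len(m)] <- bid
  sapply(estimators, function(f) f(x) - f(honest))
}

# One row per cartel size, one column per estimator
shifts <- t(sapply(0:6, cartel_shift))
rownames(shifts) <- paste0("m=", 0:6)
round(shifts, 4)
```

The smaller the shift for a given m, the less a cartel of that size gains from misreporting.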
--
Ajay Shah http://www.mayin.org/ajayshah
ajayshah using mayin.org http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.