[R-SIG-Finance] Getmansky et al. Smoothing Index
Peter Carl
peter at braverock.com
Sat Sep 8 04:16:30 CEST 2007
I am working on implementing a measure for evaluating the relative amount of
serial correlation caused by smoothing in a return series as described in:
Getmansky, M., A. W. Lo, and I. Makarov. “An Econometric Model of Serial
Correlation and Illiquidity in Hedge Fund Returns.” Journal of Financial
Economics 74 (2004), 529-609.
In that paper, the authors argue that there are three possible
sources of serial correlation in hedge fund returns: time-varying expected
returns, time-varying leverage and incentive fees with high-water marks.
They carefully go through all three to argue that none of these can
effectively explain the high levels of observed serial correlation in the
context of hedge funds. With that, they turn their focus towards the
combination of illiquidity and smoothed returns.
The remainder of the paper argues that serial correlation can be considered a
proxy for illiquidity and return smoothing. Even though illiquidity and
smoothing are two distinct phenomena, they argue to consider them together
since one facilitates the other. The basic argument is that
return-smoothing behavior yields a more consistent set of returns over time,
with lower volatility and, therefore, a higher Sharpe ratio, but it also
produces serial correlation as a side effect. Part of the motivation here is
that such a measure would give us a way to compare the relative smoothing
among our managers.
To measure and alleviate the effects of smoothing, they offer a rather
complicated solution. The first part involves estimating the smoothing
profile using maximum likelihood estimation (MLE) in a fashion similar to the
estimation of standard moving-average time series models. They define
a "smoothing profile" as a vector of coefficients for an MLE fit on returns
using a two-period moving-average process. The coefficients, θj, are then
normalized to sum to one and interpreted as a "weighted average of the fund's
true returns over the most recent k + 1 periods, including the current period."
In other words, the "information generated at date t may not be fully
impounded into prices until several periods later." If the first coefficient
(θ0) were 0.719, it would imply that only 71.9% of that fund's true current
monthly return was reported, with the remaining 28.1% distributed over the
next two months (recall the constraint that θ0 + θ1 + θ2 = 1). The estimates
of 0.201 and 0.080 for θ1 and θ2 imply that, on average, the current reported
return also includes 20% of last month's true return and 8% of the true
return from the month before that.
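To make the mechanics concrete, here is a toy sketch in R (the simulated
data and variable names are mine; the thetas are just the example values
above) of the model behind the quote -- observed return at t = θ0*R(t) +
θ1*R(t-1) + θ2*R(t-2) -- showing that filtering a return series through
such a profile lowers volatility and induces serial correlation:

# Toy illustration of the smoothing model -- not code from the paper
set.seed(42)
true.returns = rnorm(100, mean = 0.01, sd = 0.04)  # hypothetical true returns
thetas = c(0.719, 0.201, 0.080)                    # theta0, theta1, theta2
# observed return at t is a weighted average of true returns at t, t-1, t-2
observed = stats::filter(true.returns, filter = thetas, sides = 1)
sd(true.returns); sd(observed, na.rm = TRUE)   # the smoothed series is calmer
acf(na.omit(observed), lag.max = 3, plot = FALSE)  # ...and autocorrelated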
The measure probably does capture some essence of serial correlation from a
return series. If these weights are disproportionately centered on a small
number of lags, relatively little serial correlation will be induced.
However, if the weights are evenly distributed among many lags, this would
show higher serial correlation. The Herfindahl Index was originally
developed to measure the concentration of manufacturers or suppliers in a
marketplace, using the market shares of member companies in an industry, and
on its face it has very little to do with this measure. Getmansky, et al.
simply borrow its form -- the sum of squared, normalized coefficients, θ0^2
+ θ1^2 + ... + θk^2 -- to collapse the coefficients, or "smoothing profile",
into a single number, the "smoothing index". In the context of smoothed
returns, a lower value of the smoothing index implies more smoothing, and the
upper bound of 1 implies no smoothing.
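As a minimal sketch of that computation (the first and third profiles are
invented for contrast):

xi <- function(thetas) sum((thetas / sum(thetas))^2)  # Herfindahl-style index
xi(c(1, 0, 0))              # all weight on the current period: 1, no smoothing
xi(c(0.719, 0.201, 0.080))  # the example profile above: about 0.56
xi(c(1/3, 1/3, 1/3))        # weight spread evenly: 1/3, heavy smoothing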
There are a number of issues for implementers lurking in their methodology.
The first and probably most obvious issue comes from fitting a model to the
returns series. The methodology proposed is difficult to understand and
implement correctly. Fortunately, there are functions in most popular
statistics packages that can fit such a model. There are, however, variations
in exactly how those algorithms are implemented that may make the results
difficult to reproduce exactly. But, for the moment, assume I've found
something that comes close enough to their methodology to use.
In my tests, the smoothing index that I calculate is not particularly stable
through time. When measured over a 36- or 60-month rolling window, the values
wiggle around in the range you might expect and then suddenly spike. Those
spikes don't mean that the manager suddenly found a pool of liquidity, or was
on vacation for a few months and couldn't smooth the returns - they mean that
the model was mis-specified and the measure isn't valid through that period.
Getmansky, et al. comment on the possibility of mis-specification, noting that
the smoothing index "does not always perform well in small samples or when
the underlying distribution of true returns is not normal as hypothesized."
They offer three tests for specification:
- Did the fit converge?
- Are all of the estimated smoothing coefficients positive?
- Is it wildly different from the estimates of a linear regression approach
(which I didn't implement)?
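Here is a rough sketch of how the first two checks might look against an
arima() fit; note the leading 1 in the normalization, which reads the
implicit unit coefficient on the current innovation as θ0 -- that mapping is
my interpretation, not code from the paper:

# Sketch of the first two specification checks (my interpretation)
fit = arima(ra, order=c(0,0,2), method="ML", transform.pars=TRUE,
    include.mean=FALSE)
converged = (fit$code == 0)      # optim convergence code; 0 means converged
thetas = c(1, fit$coef) / (1 + sum(fit$coef))  # normalize to sum to one
all.positive = all(thetas > 0)   # are all smoothing coefficients positive?
if (!converged || !all.positive)
    warning("possible mis-specification in this window")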
The second issue is that the fitted coefficients are constrained to sum to
one but not to lie individually in [0,1], so the resulting 'smoothing index'
is not limited to that range either. As a result, all we can say is that
lower values are "less liquid" and higher values are "more liquid" or
mis-specified.
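A quick illustration of the problem, using an invented profile with a
negative weight:

thetas = c(1.2, -0.1, -0.1)
sum(thetas)    # 1 -- the profile still sums to one
sum(thetas^2)  # 1.46 -- but the "index" escapes its supposed (0,1] range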
This group also wrote a second paper that updates the observations of the
first: "Systemic Risk and Hedge Funds," by Nicholas Chan, Mila Getmansky,
Shane M. Haas, and Andrew W. Lo, published as an NBER Working Paper (No.
11200) in March 2005. I would note that their reported experience with this
measure seems much more consistent than mine, which suggests that the fitting
methodology I'm using is incorrect or more prone to mis-specification.
My current draft of the code is attached below. I'm using the arima()
function to fit an MA(2) model as follows:
MA2 = arima(ra, order=c(0,0,2), method="ML", transform.pars=TRUE,
    include.mean=FALSE)
I'm still scratching my head about whether I'm doing this correctly. I've
noticed that the fits are very unstable through time (which makes sense,
given the normality assumption buried in here), and that instability would
limit the measure's utility. Extending the model to order=c(0,0,3) helps
some, but not a lot.
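For reference, a rough sketch of the rolling calculation I'm describing;
'R.monthly' is a stand-in name for a zoo series of monthly returns, and
SmoothingIndex() is the function attached below:

# Rolling smoothing index over a 36-month window (untested sketch)
library(zoo)
rolling.si = rollapply(R.monthly, width = 36, align = "right",
    FUN = function(x) tryCatch(SmoothingIndex(x), error = function(e) NA))
plot(rolling.si, main = "36-month rolling smoothing index")

The tryCatch() wrapper is there because the MLE fit can fail outright in
some windows.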
Three questions:
- Am I using the arima fit function correctly?
- Has someone else implemented this with more rigor?
- Has anyone else found this to be a useful measure?
Thanks in advance,
pcc
`SmoothingIndex` <-
function (ra, ...)
{ # @author Peter Carl

    # Description:
    # Calculates the Getmansky, et al. smoothing index from a return series.
    # ra: log return vector
    # Requires checkData() from the PerformanceAnalytics package.

    # Function:
    ra = checkData(ra, method="vector", na.omit=TRUE)

    MA2 = NULL
    thetas = 0
    SmoothingIndex = 0

    # First, create a maximum likelihood estimation fit for an MA process.
    # include.mean: Getmansky, et al. JFE 2004 p 555 "By applying the above
    # procedure to observed de-meaned returns...", so set parameter to FALSE
    # transform.pars: ibid, "we impose the additional restriction that the
    # estimated MA(k) process be invertible." so set the parameter to TRUE
    MA2 = arima(ra, order=c(0,0,2), method="ML", transform.pars=TRUE,
        include.mean=FALSE)

    # arima() fixes the coefficient on the current innovation at 1, so
    # include it as theta_0 and normalize the profile to sum to one
    thetas = as.numeric(c(1, MA2$coef) / (1 + sum(MA2$coef)))

    # The smoothing index is the Herfindahl-style sum of squared weights
    SmoothingIndex = sum(thetas^2)

    return(SmoothingIndex)
}
--
Peter Carl