[R-SIG-Finance] performance analytics functions missing from other libraries (implemented)
Brian G. Peterson
brian at braverock.com
Sat Dec 9 16:48:43 CET 2006
We previously released our functions for dealing with higher moments of
the return distribution, including coskewness, cokurtosis, and Modified
Cornish-Fisher VaR. That release post may be found here:
https://stat.ethz.ch/pipermail/r-sig-finance/2006q3/000977.html
We would now like to release a collection of assorted performance
analytics functions, filling in gaps that we found when working with the
excellent and extremely useful RMetrics, 'portfolio', and other
finance-focused R packages.
The following functions are useful for calculating performance metrics for
trading strategies, fund performance, or other post-trade (even
post-balance- sheet) performance assessment. In general, we're working
with monthly scaled returns data (although these functions should be
useful for daily or weekly scales as well), and in returns rather than
prices.
There are books and books full of these calculations. Some are even
useful. Notation terminology in the papers and texts is often sloppy, so
the user should be aware of how the specific calculations are
implemented.
Functions:
checkDataMatrix
checkDataVector
annualizedReturn
annualizedStdDev
sharpeRatio
annualizedSharpeRatio
modSharpe
semiDeviation
downsideDeviation
sortinoRatio
maxDrawdown
beta
alpha
omega
trackingError
activePremium
informationRatio
rollingFunction
rollingStat
cumulativeReturns
colCumprods
colCumMax
rollingCorrelation
statsTable
@todo Herfindahl Index
@todo Geltner method risk-adjusted returns
@todo longest run up/down
@todo improve the way statsTable() uses apply() to make it more generic
Default input values assume regular monthly returns. Scale arguements can
be used to specify the number of observations during a year (e.g., 12 =
monthly returns).
The risk free rate (rf) MUST be in the same periodicity as the data going
in.
checkDataMatrix
checkDataVector
Description:
These functions were created to make the different kinds of data
at least _seem_ more fungible. It allows the user to pass in a data
object without being concerned that the function requires a matrix,
data.frame, or timeSeries object.
By using this, the function "knows"
what data format it has to work with.
annualizedReturn
Description:
An average annualized return is convenient for comparing returns.
annualizedStdDev
To annualize standard deviation, we multiply by the square root
of the number of observations per year.
sharpeRatio
Description:
The Sharpe ratio is simply the return per unit of risk
(represented by variability).
This function uses returns rather than prices as is
done in other R libraries.
annualizedSharpeRatio
Description:
Using an annualized Sharpe Ratio is useful for comparison.
The annualized Sharpe ratio is computed by dividing the
annualized mean monthly excess return by the
annualized monthly standard deviation of excess return.
modSharpe
Description:
The Sharpe ratio is simply the return per unit of risk
(represented by variance).
The higher the Sharpe ratio, the better the combined
performance of "risk" and return.
The Sharpe Ratio is a risk-adjusted measure of return that uses
standard deviation to represent risk.
A number of papers now recommend using a "modified Sharpe" ratio
using a Modified Cornish-Fisher VaR as the measure of Risk.
semiDeviation
Description:
This function should just be a wrapper of downsideDeviation with
MAR = mean(x)
see below
downsideDeviation
Description:
Downside deviation, similar to semi deviation, eliminates
positive returns when calculating risk.
To calculate it, we take the returns that are less
than the target (or Minimum Acceptable Returns (MAR))
returns and take the differences of those to the target.
We sum the squares and divide by the total number
of returns to get a below-target semi-variance.
sortinoRatio
Description:
Sortino proposed to better account for skill and excess performance
by using only downside semivariance as the measure of risk.
modified by permission from function by Sankalp Upadhyay
maxDrawdown = function (R)
Description:
To find the maximum drawdown in a return series, we need to first
calculate the cumulative returns and the maximum cumulative return
to that point.
Any time the cumulative returns dips below the maximum
cumulative returns, it's a drawdown. Drawdowns are measured as a
percentage of that maximum cumulative return, in effect,
measured from peak equity.
alpha
This is a wrapper for calculating a CAPM alpha.
beta
Description:
This is a wrapper for calculating a CAPM beta.
omega
Description
Keating and Shadwick (2002) proposed Omega
(referred to as Gamma in their original paper)
as a way to capture all of the higher moments of the
returns distribution. Mathematically, Omega is:
integral[L to b](1 - F(r))dr / integral[a to L](F(r))dr
where the cumulative distribution F is defined
on the interval (a,b).
L is the loss threshold that can be specified as zero, return from a
benchmark index, or
an absolute rate of return - any specified level.
When comparing alternatives using Omega, L should be common.
Input data can be transformed prior to calculation,
which may be useful for introducing risk aversion.
trackingError
Description
trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 /
(length(assetReturns.vec) - 1)) * sqrt(scale)
activePremium
Description
activePremium = (annualizedReturn(assetReturns.vec, scale = scale) -
annualizedReturn(indexReturns.vec, scale = scale) )
informationRatio
Description
informationRatio = activePremium/trackingError
rollingStat
function for calculating any function over a rolling period,
say the last three months. Pass in the function name as FUN,
e.g., FUN = "mean". Useful for passing in several periods and
functions
for comparing different series.
cumulativeReturn
Simple function for calculating cumulative return over a period
of time, say a calendar year.
Can produce simple or geometric return.
colCumprods
apply cumulative product to the return strem, wrapped using apply()
colCummax
to get to drawdown calculations, we need colCummax
statsTable
Description:
This function creates a table of statistics from vectors
of functions and labels passed in.
The resulting table is formatted such that metrics
are calculated separately for each column of returns
in the data object.
rollingCorrelation
Description:
This is a wrapper for providing n-period trailing correlations
for the data provided.
Inspired by rollFun() written by Diethelm Wurtz.
Specifically, we're wrapping the basic function cor(), which has:
function (x, y = NULL, use = "all.obs", method = c("pearson",
"kendall", "spearman"))
rollingFunction
Description:
This is a wrapper for providing n-period trailing calculations
for the data and functions provided.
rollingRegression
Description:
This code was posted to the R-help mailing list in response to a
question
posted by Ajay Shah. For the full discussion, see:
http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg19544.html
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License (which
includes attribution for the original creator)
The GNU General Public License is available here:
http://www.gnu.org/licenses/gpl.html
The current and all future versions of the attached file
may be found here:
http://braverock.com/brian/R/performance-analytics.R
We hope you find it useful. Any comments, criticisms, or suggestions
for improvement are gladly accepted.
Regards,
- Brian
-------------- next part --------------
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either
# version 2 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Library General Public License for more details.
#
# You should have received a copy of the GNU General
# Public License along with this library; if not,
# go here: http://www.gnu.org/licenses/gpl.html
# or write to the
# Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
# MA 02111-1307 USA
# Copyright 2006 Peter Carl, Brian G. Peterson
# $Id: performance-analytics.R,v 1.36 2006/12/09 15:26:08 brian Exp $
################################################################################
# FUNCTIONS:
#
# checkDataMatrix
# checkDataVector
# annualizedReturn
# annualizedStdDev
# sharpeRatio
# annualizedSharpeRatio
# modSharpe
# semiDeviation
# downsideDeviation
# sortinoRatio
# maxDrawdown
# beta
# alpha
# omega
# trackingError
# activePremium
# informationRatio
# rollingFunction
# rollingStat
# cumulativeReturns
# colCumprods
# colCumMax
# rollingCorrelation
# statsTable
#
# @todo Herfindahl Index
# @todo Geltner method risk-adjusted returns
# @todo longest run up/down
# @todo improve the way statsTable() fn uses apply() to make it more generic
#
################################################################################
################################################################################
# The following functions are useful for calculating performance metrics for
# trading strategies, fund performance, or other post-trade (even post-balance-
# sheet) performance assessment. In general, we're working with monthly scaled
# data (although these functions should be useful for daily or weekly scales as
# well), and in returns rather than prices.
# There are books and books full of these calculations. Some are even useful.
# Notation is sloppy, however, so the user should be aware of how the specific
# calculations are implemented.
# Default input values assume regular monthly returns. Scale arguements can be
# used to specify the number of observations during a year (e.g., 12 = monthly
# returns).
# The risk free rate (rf) MUST be in the same periodicity as the data going in.
# ------------------------------------------------------------------------------
# Load any required libraries.
if(!require("fCalendar", quietly=TRUE)) {
stop("package", sQuote("fOptions"), "is needed. Stopping")
}
if(!require("fOptions", quietly=TRUE)) {
stop("package", sQuote("fOptions"), "is needed. Stopping")
}
if(!require("tseries", quietly=TRUE)) {
stop("package", sQuote("tseries"), "is needed. Stopping")
}
if(!require("Hmisc", quietly=TRUE)) {
stop("package", sQuote("Hmisc"), "is needed. Stopping")
}
# also requires Brian's extra_moments.R file, should we always include it?
# ------------------------------------------------------------------------------
checkDataMatrix = function (x, na.rm = FALSE, ...)
{ # @author Peter Carl
# Description:
# This function was created to make the different kinds of data classes
# at least _seem_ more fungible. It allows the user to pass in a data
# object without being concerned that the function requires a matrix,
# data.frame, or timeSeries object. By using this, the function "knows"
# what data format it has to work with.
# Inputs:
# x: a matrix, data.frame or timeSeries object of data to be checked and
# transformed if necessary.
# na.rm: removes rows of data with NA's in any column. This is useful
# for calculating correlations, etc., that require complete data. Note
# that this passes back only the first contiguous set of data, to retain
# the regular timeseries nature of the data.
# Outputs:
# Produces a matrix object with named rows and columns.
# FUNCTION:
if (class(x) == "numeric") {
# Note that column and row labels will be blank.
y = as.matrix(x)
}
if (class(x) == "matrix") {
y = x
}
if (class(x) == "data.frame") {
y = as.matrix(x)
}
if (class(x) == "timeSeries") {
# timeSeries objects keep the data in a matrix and append a set of
# meta-data. We just need the information stored in the 'Data' slot.
#y = x at Data
y = seriesData(x)
}
if (na.rm) {
#y = na.contiguous(y)
y = na.omit(y)
}
return(y)
}
# ------------------------------------------------------------------------------
checkDataVector = function (R, na.rm = TRUE, quiet = TRUE, ...)
{ # @author Peter Carl
# Description:
# This function was created to make the different kinds of data classes
# at least _seem_ more fungible. It allows the user to pass in a data
# object without being concerned that the function requires a matrix,
# data.frame, or timeSeries object. By using this, the function "knows"
# what data format it has to work with.
# Inputs:
# x: the data provided.
# na.rm: default behavior is to remove NA's
# quiet: if true, it will throw warnings when errors are noticed
# Outputs:
# Produces a numeric vector or an NA and any associated warnings.
# FUNCTION:
x = as.vector(R)
# First, we'll check to see if we have more than one column.
if (NCOL(x) > 1) {
if(!quiet)
warning("The data provided is not a vector or univariate time series. Used only the first column")
x = x[,1]
}
# Second, we'll hunt for NA's and remove them if required
if (any(is.na(x))) {
if(na.rm) {
# Try to remove any NA's
x = x[!is.na(x)]
if(!quiet){
warning("The following slots have NAs.")
warning(paste(x at na.removed," "))
}
}
else {
if(!quiet)
warning("Data contains NA's.")
}
}
# Third, we'll check to see if we have any character data
if (!is.numeric(x)){
if(!quiet)
warning("The data is not numeric.")
# Try to coerce the data
x = as.numeric(x)
}
# Fourth, we'll see if we have more than one data point.
if (NROW(x) <= 1) {
if(!quiet)
warning("Only one row provided. Returning NA.")
return(NA)
}
# @todo: Add check for stopifnot(is.atomic(y))???
return(as.vector(x))
}
# ------------------------------------------------------------------------------
annualizedReturn = function (R, geometric = TRUE, scale = 12)
{ # @author Peter Carl
# Description:
# An average annualized return is convenient for comparing returns.
# @todo: This function could be used for calculating geometric or simple
# returns
# n = periods under analysis
# f = number of periods in a year (daily f = 252, monthly f = 12,
# quarterly f = 4)
# arithmetic average: ra = (f/n) * sum(ri)
# geometric average: rg = product(1 + ri)^(f/n) - 1
# @todo: don't calculate for returns less than 1 year
# FUNCTION:
x = checkDataVector(R)
y = x[!is.na(x)]
n = length(y)
# currently only uses geometric return to annualize returns
#return(prod(1 + x)^(scale/length(x)) - 1)
return(prod(1 + x, na.rm = TRUE)^(scale/n) - 1)
}
# ------------------------------------------------------------------------------
annualizedStdDev = function (R, rf = 0, scale = 12)
{ # @author Peter Carl
# To annualize standard deviation, we multiply by the square root of the
# number of observations per year.
x = checkDataVector(R)
# return(sqrt(scale)*sqrt(var(x - rf)))
return(sqrt(scale)*sqrt(var(x, na.rm = TRUE)))
}
# ------------------------------------------------------------------------------
sharpeRatio = function (R, rf = 0)
{ # @author Peter Carl
# DESCRIPTION:
# The Sharpe ratio is simply the return per unit of risk (represented by
# variability). The higher the Sharpe ratio, the better the combined
# performance of "risk" and return.
# The Sharpe Ratio is a risk-adjusted measure of return that uses
# standard deviation to represent risk.
# Inputs:
# R: in this case, the function anticipates having a return stream as input,
# rather than prices.
# rf: the risk free rate MUST be in the same periodicity as the data going in.
# Outputs:
# This function returns a Sharpe ratio for the same periodicity of the
# data being input (e.g., monthly data -> monthly SR)
# FUNCTION:
x = checkDataVector(R)
return((mean(x, na.rm = TRUE) - rf)/sd(x, na.rm = TRUE))
#return((mean(x) - rf)/sd(x))
}
# ------------------------------------------------------------------------------
annualizedSharpeRatio = function (R, rf = 0, scale = 12)
{ # @author Peter Carl
# DESCRIPTION:
# Using an annualized Sharpe Ratio is useful for comparison. The annualized
# Sharpe ratio is computed by dividing the annualized mean monthly excess
# return by the annualized monthly standard deviation of excess return.
# @todo: monthly standard deviation of ***excess*** return
# Inputs:
# R: in this case, the function anticipates having a return stream as input,
# rather than prices.
# rf: the risk free rate MUST be in the same periodicity as the data going in.
# FUNCTION:
x = checkDataVector(R)
return((annualizedReturn(x, scale = scale) - (rf * scale))/annualizedStdDev(x, rf = rf, scale = scale))
}
# ------------------------------------------------------------------------------
modSharpe = function (R, rf = 0, p=0.95, scale=1)
{ # @author Brian G. Peterson
# DESCRIPTION:
# The Sharpe ratio is simply the return per unit of risk (represented by
# variability). The higher the Sharpe ratio, the better the combined
# performance of "risk" and return.
# The Sharpe Ratio is a risk-adjusted measure of return that uses
# standard deviation to represent risk.
# A number of papers now recommend using a "modified Sharpe" ratio
# using a Modified Cornish-Fisher VaR as the measure of Risk.
# Inputs:
# R: in this case, the function anticipates having a return stream as input,
# rather than prices.
#
# rf: the risk free rate MUST be in the same periodicity as the data going in.
#
# p: probability at which to calculate the modified VaR (defaults to 95%)
# Outputs:
# This function returns a modified Sharpe ratio for the same periodicity of the
# data being input (e.g., monthly data -> monthly SR)
# FUNCTION:
x = checkDataVector(R)
return( (mean(x, na.rm = TRUE) - rf)*scale/(modifiedVaR(x, p)*sqrt(scale)) )
#return((mean(x) - rf)/sd(x))
}
# ------------------------------------------------------------------------------
semiDeviation = function (R)
{ # @author Peter Carl
# DESCRIPTION:
# This function should just be a wrapper of downsideDeviation with
# MAR = mean(x)
# see below
# FUNCTION:
x = checkDataVector(R)
y = subset(x,x < mean(x))
return(sqrt(sum((y - mean(x))^2)/(length(y)-1)))
}
# ------------------------------------------------------------------------------
downsideDeviation = function (R, MAR = 0)
{ # @author Peter Carl
# DESCRIPTION:
# Downside deviation, similar to semi deviation, eliminates positive returns
# when calculating risk. To calculate it, we take the returns that are less
# than the target (or Minimum Acceptable Returns (MAR)) returns and take the
# differences of those to the target. We sum the squares and divide by the
# total number of returns to get a below-target semi-variance.
# This is also useful for calculating semi-deviation by setting
# MAR = mean(x)
# FUNCTION:
x = checkDataVector(R)
y = subset(x,x < MAR)
# return(sqrt(sum((y - MAR)^2)/(length(y)-1)))
return(sqrt(sum((y - MAR)^2)/(length(y))))
# @todo verify which calculation is correct
}
# ------------------------------------------------------------------------------
sortinoRatio = function (R, MAR = 0)
{ # @author Brian G. Peterson
# modified from function by Sankalp Upadhyay <sankalp.upadhyay [at] gmail [dot] com> with permission
# Description:
# Sortino proposed to better account for skill and excess performance
# by using only downside semivariance as the measure of risk.
# R return vector
# MAR minimum acceptable return
# Function:
x = checkDataVector(R)
result = mean (R - MAR) / downsideDeviation(R, MAR)
#Return:
result
}
# ------------------------------------------------------------------------------
maxDrawdown = function (R)
{ # @author Peter Carl
# DESCRIPTION:
# To find the maximum drawdown in a return series, we need to first
# calculate the cumulative returns and the maximum cumulative return to
# that point. Any time the cumulative returns dips below the maximum
# cumulative returns, it's a drawdown. Drawdowns are measured as a
# percentage of that maximum cumulative return, in effect, measured from
# peak equity.
# FUNCTION:
x = checkDataVector(R)
cumulativeReturn = cumprod(1+x)
maxCumulativeReturn = cummax(cumulativeReturn)
drawdown = cumulativeReturn/maxCumulativeReturn - 1
# if you want to see the drawdown series, plot(drawdown,type="l")
return(min(drawdown))
}
# ------------------------------------------------------------------------------
alpha = function (x, y, rf = 0)
{ # @author Peter Carl
# DESCRIPTION:
# This is a wrapper for calculating a CAPM alpha.
# Inputs:
# x: vector of returns for the asset being tested
# y: vector of returns for the benchmark the asset is being gauged against
# x and y are assumed to be matching periods
# rf: risk free rate in the same periodicity as the returns. May be a vector
# of the same length as x and y.
# Output:
#
# FUNCTION:
assetReturns.vec = checkDataVector(x)
indexReturns.vec = checkDataVector(y)
if (length(assetReturns.vec) != length(indexReturns.vec))
stop("Returns to be assessed have unequal time periods. Are there NA's in the data?")
# Make these excess returns
assetExcessRet.vec = assetReturns.vec - rf
indexExcessRet.vec = indexReturns.vec - rf
# regress
model.lm = lm(assetExcessRet.vec ~ indexExcessRet.vec)
alpha = coef(model.lm)[[1]]
# beta = coef(model.lm)[[2]]
alpha
}
# ------------------------------------------------------------------------------
beta = function (x, y, rf = 0, digits = 4)
{ # @author Peter Carl
# DESCRIPTION:
# This is a wrapper for calculating a CAPM beta.
# Inputs:
# x: vector of returns for the asset being tested
# y: vector of returns for the benchmark the asset is being gauged against
# x and y are assumed to be matching periods and NO TEST IS MADE TO CHECK
# rf: risk free rate in the same periodicity as the returns. May be a vector
# of the same length as x and y.
# Output:
#
# FUNCTION:
assetReturns.vec = checkDataVector(x)
indexReturns.vec = checkDataVector(y)
if (length(assetReturns.vec) != length(indexReturns.vec))
stop("Returns to be assessed have unequal time periods. Are there NA's in the data?")
# Make these excess returns
assetExcessRet.vec = assetReturns.vec - rf
indexExcessRet.vec = indexReturns.vec - rf
# regress
model.lm = lm(assetExcessRet.vec ~ indexExcessRet.vec)
# alpha = coef(model.lm)[[1]]
beta = coef(model.lm)[[2]]
beta
}
# ------------------------------------------------------------------------------
omega = function (R, L = 0, method = c("simple", "interp", "binomial", "blackscholes"), output = c("point", "full"), rf = 0)
{ # @author Peter Carl
# DESCRIPTION
# Keating and Shadwick (2002) proposed Omega (referred to as Gamma in their
# original paper) as a way to capture all of the higher moments of the
# returns distribution. Mathematically, Omega is:
# integral[L to b](1 - F(r))dr / integral[a to L](F(r))dr
# where the cumulative distribution F is defined on the interval (a,b).
# L is the loss threshold that can be specified as zero, return from a
# benchmark index, or an absolute rate of return - any specified level.
# When comparing alternatives using Omega, L should be common. Input data
# can be transformed prior to calculation, which may be useful for
# introducing risk aversion.
# This function returns a vector of Omega, useful for plotting. The
# steeper, the less risky. Above it's mean, a steeply sloped Omega also
# implies a very limited potential for further gain.
# Omega has a value of 1 at the mean of the distribution.
# Omega is sub-additive. The ratio is dimensionless.
# Kazemi, Schneeweis, and Gupta (2003), in "Omega as a Performance Measure"
# shows that Omega can be written as:
# Omega(L) = C(L)/P(L)
# where C(L) is essentially the price of a European call option written
# on the investment and P(L) is essentially the price of a European put
# option written on the investment. The maturity for both options is
# one period (e.g., one month) and L is the strike price of both options.
# The numerator and the denominator can be expressed as:
# exp(-rf) * E[max(x - L, 0)]
# exp(-rf) * E[max(L - x, 0)]
# with exp(-rf) calculating the present values of the two, where rf is
# the per-period riskless rate.
# The first three methods implemented here focus on that observation.
# The first method takes the simplification described above. The second
# uses the Black-Scholes option pricing as implemented in fOptions. The
# third uses the binomial pricing model from fOptions. The second and
# third methods are not implemented here.
# The fourth method, "interp", creates a linear interpolation of the cdf of
# returns, calculates Omega as a vector, and finally interpolates a function
# for Omega as a function of L.
# FUNCTION
x = checkDataVector(R)
if(method == "simple") {
numerator = exp(-rf) * mean(max(x - L, 0))
denominator = exp(-rf) * mean(max(L - x, 0))
omega = numerator/denominator
}
if(method == "interp") {
#
a = min(x)
b = max(x)
xcdf = ecdf(x)
f <- approxfun(xcdf$x,xcdf$y,method="linear",ties="ordered")
if(output == "full") {
omega = cumsum(1-f(xcdf$x))/cumsum(f(xcdf$x))
}
else {
# returns only the point value for L
# to get a point measure for omega, have to interpolate
omegafull = cumsum(1-f(xcdf$x))/cumsum(f(xcdf$x)) # ????????
g <- approxfun(xcdf$x,omegafull,method="linear",ties="ordered")
omega = g(L)
}
}
result = omega
result
}
# ------------------------------------------------------------------------------
trackingError = function (R, Ri, scale = 12)
{ # @author Peter Carl
# DESCRIPTION
# trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 /
# (length(assetReturns.vec) - 1)) * sqrt(scale)
# Inputs:
# Outputs:
# FUNCTION
assetReturns.vec = checkDataVector(R)
indexReturns.vec = checkDataVector(Ri)
trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 / (length(assetReturns.vec) - 1)) * sqrt(scale)
trackingError
}
# ------------------------------------------------------------------------------
activePremium = function (R, Ri, scale = 12)
{ # @author Peter Carl
# DESCRIPTION
# activePremium = (annualizedReturn(assetReturns.vec, scale = scale) -
# annualizedReturn(indexReturns.vec, scale = scale) )
# Inputs:
# Outputs:
# FUNCTION
assetReturns.vec = checkDataVector(R)
indexReturns.vec = checkDataVector(Ri)
activePremium = (annualizedReturn(assetReturns.vec, scale = scale) - annualizedReturn(indexReturns.vec, scale = scale))
activePremium
}
# ------------------------------------------------------------------------------
informationRatio = function (R, Ri, scale = 12)
{ # @author Peter Carl
# DESCRIPTION
# informationRatio = activePremium/trackingError
# Inputs:
# Outputs:
# FUNCTION
assetReturns.vec = checkDataVector(R)
indexReturns.vec = checkDataVector(Ri)
activePremium = activePremium(assetReturns.vec,indexReturns.vec, scale = scale)
trackingError = trackingError(assetReturns.vec,indexReturns.vec, scale = scale)
informationRatio = activePremium/trackingError
informationRatio
}
# ------------------------------------------------------------------------------
rollingStat = function (R, period = 3, trim = TRUE, FUN, ...)
{ # @author Peter Carl
# DESCRIPTION
# Function to apply any time series function over a rolling period,
# say the last three months. Pass in the function name as FUN,
# e.g., FUN = "mean". Useful for passing in several periods and functions
# for comparing different series.
# Inputs:
# period the number of periods over which to calculate the statistic.
# Setting the period to 0 calculates the statistic "from inception" or
# using all of the data passed in.
# FUN function to apply over e.g., FUN = "mean" or FUN = "cumulativeReturns"
# Output:
# FUNCTION
x = checkDataVector(R)
if (length(x) < period){
# @todo: instead of stopping, should this warn and return NAs for length(x)?
stop("Data set is not long enough. Reduce the evaluation period or provide a longer time period of data. Also, check the data for NA's.")
}
if (period == 0) {
y = as.matrix(x[1:length(x)])
}
else
y = as.matrix(x[(length(x)- period + 1):length(x)])
answer=apply(y, MARGIN = 2, FUN = FUN, ...)
return(answer)
# Example includes how to pass in arguements to the function - must be in order:
# > rollingStat(gg.ts at Data[,1],period=12,FUN="annualizedSharpeRatio",rf=.03/12)
# [1] 1.476426
# > rollingStat(gg.ts at Data[,1],period=3,FUN="annualizedSharpeRatio",rf=.03/12)
# [1] 6.804358
# > rollingStat(gg.ts at Data[,1],period=3,FUN="annualizedSharpeRatio",rf=0)
# [1] 8.253612
}
# ------------------------------------------------------------------------------
cumulativeReturn = function (R, geometric = TRUE)
{ # @author Peter Carl
# This is a useful function for calculating cumulative return over a period
# of time, say a calendar year. Can produce simple or geometric return.
x = checkDataVector(R)
if (!geometric)
return(sum(x, na.rm = TRUE))
else {
return(prod(1+x, na.rm = TRUE)-1)
}
}
# ------------------------------------------------------------------------------
colCumprods = function (R, na.rm = TRUE, ...)
{ # @author Peter Carl
# if we do this, then cumulating a set of monthly returns is easy
x = checkDataMatrix(R)
if (na.rm) {
result = apply(na.omit(x), MARGIN = 2, FUN = cumprod,
...)
}
else {
result = apply(x, MARGIN = 2, FUN = cumprod, ...)
}
result
}
# ------------------------------------------------------------------------------
colCummax = function (R, na.rm = TRUE, ...)
{ # @author Peter Carl
# to get to drawdown calculations, we need colCummax
x = checkDataMatrix(R)
if (na.rm) {
result = apply(na.omit(x), MARGIN = 2, FUN = cummax,
...)
}
else {
result = apply(x, MARGIN = 2, FUN = cummax, ...)
}
result
}
# ------------------------------------------------------------------------------
statsTable = function (R, metrics=c("mean","sd"), metricsNames=c("Average Return","Standard Deviation"), ...)
{ # @author Peter Carl
# DESCRIPTION:
# This function creates a table of statistics from vectors of functions and
# labels passed in. The resulting table is formatted such that metrics
# are calculated separately for each column of returns in the data object.
# Inputs:
# Assumes an input of period returns. Scale arguements can be used to
# specify the number of observations during a year (e.g., 12 = monthly
# returns).
# Outputs:
# A table with calculated metrics for each column
# The idea here is to be able to pass in sets of metrics and values, like:
# metrics = c(downsideDeviation(x,MAR=mean(x)), sd(subset(x,x>0)),
# sd(subset(x,x<0)), downsideDeviation(x,MAR=MAR),
# downsideDeviation(x,MAR=rf), downsideDeviation(x,MAR=0),maxDrawdown(x))
# metricsNames = c("Semi Deviation", "Gain Deviation", "Loss Deviation",
# paste("Downside Deviation (MAR=",MAR*scale*100,"%)", sep=""),
# paste("Downside Deviation (rf=",rf*scale*100,"%)", sep=""),
# paste("Downside Deviation (0%)", sep=""), "Maximum Drawdown" )
# Here's how it's working right now:
# > statsTable(monthlyReturns.ts,metrics=c("modifiedVaR","mean"),
# metricsNames=c("modVaR","mean"),p=.95)
# Actual S&P500TR
# modVaR 0.04186461 0.06261451
# mean 0.00945000 0.01013684
# Passing in attributes doesn't quite work. The issue is apparent in:
# > statsTable(monthlyReturns.ts,metrics=c("modifiedVaR", "modifiedVaR"),
# metricsNames=c("Modified VaR","Traditional VaR"), modified=c(TRUE,FALSE))
# Actual S&P500TR
# Modified VaR 0.06340849 0.09334976
# Traditional VaR 0.06340849 0.09334976
# Warning messages:
# 1: the condition has length > 1 and only the first element will be used in: if (modified) {
# 2: the condition has length > 1 and only the first element will be used in: if (modified) {
# 3: the condition has length > 1 and only the first element will be used in: if (modified) {
# 4: the condition has length > 1 and only the first element will be used in: if (modified) {
# FUNCTION:
y = checkDataMatrix(R)
# for each column, do the following:
for(column in 1:columns) {
# x = as.vector(y[,column])
x = as.matrix(y[,column])
x.length = length(x)
x = x[!is.na(x)]
x.na = x.length - length(x)
# apply the calculations
values = vector('numeric', 0)
for(metric in metrics) {
# values = c(values,metric)
# I'm not quite sure why this requires the as.matrix() coersion
newvalue = apply(as.matrix(x), MARGIN = 2, FUN = metric, ...)
values = c(values,newvalue)
}
if(column == 1) {
resultingtable = data.frame(Value = values, row.names = metricsNames)
}
else {
nextcolumn = data.frame(Value = values, row.names = metricsNames)
resultingtable = cbind(resultingtable, nextcolumn)
}
}
colnames(resultingtable) = columnnames
resultingtable
}
# ------------------------------------------------------------------------------
rollingCorrelation = function (x, y, n, trim = TRUE, na.rm = FALSE, ...)
{# @author Peter Carl
# DESCRIPTION:
# This is a wrapper for providing n-period trailing correlations for the
# data provided. Inspired by rollFun() written by Diethelm Wurtz.
# Specifically, we're wrapping the basic function cor(), which has:
# function (x, y = NULL, use = "all.obs", method = c("pearson",
# "kendall", "spearman"))
# Inputs:
# Assumes an input of regular period returns.
# x: a numeric vector, matrix, data frame, or timeSeries.
# The resulting data frame will contain columns of cor(x,y)
# for each period.
#
# y: 'NULL' (default) or a vector, matrix, data frame or timeSeries
# with compatible dimensions to 'x'.
#
# Currently assumes that BOTH x and y are provided
# @todo: Make it so we can _either_ give a matrix or data frame
# for 'x' _or_ give both 'x' and 'y' (as in cov())
#
# Assumes that x and y are sequenced exactly the same and are regular.
# Outputs:
# A data.table of n-period trailing correlations for each column in y.
# FUNCTION:
# target.vec is the vector of data we want correlations for; we'll get it
# from x
target.vec = checkDataVector(x)
xrows = nrow(x)
rownames = rownames(x)
# data.matrix is a vector or matrix of data we want correlations against;
# we'll take it from y
data.matrix = checkDataMatrix(y)
columns=ncol(y)
columnnames = colnames(y)
if (!trim) {
result.df = as.data.frame(matrix(data = NA, nrow = (n-1), ncol = columns, byrow = FALSE, dimnames = NULL))
colnames(result.df) = columnnames
}
# For each period (or row):
for(row in n:xrows) {
values = vector('numeric', 0)
# Get the subset of returns on which to calulate correlation
trailing.data = data.matrix[(row-n+1):row,]
trailing.vec = target.vec[(row-n+1):row]
# Calculate correlation
values = cor(trailing.vec,trailing.data,...)
if(row == n && trim) {
result.df = data.frame(Value = values, row.names = rownames[n])
}
else {
nextrow = data.frame(Value = values, row.names = rownames[row])
colnames(nextrow) = columnnames
result.df = rbind(result.df, nextrow)
}
}
if (!trim) {
rownames(result.df) = rownames
}
else
rownames(result.df) = rownames(n:xrows)
colnames(result.df) = columnnames
result.df
# Example:
# > head(rollingCorrelation(manager.ts at Data[,1],edhec.ts at Data,n=12))
# Convertible Arbitrage CTA Global Distressed Securities
# 2003-11-28 0.2591101 0.2762218 0.7516556
# 2003-12-31 0.2162078 0.2477113 0.7452179
# 2004-01-30 0.3918575 0.3489062 0.7562063
# 2004-02-27 0.5331404 0.3905645 0.7088004
# 2004-03-31 0.5730389 0.3010877 0.5694478
# 2004-04-30 0.5146946 0.3762283 0.4374524
# Emerging Markets Equity Market Neutral Event Driven
# 2003-11-28 0.6678183 0.4133534 0.6872263
# 2003-12-31 0.7739188 0.3758590 0.8494740
# 2004-01-30 0.7805586 0.4148372 0.8338275
# 2004-02-27 0.7353874 0.3444552 0.8181069
# 2004-03-31 0.7072259 0.2740104 0.6270411
# 2004-04-30 0.4430823 0.3918143 0.4966760
# Fixed Income Arbitrage Funds of Funds Global Macro Long/Short Equity
# 2003-11-28 0.6681346 0.7413791 0.5698372 0.4429661
# 2003-12-31 0.6583804 0.7705968 0.5647109 0.6949394
# 2004-01-30 0.7116529 0.7427369 0.5972492 0.6849833
# 2004-02-27 0.7580839 0.6831231 0.6555057 0.6300653
# 2004-03-31 0.7057303 0.6172516 0.6253257 0.5233462
# 2004-04-30 0.7583853 0.5305704 0.6740996 0.3518156
# Merger Arbitrage Relative Value Short Selling
# 2003-11-28 0.6890493 0.6858798 -0.23238159
# 2003-12-31 0.7397387 0.8062422 -0.41123820
# 2004-01-30 0.7219990 0.7883262 -0.36321115
# 2004-02-27 0.6862517 0.7605837 -0.28807086
# 2004-03-31 0.5921739 0.6766063 -0.24154960
# 2004-04-30 0.5073817 0.5257858 -0.04663322
}
# ------------------------------------------------------------------------------
rollingFunction = function (R, n, trim = TRUE, na.rm = TRUE, firstcolumn = 1, digits = 4, rf = 0, FUN = "mean", ...)
{# @author Peter Carl
# DESCRIPTION:
# This is a wrapper for providing n-period trailing calculations for the
# data and functions provided.
# Inputs:
# R: a matrix, data frame, or timeSeries of regular period returns.
# n: the number of periods over which a function is to be calculated. Use
# the value zero (0) to roll the statistic from inception
# FUN: the function to apply to the data
# trim:
# na.rm:
# firstcolumn:
# digits: number of decimal places to round the display to.
# Outputs:
# A data.table of n-period trailing calculations for each column
# in x.
# Inspired by rollFun() written by Diethelm Wurtz. We've extended the idea
# to all the columns provided.
# FUNCTION:
data.mat = checkDataMatrix(R)
columns=ncol(data.mat)
# @todo: remove NA's before setting column names
columnnames = colnames(data.mat)
# @todo: handle empty column names
if(is.null(columnnames))
stop("Column names are empty. If you are trying to pass in a timeSeries, use seriesData() rather than the explicit @Data slot.")
rows = nrow(data.mat)
rownames = rownames(data.mat)
if(rows < n)
stop("Data set is too short. Select a shorter evaluation period or provide a longer time period. Also, check the data for NA's.")
# for each column, do the following:
for(column in firstcolumn:columns) {
valueNames = vector('character', 0)
values = vector('numeric', 0)
x = checkDataVector(data.mat[,column])
if(n == 0) {
period = 0
n = 12
}
else
period = n
for(row in n:rows) {
subperiod = x[1:row]
values = c(values,rollingStat(subperiod, period = period, FUN = FUN, ...))
valueNames = c(valueNames,rownames[row])
}
if (!trim) {
values = c(rep(NA, (n - 1)), values)
valueNames = c(rownames[1:n-1],valueNames)
}
if(column == 1) {
resultingtable = data.frame(Value = values, row.names = valueNames)
}
else {
nextcolumn = data.frame(Value = values, row.names = valueNames)
resultingtable = cbind(resultingtable, nextcolumn)
}
}
colnames(resultingtable) = columnnames
result = round(resultingtable, digits)
result
# Examples:
# > rollingFunction(gg.ts[,1],n=3,FUN="annualizedReturn")
# Manager
# 2002-02-28 0.0306
# 2002-03-31 0.0521
# 2002-04-30 0.0387
# ...
# > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedReturn")
# Manager
# 2001-12-31 NA
# 2002-01-31 NA
# 2002-02-28 0.0306
# 2002-03-31 0.0521
# 2002-04-30 0.0387
# ...
# > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedSharpeRatio")
# Manager
# 2001-12-31 NA
# 2002-01-31 NA
# 2002-02-28 1.5302
# 2002-03-31 4.3768
# 2002-04-30 6.9640
# ...
# > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedSharpeRatio",rf=.03/12)
# Manager
# 2001-12-31 NA
# 2002-01-31 NA
# 2002-02-28 0.0298
# 2002-03-31 1.8587
# 2002-04-30 1.5598
# ...
}
# ------------------------------------------------------------------------------
rollingRegression <- function(formula, data, width, ...)
{# @author Douglas Bates
# @author modified by Peter Carl
# DESCRIPTION:
# This code was posted to the R-help mailing list in response to a question
# posted by Ajay Shah. For the full discussion, see:
# http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg19544.html
# @todo: make inputs consistant with other functions
mCall = match.call()
mCall$width = NULL
mCall[[1]] = as.name("lm")
mCall$x = mCall$y = TRUE # now mCall contains lm(y = TRUE, x = TRUE)
bigfit = eval(mCall, parent.frame())
ncoef = length(coef(bigfit))
nr = nrow(data)
width = as.integer(width)[1]
stopifnot(width >= ncoef, width <= nr)
y = bigfit$y
x = bigfit$x
terms = bigfit$terms
inds = embed(seq(nr), width)[, rev(seq(width))]
sumrys <- lapply(seq(nrow(inds)),
function(st) {
ind = inds[st,]
fit = lm.fit(x[ind, , drop = FALSE], y[ind])
fit$terms = terms
class(fit) = "lm"
summary(fit)
})
list(coefficients = sapply(sumrys, function(sm) coef(sm)[,"Estimate"]),
Std.Error = sapply(sumrys, function(sm) coef(sm)[,"Std. Error"]),
sigma = sapply(sumrys, "[[", "sigma"),
r.squared = sapply(sumrys, "[[", "r.squared"))
}
# ------------------------------------------------------------------------------
################################################################################
# $Log: performance-analytics.R,v $
# Revision 1.36 2006/12/09 15:26:08 brian
# - clean up function comments some more in advance of public release
# Bug 128
#
# Revision 1.35 2006/12/06 15:07:41 brian
# - add todo items for Herfindahl, Geltner, and runs
# - remove stub notation on Sortino
# Bug 128
#
# Revision 1.34 2006/12/06 03:39:08 brian
# - add Sortino function modified from Sankalp Upadhyay
#
# Revision 1.33 2006/12/05 21:52:32 brian
# - add modSharpe fn
# - update function list to show all functions in file
# - update copyright
# - fix indentation
# Bug 128
#
# Revision 1.32 2006/11/17 21:18:18 peter
# - added rf option to rollingFunction
#
# Revision 1.31 2006/11/07 03:36:12 peter
# - added informationRatio, trackingError, and activePremium function
#
# Revision 1.30 2006/10/31 03:58:11 peter
# - added function for rollingRegression
#
# Revision 1.29 2006/10/19 14:58:13 peter
# - fixed an issue with single-point calculation in rollingFunction when using n=0
#
# Revision 1.28 2006/10/19 13:45:53 peter
# - added "trim" option to rollingCorrelation
#
# Revision 1.27 2006/10/19 02:00:21 peter
# - rollingFunction with n=0 will trim first 12 observations
#
# Revision 1.26 2006/10/19 01:36:41 peter
# - rollingFunction now calculates statistics "from inception"
#
# Revision 1.25 2006/10/17 04:36:35 peter
# - minor modifications to rollingCorrelations function
#
# Revision 1.24 2006/10/10 04:39:48 peter
# - reverted back from na.contiguous to na.omit; not working for single column in matrix
#
# Revision 1.23 2006/10/09 13:59:37 peter
# - replaced na.omit with na.contiguous to retain time series nature of data in
# checkDataMatrix when na.rm = T
#
# Revision 1.22 2006/10/09 13:53:56 peter
# - added na.omit to dataCheckMatrix to ensure complete data sets for particular
# calculations
#
# Revision 1.21 2006/10/07 03:24:42 peter
# - improved NA handling in checkDataVector
# - added NA removal to some calculations, no longer needed
#
# Revision 1.20 2006/10/05 01:13:25 peter
# - rollingStat now passes arguements to functions with parameters
# - added a trim option to rollingFunction for lining up data in charts
#
# Revision 1.19 2006/10/04 14:13:55 peter
# - minor cleanup on RollingFunction
#
# Revision 1.18 2006/10/03 19:19:45 peter
# - fixed checkDataMatrix for the timeSeries case with missing column names
#
# Revision 1.17 2006/09/30 03:46:44 peter
# - fixed typo in annualizedSharpeRatio()
#
# Revision 1.16 2006/09/30 03:42:51 peter
# - finished applying data checks
# - fixed rollingFunction
#
# Revision 1.15 2006/09/30 00:54:21 peter
# - added checkDataVector() and applied to most calculations
#
# Revision 1.14 2006/09/29 14:19:19 peter
# - added fCalendar as an explicit required library
#
# Revision 1.13 2006/09/29 14:18:04 peter
# - fixed matrix case for checkDataMatrix
#
# Revision 1.12 2006/09/29 14:13:38 peter
# - added checkDataMatrix() to make data objects appear more fungible
#
# Revision 1.11 2006/09/28 14:53:40 peter
# - debugging rollingCorrelations(); x currently needs to be a matrix
#
# Revision 1.10 2006/09/26 21:56:55 brian
# - replace loud require directives with quiet requires
#
# Revision 1.9 2006/09/20 19:13:16 peter
# - repaired AnnualizedReturn call
#
# Revision 1.8 2006/09/20 19:08:54 peter
# - fixed annualizedSharpeRatio() calculation
#
# Revision 1.7 2006/09/20 17:08:44 peter
# Added rollingFunction() to calculate single-input metrics on a rolling period basis.
# Currently not working.
#
# Revision 1.6 2006/09/12 13:37:53 peter
# - snapshot 2006-09-11
#
# Revision 1.4 2006/09/12 13:09:50 brian
# - added omega function
# - snapshot 2006-08-26 v2
#
# Revision 1.3 2006/09/12 13:06:32 peter
# - snapshot 2006-08-23
#
# Revision 1.2 2006/09/12 12:56:39 peter
# - second revision snapshot 2006-08-17
#
# Revision 1.1 2006-09-12 07:46:18 peter
# - initial revision snapshot 2006-08-16
# Bug 128
################################################################################
More information about the R-SIG-Finance
mailing list