[R-SIG-Finance] performance analytics functions missing from other libraries (implemented)

Brian G. Peterson brian at braverock.com
Sat Dec 9 16:48:43 CET 2006


We previously released our functions for dealing with higher moments of 
the return distribution, including coskewness, cokurtosis, and Modified 
Cornish-Fisher VaR.  That release post may be found here:

https://stat.ethz.ch/pipermail/r-sig-finance/2006q3/000977.html

We would now like to release a collection of assorted performance 
analytics functions, filling in gaps that we found when working with the 
excellent and extremely useful RMetrics, 'portfolio', and other 
finance-focused R packages.  

The following functions are useful for calculating performance metrics for 
trading strategies, fund performance, or other post-trade (even 
post-balance- sheet) performance assessment.  In general, we're working 
with monthly scaled returns data (although these functions should be 
useful for daily or weekly scales as well), and in returns rather than 
prices.

There are books and books full of these calculations.  Some are even 
useful. Notation terminology in the papers and texts is often sloppy, so 
the user should be aware of how the specific calculations are 
implemented.

Functions:
  checkDataMatrix
  checkDataVector
  annualizedReturn
  annualizedStdDev
  sharpeRatio
  annualizedSharpeRatio
  modSharpe
  semiDeviation
  downsideDeviation
  sortinoRatio
  maxDrawdown
  beta
  alpha
  omega
  trackingError
  activePremium
  informationRatio
  rollingFunction
  rollingStat
  cumulativeReturns
  colCumprods
  colCumMax
  rollingCorrelation
  statsTable
  @todo Herfindahl Index
  @todo Geltner method risk-adjusted returns
  @todo longest run up/down
  @todo improve the way statsTable() uses apply() to make it more generic

Default input values assume regular monthly returns.  Scale arguements can 
be used to specify the number of observations during a year (e.g., 12 = 
monthly returns).

The risk free rate (rf) MUST be in the same periodicity as the data going 
in.

checkDataMatrix
checkDataVector
      Description:
      These functions were created to make the different kinds of data
      at least _seem_ more fungible.  It allows the user to pass in a data
      object without being concerned that the function requires a matrix,
      data.frame, or timeSeries object.  
      By using this, the function "knows"
      what data format it has to work with.

annualizedReturn
      Description:
      An average annualized return is convenient for comparing returns.

annualizedStdDev
      To annualize standard deviation, we multiply by the square root 
      of the number of observations per year.

sharpeRatio
      Description:
      The Sharpe ratio is simply the return per unit of risk 
      (represented by variability).  
      This function uses returns rather than prices as is 
      done in other R libraries.

annualizedSharpeRatio
      Description:
      Using an annualized Sharpe Ratio is useful for comparison.  
      The annualized Sharpe ratio is computed by dividing the 
      annualized mean monthly excess return by the 
      annualized monthly standard deviation of excess return.

modSharpe
      Description:
      The Sharpe ratio is simply the return per unit of risk 
      (represented by variance).  
      The higher the Sharpe ratio, the better the combined
      performance of "risk" and return.

      The Sharpe Ratio is a risk-adjusted measure of return that uses
      standard deviation to represent risk.

      A number of papers now recommend using a "modified Sharpe" ratio
      using a Modified Cornish-Fisher VaR as the measure of Risk.

semiDeviation
      Description:
      This function should just be a wrapper of downsideDeviation with
      MAR = mean(x)
      see below

downsideDeviation
      Description:
      Downside deviation, similar to semi deviation, eliminates 
      positive returns when calculating risk.  
      To calculate it, we take the returns that are less
      than the target (or Minimum Acceptable Returns (MAR)) 
      returns and take the differences of those to the target.  
      We sum the squares and divide by the total number 
      of returns to get a below-target semi-variance.

sortinoRatio
      Description:
      Sortino proposed to better account for skill and excess performance
      by using only downside semivariance as the measure of risk.
      modified by permission from function by Sankalp Upadhyay

maxDrawdown = function (R)
      Description:
      To find the maximum drawdown in a return series, we need to first
      calculate the cumulative returns and the maximum cumulative return
      to that point.  
      Any time the cumulative returns dips below the maximum
      cumulative returns, it's a drawdown.  Drawdowns are measured as a
      percentage of that maximum cumulative return, in effect, 
      measured from peak equity.

alpha
      This is a wrapper for calculating a CAPM alpha.

beta
      Description:
      This is a wrapper for calculating a CAPM beta.

omega
      Description
      Keating and Shadwick (2002) proposed Omega 
      (referred to as Gamma in their original paper) 
      as a way to capture all of the higher moments of the
      returns distribution.  Mathematically, Omega is:
        integral[L to b](1 - F(r))dr / integral[a to L](F(r))dr
      where the cumulative distribution F is defined 
      on the interval (a,b).
      L is the loss threshold that can be specified as zero, return from a
      benchmark index, or 
         an absolute rate of return - any specified level.
      When comparing alternatives using Omega, L should be common.  
      Input data can be transformed prior to calculation, 
      which may be useful for introducing risk aversion.

trackingError
      Description
      trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 /
                        (length(assetReturns.vec) - 1)) * sqrt(scale)

activePremium
      Description
      activePremium = (annualizedReturn(assetReturns.vec, scale = scale) -
                       annualizedReturn(indexReturns.vec, scale = scale) )

informationRatio
      Description
      informationRatio = activePremium/trackingError

rollingStat
      function for calculating any function over a rolling period,
      say the last three months.  Pass in the function name as FUN,
      e.g., FUN = "mean".  Useful for passing in several periods and 
functions
      for comparing different series.

cumulativeReturn
      Simple function for calculating cumulative return over a period
      of time, say a calendar year.  
      Can produce simple or geometric return.

colCumprods
      apply cumulative product to the return strem, wrapped using apply()

colCummax
      to get to drawdown calculations, we need colCummax

statsTable
      Description:
      This function creates a table of statistics from vectors 
      of functions and labels passed in. 
      The resulting table is formatted such that metrics
      are calculated separately for each column of returns 
      in the data object.


rollingCorrelation
      Description:
      This is a wrapper for providing n-period trailing correlations 
      for the data provided.  
      Inspired by rollFun() written by Diethelm Wurtz.

      Specifically, we're wrapping the basic function cor(), which has:
      function (x, y = NULL, use = "all.obs", method = c("pearson",
          "kendall", "spearman"))


rollingFunction
      Description:
      This is a wrapper for providing n-period trailing calculations 
      for the data and functions provided.

rollingRegression
      Description:
      This code was posted to the R-help mailing list in response to a 
question
      posted by Ajay Shah.  For the full discussion, see:
      http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg19544.html


This library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License (which 
includes attribution for the original creator)

The GNU General Public License is available here:
http://www.gnu.org/licenses/gpl.html

The current and all future versions of the attached file 
may be found here:
http://braverock.com/brian/R/performance-analytics.R

We hope you find it useful.  Any comments, criticisms, or suggestions
for improvement are gladly accepted.

Regards,

    - Brian

-------------- next part --------------
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either
# version 2 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Library General Public License for more details.
#
# You should have received a copy of the GNU General
# Public License along with this library; if not,
# go here: http://www.gnu.org/licenses/gpl.html
# or write to the
# Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
# MA  02111-1307  USA

# Copyright 2006 Peter Carl, Brian G. Peterson
# $Id: performance-analytics.R,v 1.36 2006/12/09 15:26:08 brian Exp $

################################################################################
# FUNCTIONS:
#
#   checkDataMatrix
#   checkDataVector
#   annualizedReturn
#   annualizedStdDev
#   sharpeRatio
#   annualizedSharpeRatio
#   modSharpe
#   semiDeviation
#   downsideDeviation
#   sortinoRatio
#   maxDrawdown
#   beta
#   alpha
#   omega
#   trackingError
#   activePremium
#   informationRatio
#   rollingFunction
#   rollingStat
#   cumulativeReturns
#   colCumprods
#   colCumMax
#   rollingCorrelation
#   statsTable
#
#   @todo Herfindahl Index
#   @todo Geltner method risk-adjusted returns
#   @todo longest run up/down
#   @todo improve the way statsTable() fn uses apply() to make it more generic
#
################################################################################

################################################################################
# The following functions are useful for calculating performance metrics for
# trading strategies, fund performance, or other post-trade (even post-balance-
# sheet) performance assessment.  In general, we're working with monthly scaled
# data (although these functions should be useful for daily or weekly scales as
# well), and in returns rather than prices.

# There are books and books full of these calculations.  Some are even useful.
# Notation is sloppy, however, so the user should be aware of how the specific
# calculations are implemented.

# Default input values assume regular monthly returns.  Scale arguements can be
# used to specify the number of observations during a year (e.g., 12 = monthly
# returns).

# The risk free rate (rf) MUST be in the same periodicity as the data going in.

# ------------------------------------------------------------------------------

# Load any required libraries.

if(!require("fCalendar", quietly=TRUE)) {
    stop("package", sQuote("fOptions"), "is needed.  Stopping")
}
if(!require("fOptions", quietly=TRUE)) {
    stop("package", sQuote("fOptions"), "is needed.  Stopping")
}
if(!require("tseries", quietly=TRUE)) {
    stop("package", sQuote("tseries"), "is needed.  Stopping")
}
if(!require("Hmisc", quietly=TRUE)) {
    stop("package", sQuote("Hmisc"), "is needed.  Stopping")
}
# also requires Brian's extra_moments.R file, should we always include it?

# ------------------------------------------------------------------------------

checkDataMatrix = function (x, na.rm = FALSE, ...)
{ # @author Peter Carl

    # Description:

    # This function was created to make the different kinds of data classes
    # at least _seem_ more fungible.  It allows the user to pass in a data
    # object without being concerned that the function requires a matrix,
    # data.frame, or timeSeries object.  By using this, the function "knows"
    # what data format it has to work with.

    # Inputs:
    # x: a matrix, data.frame or timeSeries object of data to be checked and
    # transformed if necessary.
    # na.rm: removes rows of data with NA's in any column.  This is useful
    # for calculating correlations, etc., that require complete data.  Note
    # that this passes back only the first contiguous set of data, to retain
    # the regular timeseries nature of the data.

    # Outputs:
    # Produces a matrix object with named rows and columns.

    # FUNCTION:
    if (class(x) == "numeric") {
        # Note that column and row labels will be blank.
        y = as.matrix(x)
    }

    if (class(x) == "matrix") {
        y = x
    }

    if (class(x) == "data.frame") {
        y = as.matrix(x)
    }

    if (class(x) == "timeSeries") {
        # timeSeries objects keep the data in a matrix and append a set of
        # meta-data.  We just need the information stored in the 'Data' slot.
        #y = x at Data
        y = seriesData(x)
    }

    if (na.rm) {
        #y = na.contiguous(y)
        y = na.omit(y)
    }
    return(y)

}

# ------------------------------------------------------------------------------

checkDataVector = function (R, na.rm = TRUE, quiet = TRUE, ...)
{ # @author Peter Carl

    # Description:

    # This function was created to make the different kinds of data classes
    # at least _seem_ more fungible.  It allows the user to pass in a data
    # object without being concerned that the function requires a matrix,
    # data.frame, or timeSeries object.  By using this, the function "knows"
    # what data format it has to work with.

    # Inputs:
    # x: the data provided.
    # na.rm: default behavior is to remove NA's
    # quiet: if true, it will throw warnings when errors are noticed

    # Outputs:
    # Produces a numeric vector or an NA and any associated warnings.

    # FUNCTION:

    x = as.vector(R)

    # First, we'll check to see if we have more than one column.
    if (NCOL(x) > 1) {
        if(!quiet)
            warning("The data provided is not a vector or univariate time series.  Used only the first column")
        x = x[,1]
    }

    # Second, we'll hunt for NA's and remove them if required
    if (any(is.na(x))) {
        if(na.rm) {
            # Try to remove any NA's
            x = x[!is.na(x)]
            if(!quiet){
                warning("The following slots have NAs.")
                warning(paste(x at na.removed," "))
            }
        }
        else {
            if(!quiet)
                warning("Data contains NA's.")
        }
    }

    # Third, we'll check to see if we have any character data
    if (!is.numeric(x)){
        if(!quiet)
            warning("The data is not numeric.")
        # Try to coerce the data
        x = as.numeric(x)
    }

    # Fourth, we'll see if we have more than one data point.
    if (NROW(x) <= 1) {
        if(!quiet)
            warning("Only one row provided.  Returning NA.")
        return(NA)
    }

    # @todo: Add check for stopifnot(is.atomic(y))???

    return(as.vector(x))

}

# ------------------------------------------------------------------------------

annualizedReturn = function (R, geometric = TRUE, scale = 12)
{ # @author Peter Carl

    # Description:

    # An average annualized return is convenient for comparing returns.
    # @todo: This function could be used for calculating geometric or simple
    # returns

    # n = periods under analysis
    # f = number of periods in a year (daily f = 252, monthly f = 12,
    # quarterly f = 4)

    # arithmetic average: ra = (f/n) * sum(ri)
    # geometric average: rg = product(1 + ri)^(f/n) - 1

    # @todo: don't calculate for returns less than 1 year

    # FUNCTION:

    x = checkDataVector(R)
    y = x[!is.na(x)]
    n = length(y)
    # currently only uses geometric return to annualize returns
    #return(prod(1 + x)^(scale/length(x)) - 1)
    return(prod(1 + x, na.rm = TRUE)^(scale/n) - 1)
}

# ------------------------------------------------------------------------------

annualizedStdDev = function (R, rf = 0, scale = 12)
{ # @author Peter Carl

    # To annualize standard deviation, we multiply by the square root of the
    # number of observations per year.

    x = checkDataVector(R)

    # return(sqrt(scale)*sqrt(var(x - rf)))
    return(sqrt(scale)*sqrt(var(x, na.rm = TRUE)))
}

# ------------------------------------------------------------------------------

sharpeRatio = function (R, rf = 0)
{ # @author Peter Carl

    # DESCRIPTION:
    # The Sharpe ratio is simply the return per unit of risk (represented by
    # variability).  The higher the Sharpe ratio, the better the combined
    # performance of "risk" and return.

    # The Sharpe Ratio is a risk-adjusted measure of return that uses
    # standard deviation to represent risk.

    # Inputs:
    # R: in this case, the function anticipates having a return stream as input,
    #    rather than prices.
    # rf: the risk free rate MUST be in the same periodicity as the data going in.

    # Outputs:
    # This function returns a Sharpe ratio for the same periodicity of the
    # data being input (e.g., monthly data -> monthly SR)

    # FUNCTION:

    x = checkDataVector(R)
    return((mean(x, na.rm = TRUE) - rf)/sd(x, na.rm = TRUE))
    #return((mean(x) - rf)/sd(x))
}

# ------------------------------------------------------------------------------

annualizedSharpeRatio = function (R, rf = 0, scale = 12)
{ # @author Peter Carl

    # DESCRIPTION:

    # Using an annualized Sharpe Ratio is useful for comparison.  The annualized
    # Sharpe ratio is computed by dividing the annualized mean monthly excess
    # return by the annualized monthly standard deviation of excess return.

    # @todo: monthly standard deviation of ***excess*** return

    # Inputs:
    # R: in this case, the function anticipates having a return stream as input,
    #     rather than prices.
    # rf: the risk free rate MUST be in the same periodicity as the data going in.

    # FUNCTION:

    x = checkDataVector(R)

    return((annualizedReturn(x, scale = scale) - (rf * scale))/annualizedStdDev(x, rf = rf, scale = scale))

}

# ------------------------------------------------------------------------------

modSharpe = function (R, rf = 0, p=0.95, scale=1)
{ # @author Brian G. Peterson

    # DESCRIPTION:
    # The Sharpe ratio is simply the return per unit of risk (represented by
    # variability).  The higher the Sharpe ratio, the better the combined
    # performance of "risk" and return.

    # The Sharpe Ratio is a risk-adjusted measure of return that uses
    # standard deviation to represent risk.

    # A number of papers now recommend using a "modified Sharpe" ratio
    # using a Modified Cornish-Fisher VaR as the measure of Risk.

    # Inputs:
    # R: in this case, the function anticipates having a return stream as input,
    #    rather than prices.
    #
    # rf: the risk free rate MUST be in the same periodicity as the data going in.
    #
    # p: probability at which to calculate the modified VaR (defaults to 95%)

    # Outputs:
    # This function returns a modified Sharpe ratio for the same periodicity of the
    # data being input (e.g., monthly data -> monthly SR)

    # FUNCTION:

    x = checkDataVector(R)
    return( (mean(x, na.rm = TRUE) - rf)*scale/(modifiedVaR(x, p)*sqrt(scale)) )
    #return((mean(x) - rf)/sd(x))
}

# ------------------------------------------------------------------------------

semiDeviation = function (R)
{ # @author Peter Carl

    # DESCRIPTION:
    # This function should just be a wrapper of downsideDeviation with
    # MAR = mean(x)
    # see below

    # FUNCTION:

    x = checkDataVector(R)

    y = subset(x,x < mean(x))
    return(sqrt(sum((y - mean(x))^2)/(length(y)-1)))

}

# ------------------------------------------------------------------------------

downsideDeviation = function (R, MAR = 0)
{ # @author Peter Carl

    # DESCRIPTION:
    # Downside deviation, similar to semi deviation, eliminates positive returns
    # when calculating risk.  To calculate it, we take the returns that are less
    # than the target (or Minimum Acceptable Returns (MAR)) returns and take the
    # differences of those to the target.  We sum the squares and divide by the
    # total number of returns to get a below-target semi-variance.

    # This is also useful for calculating semi-deviation by setting
    # MAR = mean(x)

    # FUNCTION:

    x = checkDataVector(R)

    y = subset(x,x < MAR)
    # return(sqrt(sum((y - MAR)^2)/(length(y)-1)))
    return(sqrt(sum((y - MAR)^2)/(length(y))))
    # @todo verify which calculation is correct

}

# ------------------------------------------------------------------------------

sortinoRatio = function (R, MAR = 0)
{ # @author Brian G. Peterson
  # modified from function by Sankalp Upadhyay <sankalp.upadhyay [at] gmail [dot] com> with permission

    # Description:
    # Sortino proposed to better account for skill and excess performance
    # by using only downside semivariance as the measure of risk.

    # R     return vector
    # MAR   minimum acceptable return
    # Function:

    x = checkDataVector(R)

    result = mean (R - MAR) / downsideDeviation(R, MAR)

    #Return:
    result
}

# ------------------------------------------------------------------------------

maxDrawdown = function (R)
{ # @author Peter Carl

    # DESCRIPTION:
    # To find the maximum drawdown in a return series, we need to first
    # calculate the cumulative returns and the maximum cumulative return to
    # that point.  Any time the cumulative returns dips below the maximum
    # cumulative returns, it's a drawdown.  Drawdowns are measured as a
    # percentage of that maximum cumulative return, in effect, measured from
    # peak equity.

    # FUNCTION:

    x = checkDataVector(R)

    cumulativeReturn = cumprod(1+x)
    maxCumulativeReturn = cummax(cumulativeReturn)
    drawdown = cumulativeReturn/maxCumulativeReturn - 1
    # if you want to see the drawdown series, plot(drawdown,type="l")
    return(min(drawdown))

}

# ------------------------------------------------------------------------------

alpha = function (x, y, rf = 0)
{ # @author Peter Carl

    # DESCRIPTION:
    # This is a wrapper for calculating a CAPM alpha.

    # Inputs:
    # x: vector of returns for the asset being tested
    # y: vector of returns for the benchmark the asset is being gauged against
    # x and y are assumed to be matching periods
    # rf: risk free rate in the same periodicity as the returns.  May be a vector
    #     of the same length as x and y.

    # Output:
    #

    # FUNCTION:

    assetReturns.vec = checkDataVector(x)
    indexReturns.vec = checkDataVector(y)

    if (length(assetReturns.vec) != length(indexReturns.vec))
        stop("Returns to be assessed have unequal time periods. Are there NA's in the data?")

    # Make these excess returns
    assetExcessRet.vec = assetReturns.vec - rf
    indexExcessRet.vec = indexReturns.vec - rf

    # regress
    model.lm = lm(assetExcessRet.vec ~ indexExcessRet.vec)

    alpha = coef(model.lm)[[1]]
#    beta = coef(model.lm)[[2]]
    alpha
}

# ------------------------------------------------------------------------------

beta = function (x, y, rf = 0, digits = 4)
{ # @author Peter Carl

    # DESCRIPTION:
    # This is a wrapper for calculating a CAPM beta.

    # Inputs:
    # x: vector of returns for the asset being tested
    # y: vector of returns for the benchmark the asset is being gauged against
    # x and y are assumed to be matching periods and NO TEST IS MADE TO CHECK
    # rf: risk free rate in the same periodicity as the returns.  May be a vector
    #     of the same length as x and y.

    # Output:
    #

    # FUNCTION:

    assetReturns.vec = checkDataVector(x)
    indexReturns.vec = checkDataVector(y)

    if (length(assetReturns.vec) != length(indexReturns.vec))
        stop("Returns to be assessed have unequal time periods. Are there NA's in the data?")

    # Make these excess returns
    assetExcessRet.vec = assetReturns.vec - rf
    indexExcessRet.vec = indexReturns.vec - rf

    # regress
    model.lm = lm(assetExcessRet.vec ~ indexExcessRet.vec)

#    alpha = coef(model.lm)[[1]]
    beta = coef(model.lm)[[2]]
    beta
}

# ------------------------------------------------------------------------------

omega = function (R, L = 0, method = c("simple", "interp", "binomial", "blackscholes"), output = c("point", "full"), rf = 0)
{ # @author Peter Carl

    # DESCRIPTION
    # Keating and Shadwick (2002) proposed Omega (referred to as Gamma in their
    # original paper) as a way to capture all of the higher moments of the
    # returns distribution.  Mathematically, Omega is:
    #   integral[L to b](1 - F(r))dr / integral[a to L](F(r))dr
    # where the cumulative distribution F is defined on the interval (a,b).
    # L is the loss threshold that can be specified as zero, return from a
    # benchmark index, or an absolute rate of return - any specified level.
    # When comparing alternatives using Omega, L should be common.  Input data
    # can be transformed prior to calculation, which may be useful for
    # introducing risk aversion.

    # This function returns a vector of Omega, useful for plotting.  The
    # steeper, the less risky.  Above it's mean, a steeply sloped Omega also
    # implies a very limited potential for further gain.

    # Omega has a value of 1 at the mean of the distribution.

    # Omega is sub-additive.  The ratio is dimensionless.

    # Kazemi, Schneeweis, and Gupta (2003), in "Omega as a Performance Measure"
    # shows that Omega can be written as:
    #   Omega(L) = C(L)/P(L)
    # where C(L) is essentially the price of a European call option written
    # on the investment and P(L) is essentially the price of a European put
    # option written on the investment.  The maturity for both options is
    # one period (e.g., one month) and L is the strike price of both options.

    # The numerator and the denominator can be expressed as:
    #   exp(-rf) * E[max(x - L, 0)]
    #   exp(-rf) * E[max(L - x, 0)]
    # with exp(-rf) calculating the present values of the two, where rf is
    # the per-period riskless rate.

    # The first three methods implemented here focus on that observation.
    # The first method takes the simplification described above.  The second
    # uses the Black-Scholes option pricing as implemented in fOptions.  The
    # third uses the binomial pricing model from fOptions.  The second and
    # third methods are not implemented here.

    # The fourth method, "interp", creates a linear interpolation of the cdf of
    # returns, calculates Omega as a vector, and finally interpolates a function
    # for Omega as a function of L.

    # FUNCTION

    x = checkDataVector(R)

    if(method == "simple") {
        numerator = exp(-rf) * mean(max(x - L, 0))
        denominator = exp(-rf) * mean(max(L - x, 0))
        omega = numerator/denominator
    }

    if(method == "interp") {
    #

        a = min(x)
        b = max(x)

        xcdf = ecdf(x)
        f <- approxfun(xcdf$x,xcdf$y,method="linear",ties="ordered")

        if(output == "full") {
            omega = cumsum(1-f(xcdf$x))/cumsum(f(xcdf$x))
        }
        else {
        # returns only the point value for L
            # to get a point measure for omega, have to interpolate
            omegafull = cumsum(1-f(xcdf$x))/cumsum(f(xcdf$x)) # ????????
            g <- approxfun(xcdf$x,omegafull,method="linear",ties="ordered")
            omega = g(L)
        }
    }
    result = omega
    result

}

# ------------------------------------------------------------------------------

trackingError = function (R, Ri, scale = 12)
{ # @author Peter Carl

    # DESCRIPTION
    # trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 /
    #                   (length(assetReturns.vec) - 1)) * sqrt(scale)

    # Inputs:
    # Outputs:

    # FUNCTION
    assetReturns.vec = checkDataVector(R)
    indexReturns.vec = checkDataVector(Ri)

    trackingError = sqrt(sum(assetReturns.vec - indexReturns.vec)^2 / (length(assetReturns.vec) - 1)) * sqrt(scale)

    trackingError

}

# ------------------------------------------------------------------------------

activePremium = function (R, Ri, scale = 12)
{ # @author Peter Carl

    # DESCRIPTION
    # activePremium = (annualizedReturn(assetReturns.vec, scale = scale) -
    #                  annualizedReturn(indexReturns.vec, scale = scale) )

    # Inputs:
    # Outputs:

    # FUNCTION
    assetReturns.vec = checkDataVector(R)
    indexReturns.vec = checkDataVector(Ri)

    activePremium = (annualizedReturn(assetReturns.vec, scale = scale) - annualizedReturn(indexReturns.vec, scale = scale))

    activePremium

}

# ------------------------------------------------------------------------------

informationRatio = function (R, Ri, scale = 12)
{ # @author Peter Carl

    # DESCRIPTION
    # informationRatio = activePremium/trackingError

    # Inputs:
    # Outputs:

    # FUNCTION
    assetReturns.vec = checkDataVector(R)
    indexReturns.vec = checkDataVector(Ri)

    activePremium = activePremium(assetReturns.vec,indexReturns.vec, scale = scale)
    trackingError = trackingError(assetReturns.vec,indexReturns.vec, scale = scale)

    informationRatio = activePremium/trackingError

    informationRatio
}

# ------------------------------------------------------------------------------
rollingStat = function (R, period = 3, trim = TRUE, FUN, ...)
{ # @author Peter Carl

    # DESCRIPTION
    # Function to apply any time series function over a rolling period,
    # say the last three months.  Pass in the function name as FUN,
    # e.g., FUN = "mean".  Useful for passing in several periods and functions
    # for comparing different series.

    # Inputs:
    # period  the number of periods over which to calculate the statistic.
    #         Setting the period to 0 calculates the statistic "from inception" or
    #         using all of the data passed in.
    # FUN     function to apply over e.g., FUN = "mean" or FUN = "cumulativeReturns"

    # Output:

    # FUNCTION

    x = checkDataVector(R)

    if (length(x) < period){
    # @todo: instead of stopping, should this warn and return NAs for length(x)?
        stop("Data set is not long enough. Reduce the evaluation period or provide a longer time period of data. Also, check the data for NA's.")
    }
    if (period == 0) {
        y = as.matrix(x[1:length(x)])
    }
    else
        y = as.matrix(x[(length(x)- period + 1):length(x)])
    answer=apply(y, MARGIN = 2, FUN = FUN, ...)
    return(answer)
    # Example includes how to pass in arguements to the function - must be in order:
    # > rollingStat(gg.ts at Data[,1],period=12,FUN="annualizedSharpeRatio",rf=.03/12)
    # [1] 1.476426
    # > rollingStat(gg.ts at Data[,1],period=3,FUN="annualizedSharpeRatio",rf=.03/12)
    # [1] 6.804358
    # > rollingStat(gg.ts at Data[,1],period=3,FUN="annualizedSharpeRatio",rf=0)
    # [1] 8.253612

}

# ------------------------------------------------------------------------------

cumulativeReturn = function (R, geometric = TRUE)
{ # @author Peter Carl

    # This is a useful function for calculating cumulative return over a period
    # of time, say a calendar year.  Can produce simple or geometric return.

    x = checkDataVector(R)

    if (!geometric)
        return(sum(x, na.rm = TRUE))
    else {
        return(prod(1+x, na.rm = TRUE)-1)
    }
}

# ------------------------------------------------------------------------------

colCumprods = function (R, na.rm = TRUE, ...)
{ # @author Peter Carl

    # if we do this, then cumulating a set of monthly returns is easy
    x = checkDataMatrix(R)

    if (na.rm) {
        result = apply(na.omit(x), MARGIN = 2, FUN = cumprod,
            ...)
    }
    else {
        result = apply(x, MARGIN = 2, FUN = cumprod, ...)
    }
    result
}

# ------------------------------------------------------------------------------

colCummax = function (R, na.rm = TRUE, ...)
{ # @author Peter Carl

    # to get to drawdown calculations, we need colCummax
    x = checkDataMatrix(R)

    if (na.rm) {
        result = apply(na.omit(x), MARGIN = 2, FUN = cummax,
            ...)
    }
    else {
        result = apply(x, MARGIN = 2, FUN = cummax, ...)
    }
    result
}

# ------------------------------------------------------------------------------

statsTable = function (R, metrics=c("mean","sd"), metricsNames=c("Average Return","Standard Deviation"), ...)
{ # @author Peter Carl

    # DESCRIPTION:
    # This function creates a table of statistics from vectors of functions and
    # labels passed in.  The resulting table is formatted such that metrics
    # are calculated separately for each column of returns in the data object.

    # Inputs:
    # Assumes an input of period returns.  Scale arguements can be used to
    # specify the number of observations during a year (e.g., 12 = monthly
    # returns).

    # Outputs:
    # A table with calculated metrics for each column

    # The idea here is to be able to pass in sets of metrics and values, like:

    # metrics = c(downsideDeviation(x,MAR=mean(x)), sd(subset(x,x>0)),
    # sd(subset(x,x<0)), downsideDeviation(x,MAR=MAR),
    # downsideDeviation(x,MAR=rf), downsideDeviation(x,MAR=0),maxDrawdown(x))

    # metricsNames = c("Semi Deviation", "Gain Deviation", "Loss Deviation",
    # paste("Downside Deviation (MAR=",MAR*scale*100,"%)", sep=""),
    # paste("Downside Deviation (rf=",rf*scale*100,"%)", sep=""),
    # paste("Downside Deviation (0%)", sep=""), "Maximum Drawdown" )

    # Here's how it's working right now:
    # > statsTable(monthlyReturns.ts,metrics=c("modifiedVaR","mean"),
    # metricsNames=c("modVaR","mean"),p=.95)
    #            Actual   S&P500TR
    # modVaR 0.04186461 0.06261451
    # mean   0.00945000 0.01013684

    # Passing in attributes doesn't quite work.  The issue is apparent in:
    # > statsTable(monthlyReturns.ts,metrics=c("modifiedVaR", "modifiedVaR"),
    #   metricsNames=c("Modified VaR","Traditional VaR"), modified=c(TRUE,FALSE))
    #                     Actual   S&P500TR
    # Modified VaR    0.06340849 0.09334976
    # Traditional VaR 0.06340849 0.09334976
    # Warning messages:
    # 1: the condition has length > 1 and only the first element will be used in: if (modified) {
    # 2: the condition has length > 1 and only the first element will be used in: if (modified) {
    # 3: the condition has length > 1 and only the first element will be used in: if (modified) {
    # 4: the condition has length > 1 and only the first element will be used in: if (modified) {

    # FUNCTION:

    y = checkDataMatrix(R)

    # for each column, do the following:
    for(column in 1:columns) {
#     x = as.vector(y[,column])
    x = as.matrix(y[,column])
    x.length = length(x)
    x = x[!is.na(x)]
    x.na = x.length - length(x)

        # apply the calculations
        values = vector('numeric', 0)
    for(metric in metrics) {
#        values = c(values,metric)
            # I'm not quite sure why this requires the as.matrix() coersion
            newvalue = apply(as.matrix(x), MARGIN = 2, FUN = metric, ...)
            values = c(values,newvalue)
        }

        if(column == 1) {
            resultingtable = data.frame(Value = values, row.names = metricsNames)
        }
        else {
            nextcolumn = data.frame(Value = values, row.names = metricsNames)
            resultingtable = cbind(resultingtable, nextcolumn)
        }
    }
    colnames(resultingtable) = columnnames
    resultingtable

}

# ------------------------------------------------------------------------------

rollingCorrelation = function (x, y, n, trim = TRUE, na.rm = FALSE, ...)
{# @author Peter Carl

    # DESCRIPTION:
    # This is a wrapper for providing n-period trailing correlations for the
    # data provided.  Inspired by rollFun() written by Diethelm Wurtz.

    # Specifically, we're wrapping the basic function cor(), which has:
    # function (x, y = NULL, use = "all.obs", method = c("pearson",
    #     "kendall", "spearman"))

    # Inputs:
    # Assumes an input of regular period returns.
    #    x: a numeric vector, matrix, data frame, or timeSeries.
    #       The resulting data frame will contain columns of cor(x,y)
    #       for each period.
    #
    #    y: 'NULL' (default) or a vector, matrix, data frame or timeSeries
    #       with compatible dimensions to 'x'.
    #
    #    Currently assumes that BOTH x and y are provided
    #    @todo: Make it so we can _either_ give a matrix or data frame
    #    for 'x' _or_ give both 'x' and 'y' (as in cov())
    #
    # Assumes that x and y are sequenced exactly the same and are regular.

    # Outputs:
    # A data.table of n-period trailing correlations for each column in y.

    # FUNCTION:

    # target.vec is the vector of data we want correlations for; we'll get it
    # from x
    target.vec = checkDataVector(x)
    xrows = nrow(x)
    rownames = rownames(x)

    # data.matrix is a vector or matrix of data we want correlations against;
    # we'll take it from y
    data.matrix = checkDataMatrix(y)
    columns=ncol(y)
    columnnames = colnames(y)

    if (!trim) {
        result.df = as.data.frame(matrix(data = NA, nrow = (n-1), ncol = columns, byrow = FALSE, dimnames = NULL))
        colnames(result.df) = columnnames
    }
    #  For each period (or row):
    for(row in n:xrows) {
        values = vector('numeric', 0)
        # Get the subset of returns on which to calulate correlation
        trailing.data = data.matrix[(row-n+1):row,]
        trailing.vec = target.vec[(row-n+1):row]

        # Calculate correlation
        values = cor(trailing.vec,trailing.data,...)

        if(row == n && trim) {
            result.df = data.frame(Value = values, row.names = rownames[n])
        }

        else {
            nextrow = data.frame(Value = values, row.names = rownames[row])
            colnames(nextrow) = columnnames
            result.df = rbind(result.df, nextrow)
        }
    }

    if (!trim) {
        rownames(result.df) = rownames
    }
    else
        rownames(result.df) = rownames(n:xrows)

    colnames(result.df) = columnnames

    result.df

    # Example:
    # > head(rollingCorrelation(manager.ts at Data[,1],edhec.ts at Data,n=12))
    #            Convertible Arbitrage CTA Global Distressed Securities
    # 2003-11-28             0.2591101  0.2762218             0.7516556
    # 2003-12-31             0.2162078  0.2477113             0.7452179
    # 2004-01-30             0.3918575  0.3489062             0.7562063
    # 2004-02-27             0.5331404  0.3905645             0.7088004
    # 2004-03-31             0.5730389  0.3010877             0.5694478
    # 2004-04-30             0.5146946  0.3762283             0.4374524
    #            Emerging Markets Equity Market Neutral Event Driven
    # 2003-11-28        0.6678183             0.4133534    0.6872263
    # 2003-12-31        0.7739188             0.3758590    0.8494740
    # 2004-01-30        0.7805586             0.4148372    0.8338275
    # 2004-02-27        0.7353874             0.3444552    0.8181069
    # 2004-03-31        0.7072259             0.2740104    0.6270411
    # 2004-04-30        0.4430823             0.3918143    0.4966760
    #            Fixed Income Arbitrage Funds of Funds Global Macro Long/Short Equity
    # 2003-11-28              0.6681346      0.7413791    0.5698372         0.4429661
    # 2003-12-31              0.6583804      0.7705968    0.5647109         0.6949394
    # 2004-01-30              0.7116529      0.7427369    0.5972492         0.6849833
    # 2004-02-27              0.7580839      0.6831231    0.6555057         0.6300653
    # 2004-03-31              0.7057303      0.6172516    0.6253257         0.5233462
    # 2004-04-30              0.7583853      0.5305704    0.6740996         0.3518156
    #            Merger Arbitrage Relative Value Short Selling
    # 2003-11-28        0.6890493      0.6858798   -0.23238159
    # 2003-12-31        0.7397387      0.8062422   -0.41123820
    # 2004-01-30        0.7219990      0.7883262   -0.36321115
    # 2004-02-27        0.6862517      0.7605837   -0.28807086
    # 2004-03-31        0.5921739      0.6766063   -0.24154960
    # 2004-04-30        0.5073817      0.5257858   -0.04663322

}

# ------------------------------------------------------------------------------

rollingFunction = function (R, n, trim = TRUE, na.rm = TRUE, firstcolumn = 1, digits = 4, rf = 0, FUN = "mean", ...)
{# @author Peter Carl

    # DESCRIPTION:
    # This is a wrapper for providing n-period trailing calculations for the
    # data and functions provided.

    # Inputs:
    # R: a matrix, data frame, or timeSeries of regular period returns.
    # n: the number of periods over which a function is to be calculated. Use
    #     the value zero (0) to roll the statistic from inception
    # FUN: the function to apply to the data
    # trim:
    # na.rm:
    # firstcolumn:
    # digits: number of decimal places to round the display to.

    # Outputs:
    # A data.table of n-period trailing calculations for each column
    # in x.

    # Inspired by rollFun() written by Diethelm Wurtz.  We've extended the idea
    # to all the columns provided.

    # FUNCTION:

    data.mat = checkDataMatrix(R)
    columns=ncol(data.mat)
    # @todo: remove NA's before setting column names
    columnnames = colnames(data.mat)
    # @todo: handle empty column names
    if(is.null(columnnames))
        stop("Column names are empty.  If you are trying to pass in a timeSeries, use seriesData() rather than the explicit @Data slot.")
    rows = nrow(data.mat)
    rownames = rownames(data.mat)

    if(rows < n)
        stop("Data set is too short. Select a shorter evaluation period or provide a longer time period.  Also, check the data for NA's.")

    # for each column, do the following:
    for(column in firstcolumn:columns) {
        valueNames = vector('character', 0)
        values = vector('numeric', 0)
    x = checkDataVector(data.mat[,column])

        if(n == 0) {
            period = 0
            n = 12
        }
        else
            period = n

        for(row in n:rows) {
            subperiod = x[1:row]
            values = c(values,rollingStat(subperiod, period = period, FUN = FUN, ...))
            valueNames = c(valueNames,rownames[row])
        }
        if (!trim) {
            values = c(rep(NA, (n - 1)), values)
            valueNames = c(rownames[1:n-1],valueNames)
        }
        if(column == 1) {
            resultingtable = data.frame(Value = values, row.names = valueNames)
        }

        else {
            nextcolumn = data.frame(Value = values, row.names = valueNames)
            resultingtable = cbind(resultingtable, nextcolumn)
        }
    }
    colnames(resultingtable) = columnnames
    result = round(resultingtable, digits)
    result

    # Examples:
    # > rollingFunction(gg.ts[,1],n=3,FUN="annualizedReturn")
    #                     Manager
    # 2002-02-28           0.0306
    # 2002-03-31           0.0521
    # 2002-04-30           0.0387
    # ...
    # > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedReturn")
    #                     Manager
    # 2001-12-31               NA
    # 2002-01-31               NA
    # 2002-02-28           0.0306
    # 2002-03-31           0.0521
    # 2002-04-30           0.0387
    # ...
    # > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedSharpeRatio")
    #                     Manager
    # 2001-12-31               NA
    # 2002-01-31               NA
    # 2002-02-28           1.5302
    # 2002-03-31           4.3768
    # 2002-04-30           6.9640
    # ...
    # > rollingFunction(gg.ts[,1],n=3,trim=FALSE,FUN="annualizedSharpeRatio",rf=.03/12)
    #                     Manager
    # 2001-12-31               NA
    # 2002-01-31               NA
    # 2002-02-28           0.0298
    # 2002-03-31           1.8587
    # 2002-04-30           1.5598
    # ...
}

# ------------------------------------------------------------------------------


rollingRegression <- function(formula, data, width, ...)
{# @author Douglas Bates
 # @author modified by Peter Carl

    # DESCRIPTION:
    # This code was posted to the R-help mailing list in response to a question
    # posted by Ajay Shah.  For the full discussion, see:
    # http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg19544.html

    # @todo: make inputs consistant with other functions

    mCall = match.call()
    mCall$width = NULL
    mCall[[1]] = as.name("lm")
    mCall$x = mCall$y = TRUE # now mCall contains lm(y = TRUE, x = TRUE)
    bigfit = eval(mCall, parent.frame())
    ncoef = length(coef(bigfit))
    nr = nrow(data)
    width = as.integer(width)[1]
    stopifnot(width >= ncoef, width <= nr)
    y = bigfit$y
    x = bigfit$x
    terms = bigfit$terms
    inds = embed(seq(nr), width)[, rev(seq(width))]
    sumrys <- lapply(seq(nrow(inds)),
        function(st) {
            ind = inds[st,]
            fit = lm.fit(x[ind, , drop = FALSE], y[ind])
            fit$terms = terms
            class(fit) = "lm"
            summary(fit)
        })
    list(coefficients = sapply(sumrys, function(sm) coef(sm)[,"Estimate"]),
        Std.Error = sapply(sumrys, function(sm) coef(sm)[,"Std. Error"]),
        sigma = sapply(sumrys, "[[", "sigma"),
        r.squared = sapply(sumrys, "[[", "r.squared"))
}

# ------------------------------------------------------------------------------

################################################################################
# $Log: performance-analytics.R,v $
# Revision 1.36  2006/12/09 15:26:08  brian
# - clean up function comments some more in advance of public release
# Bug 128
#
# Revision 1.35  2006/12/06 15:07:41  brian
# - add todo items for Herfindahl, Geltner, and runs
# - remove stub notation on Sortino
# Bug 128
#
# Revision 1.34  2006/12/06 03:39:08  brian
# - add Sortino function modified from  Sankalp Upadhyay
#
# Revision 1.33  2006/12/05 21:52:32  brian
# - add modSharpe fn
# - update function list to show all functions in file
# - update copyright
# - fix indentation
# Bug 128
#
# Revision 1.32  2006/11/17 21:18:18  peter
# - added rf option to rollingFunction
#
# Revision 1.31  2006/11/07 03:36:12  peter
# - added informationRatio, trackingError, and activePremium function
#
# Revision 1.30  2006/10/31 03:58:11  peter
# - added function for rollingRegression
#
# Revision 1.29  2006/10/19 14:58:13  peter
# - fixed an issue with single-point calculation in rollingFunction when using n=0
#
# Revision 1.28  2006/10/19 13:45:53  peter
# - added "trim" option to rollingCorrelation
#
# Revision 1.27  2006/10/19 02:00:21  peter
# - rollingFunction with n=0 will trim first 12 observations
#
# Revision 1.26  2006/10/19 01:36:41  peter
# - rollingFunction now calculates statistics "from inception"
#
# Revision 1.25  2006/10/17 04:36:35  peter
# - minor modifications to rollingCorrelations function
#
# Revision 1.24  2006/10/10 04:39:48  peter
# - reverted back from na.contiguous to na.omit; not working for single column in matrix
#
# Revision 1.23  2006/10/09 13:59:37  peter
# - replaced na.omit with na.contiguous to retain time series nature of data in
# checkDataMatrix when na.rm = T
#
# Revision 1.22  2006/10/09 13:53:56  peter
# - added na.omit to dataCheckMatrix to ensure complete data sets for particular
# calculations
#
# Revision 1.21  2006/10/07 03:24:42  peter
# - improved NA handling in checkDataVector
# - added NA removal to some calculations, no longer needed
#
# Revision 1.20  2006/10/05 01:13:25  peter
# - rollingStat now passes arguements to functions with parameters
# - added a trim option to rollingFunction for lining up data in charts
#
# Revision 1.19  2006/10/04 14:13:55  peter
# - minor cleanup on RollingFunction
#
# Revision 1.18  2006/10/03 19:19:45  peter
# - fixed checkDataMatrix for the timeSeries case with missing column names
#
# Revision 1.17  2006/09/30 03:46:44  peter
# - fixed typo in annualizedSharpeRatio()
#
# Revision 1.16  2006/09/30 03:42:51  peter
# - finished applying data checks
# - fixed rollingFunction
#
# Revision 1.15  2006/09/30 00:54:21  peter
# - added checkDataVector() and applied to most calculations
#
# Revision 1.14  2006/09/29 14:19:19  peter
# - added fCalendar as an explicit required library
#
# Revision 1.13  2006/09/29 14:18:04  peter
# - fixed matrix case for checkDataMatrix
#
# Revision 1.12  2006/09/29 14:13:38  peter
# - added checkDataMatrix() to make data objects appear more fungible
#
# Revision 1.11  2006/09/28 14:53:40  peter
# - debugging rollingCorrelations(); x currently needs to be a matrix
#
# Revision 1.10  2006/09/26 21:56:55  brian
# - replace loud require directives with quiet requires
#
# Revision 1.9  2006/09/20 19:13:16  peter
# - repaired AnnualizedReturn call
#
# Revision 1.8  2006/09/20 19:08:54  peter
# - fixed annualizedSharpeRatio() calculation
#
# Revision 1.7  2006/09/20 17:08:44  peter
# Added rollingFunction() to calculate single-input metrics on a rolling period basis.
# Currently not working.
#
# Revision 1.6  2006/09/12 13:37:53  peter
# - snapshot 2006-09-11
#
# Revision 1.4  2006/09/12 13:09:50  brian
# - added omega function
# - snapshot 2006-08-26 v2
#
# Revision 1.3  2006/09/12 13:06:32  peter
# - snapshot 2006-08-23
#
# Revision 1.2  2006/09/12 12:56:39  peter
# - second revision snapshot 2006-08-17
#
# Revision 1.1 2006-09-12 07:46:18 peter
# - initial revision snapshot 2006-08-16
# Bug 128
################################################################################


More information about the R-SIG-Finance mailing list