\documentclass[a4paper]{report} \usepackage[utf8]{inputenc} \usepackage[T1]{fontenc} \usepackage{RJournal} \usepackage{amsmath,amssymb,array} \usepackage{booktabs} \usepackage{float} \usepackage{Sweave} \usepackage[parfill]{parskip} \usepackage[round]{natbib} %\VignetteEngine{utils::Sweave} %\VignetteIndexEntry{backtestGraphics} %\VignetteEncoding{UTF-8} \begin{document} \SweaveOpts{concordance=FALSE} %% do not edit, for illustration only \sectionhead{Contributed research article} \volume{XX} \volnumber{YY} \year{20ZZ} \month{AAAA} \begin{article} \title{Backtest Graphics} \author{by Yanrong Song, Zijie Zhu, David Kane, Ziqi Lu, Karan Tibrewal, Fan Zhang} \maketitle \abstract{ The \pkg{backtestGraphics} package provides an interactive graphical interface to visualize backtests results for a variety of financial instruments (equities, futures, credit default swaps, et cetera).The \pkg{Shiny} interface, returned by the package, contains a sidebar panel with summary and detailed performance statistics, such as number of instruments, cumulative and annualized profit and loss, Sharpe ratio, top three drawdowns, and the three best and worst performers in the backtest. The main panel of interface houses interactive plots of profit and loss (P\&L), net market value (NMV) and gross market value (GMV). Lastly, the package also supports vizualization of multiple strategies, sub-strategies, and overlapping portfolios that may be present in a single backtest result. } \section{Introduction} Backtesting is the process of testing trading strategies on prior time horizons to measure the effectiveness of the given strategy. It helps investors understand and optimize their trading strategies [\cite{backtest}]. The \pkg{backtestGraphics} package provides an interactive \pkg{Shiny} interface to visualize backtest results for a variety of financial instruments (equities, futures, and credit defualt swaps, among others) [\cite{shiny}]. These visualizations enable users to employ their human perception to process a lot of backtest data quickly and approximately [\cite{bostock}]. \noindent It is important to note here that \pkg{backtestGraphics} doesn't run backtests, but instead provides graphical visualization of backtest results. To illustrate this subtle distinction, consider a simple trading strategy that buys the top 10 shares of the S\&P 500, and shorts the bottom 10. Now, to test how the strategy performs historically, the user may use backtesting software like \emph{Quantopian}, or even R packages like \pkg{backtest} [\cite{backtest}].The output from such backtesting is generally in the form of large data frames which are difficult to interpret. As of such, the user may feed these data frames, i.e. the holding data, to \pkg{backtestGraphics}. The package then constructs interactive \pkg{dygraph} plots, and calculates essential summary statistics which are easy to interpret and explore [\cite{dygraphs}]. \noindent The \pkg{Shiny} interface returned by the package consists of a sidebar panel that includes a "Summary" and a "Detail" tab. The former provides the user with summary statistics, such as average gross market value (GMV), net market value (NMV), number of instruments, cumulative and annualized profit and loss (P\&L), Sharpe ratio and best and worst performing months. The "Detail" tab provides information about the best and worst performers, as well as the biggest drawdowns. \noindent The main panel of the \pkg{Shiny} interface houses interactive plots for cumulative and point-in-time P\&L, NMV, GMV and number of contracts. The interactive nature of these plots provide the user with additional flexibility in exploring her backtest results by enabling her to zoom into specific time periods, or learn the value of a given response variable on a specific date. \noindent Additionally, to accomodate for more complex backtests, \pkg{backtestGraphics} allows the user to subset seamlessly between overlapping portfolios, and multiple strategies and sub-strategies. For example, suppose the user splits his portfolio into two halves: the first half trades using the aforementioned trading strategy on a weekly basis, while the second uses the same strategy on a bi-weekly basis. Although these strategies overlap every other week, the user may want to explore how a particular strategy does in isolation. In order to do so, the user simply has to select the appropriate options from the dropdown menus provided in the side bar panel. \section{Data} The package comes with three sample data frames: \code{commodity}, \code{equity}, and \code{credit}. These are backtest results for commodity futures, equities, and credit default swaps (CDS), respectively. We use these data frames to demonstrate the capabilities of the backtestGraphics package. (Note: The user may also use her own data frame, as long as it adheres to the format specified in the documentation). \subsection{Equity} Let us begin with some details of \code{equity}. Equities are financial instruments that represent an ownership claim entitling the holder to dividends and ownership rights of the company for which the equities are issued [\cite{equities}].The \code{equity} data frame contains holding data for a trading strategy dealing with equities. <>= library(backtestGraphics) library(dplyr) ## use the following dplyr code to show a more interesting section of ## the data frame df <- equity %>% filter(name %in% unique(equity$name)[1:3], date %in% as.Date(c("2005-05-02", "2005-05-03", "2005-05-04"))) %>% arrange(date, name) print.data.frame(df, row.names = FALSE) @ \begin{itemize} \item{\code{name:} column contains the each stock's symbol.} \item{\code{date:} column contains the trading date. (Note: this column must be of \code{Date} type)}. \item{\code{sector:} column contains the sector of each stock.} \item{\code{nmv:} column contains the NMV at the beginning of the trading day for each stock. (Note: all values must be converted to the same currency). } \item{\code{pnl:} column contains P\&L of each stock. (Note: all values must be converted to the same currency). } \end{itemize} \subsection{Commodity} \noindent Now, let us look at \code{commodity}. Simply put, a commodity futures contract is a legally binding agreement that provides for the delivery of the specified value of the said commodity at a specific time period in the future [\cite{futures}]. The \code{commodity} data frame consists of holding data from a trading strategy dealing in such commodity futures. <>= ## use the following dplyr code to show a more interesting section of ## the data frame df <- (commodity %>% filter(id %in% c("CO", "FC", "W"), date %in% as.Date(c("2003-02-03", "2003-02-07"))) %>% arrange(id, name))[c(1,6,15,16,17),] print.data.frame(df, row.names = FALSE) @ The columns \code{name}, \code{date}, \code{sector}, \code{nmv}, and \code{pnl} have the same meaning as the ones in \code{equity}. The remaining columns are listed below: (Note: these columns are optional). \begin{itemize} \item{\code{ID:} column contains ID information for different commodities. } \item{\code{strategy:} column contains the strategy number. } \item{\code{substrategy:} column contains the sub-strategy number. } \item{\code{gmv:} column contains the GMV of a given commodity at the start of each trading day (Note: all values must be converted to the same currency). } \item{\code{contract:} column contains the number of contracts of a given commodity at the start of each trading day. } \end{itemize} \subsection{Credit} Finally, we look at \code{credit}. Credit Default Swaps (CDS) are a type of credit derivative that transfers the credit risk from one group of investors to another, in exchange for payment [\cite{CDS}]. The \code{credit} data frame contains holding data from a trading strategy dealing in such CDS. <>= df <- credit %>% filter(name %in% unique(credit$name)[1:3], date %in% unique(credit$date)[1:3]) %>% arrange(date, name) print.data.frame(df, row.names = FALSE) @ \noindent Most of the columns here are the same as those in e\code{equity} and \code{commodity}. Notice that the \code{strategy} column here is used to denote backtest results for different trading frequencies. Towards this, the user must provide all backtest observations for each instrument, even when the position on the instrument is held and unchanged. This is because \pkg{backtestGraphics} uses the time gaps between each observation to determine the trading frequencies, and to calculate the annualized statistics. \section{Using the \code{backtestGraphics} Interface} In this section, we discuss the functionality of various components of the \pkg{Shiny} interface returned by the \pkg{backtestGraphics} function. To use backtestGraphics, the user is required to pass in a data frame with \code{date}, \code{ID/name}, \code{NMV} and \code{P\&L} columns to the \pkg{backtestGraphics} function. Additionally, the user can also pass in optional columns, such as \code{sector}, \code{GMV}, \code{contracts}, \code{strategy}, \code{substrategy}, and \code{portfolio}. \noindent As an example, let us look at the interface for the \code{commodity} data frame. Type the following command, and click the ``Visualize" button on the Shiny interface returned by the function call. \begin{example} library(backtestGraphics) backtestGraphics(x = commodity) \end{example} \begin{figure}[H] \centering \includegraphics[width = \textwidth, height = 2.5in]{img/overallscreen.png} \caption{The \code{backtestGraphics} interface for the \code{commodity} data frame. The interface houses three dropdown menus on the top-left to toggle between Strategies, Portfolios, and Instruments. The user may click on the ``Vizualize" button to look at the summary statistics and the interactive plots. Finally, the user can also switch between plots with the radio buttons at the bottom of the plots. } \end{figure} Now, let us individually examine the various components of the Shiny interface: \subsection{Dropdown Menus} The interface contains three dropdown menus on the top-left to toggle between different Strategies, Portfolios, and Instruments. By selecting different combinations of these menus, the user can visualize different subsets of strategies, portfolios and instruments. \noindent If the given data frame does not have a \code{strategy} column, or a \code{portfolio} column, the respective dropdown menus will be fixed to ``Strategy Summary" and ``Portfolio Summary". If the user selects a combination that is incompatible with the given data frame, the interface will generate an error message and prompt the user to select a different combination. \noindent Figure 2 provides a screenshot of the interface's dropdown menus. \begin{figure}[H] \centering \includegraphics[width = \textwidth, height = 3in]{img/dropdown.png} \caption{Dropdown menus: The user can subset her backtest results along different strategies, portfolios and instruments by selecting appropriate choices from the dropdown menus.} \end{figure} \subsection{The ``Summary" Tab} The ``Summary" tab is present in the left-panel of the \code{Shiny} interface. It provides summary statistics for the backtest result. Figure 3 contains a screenshot of the ``Summary" tab for our commodity data frame, along with corresponding documentation. \begin{figure}[H] \centering \includegraphics[width = 3in, height = 4in]{img/summary.png} \caption{The ``Summary" tab for the \code{commodity} data frame. On clicking the ``Visualize" button, the function slices the data frame according to inputs in the drop down menus to calculate summary statistics for the specified data subset. } \end{figure} Documentation of summary statistics: \begin{itemize} \item{Start and End date : The backtest period.} \item{Allocated Capital : The amount of capital allocated to the portfolio. We use this number to calculate return on allocated capital. If the user does not specify this value, the corresponding entry will be \code{NULL} and `backtestGraphics` will use the highest GMV to calculate return on allocated capital. } \item{Average GMV : Average GMV of the portfolio over the backtest period. GMV is calculated by taking the absolute value of NMV of each instrument. } \item{Number of Instruments : The number of different instruments in the given subset of the data frame.} \item{Cumulative P\&L : Sum of all P\&L over the backtest period.} \item{Annualized P\&L : This is calculated as the Average P\&L times an annualization factor. Our package observes the date interval between observations in the data frame to determine the trading frequency. The assumed annualization factor of daily trading is $\sqrt{252}$, while that for other trading frequency is $\sqrt{\frac{365}{date \ gap}}$.} \item{Annualized P\&L volatility : The standard deviation of all annualized P\&L's.} \item{Annualzied Return : Annualized average return on allocated capital. We calculate average return rates by dividing P\&L by the allocated capital amount. The average return rates are then annualized by multiplying them with the annualization factor. } \item{Sharpe Ratio : A measure of risk-adjusted return. We calculate this by dividing the mean P\&L by the standard deviation of P\&L. We assume that $\text{risk-free rate} = 0$.} \item{Best Month and Worst Month : The month with the highest P\&L, and the month with the lowest P\&L} \end{itemize} \subsubsection{The ``Detail" Tab} The ``Detail" tab is also present in the left-panel of the \code{Shiny} interface. It provides information about the best and worst performers, as well as the biggest drawdowns. Figure 4 displays a screenshot of the ``Detail" tab for our \code{commodity} data frame, along with corresponding documentation. \begin{figure}[H] \centering \includegraphics[width = 2.5in, height = 3.5in]{img/detail.png} \caption{``Detail" tab with the \code{commodity} data frame. } \end{figure} \noindent Documentation of detailed statistics: \begin{itemize} \item{Top three drawdowns : The three biggest declines in P\&L from peak to trough. The table contains the start and end dates of the drawdowns, along with the actual value of the drawdowns. } \item{Best and worst three performers : The three best performers are the instruments with the highest three cumulative P\&L's. Similarly, the three worst performers are the instruments with the lowest three cumulative P\&L's. (Note: even if the user is looking at a subset of individual instruments, we will display information about the best and worst performers across all instruments. However, if the user is looking at data from a specific sector, we will display information about the best and worst performers in the specified sector). } \end{itemize} \subsection{Plots} The right-panel of our \code{Shiny} interface provides interactive plots for cumulative and point-in-time P\&L, NMV, GMV and number of contracts. \noindent Figure 5 provides a screenshot of the plots. \begin{figure}[H] \centering \includegraphics[width = 4in, height = 3in]{img/plots.png} \caption{A screenshot of the plots with the \code{commodity} data frame. The P\&L plots is displayed on the top, while the market value plots are on the bottom. Radio buttons at the bottom of the plots allow the user to seamlessly switch between plots. } \end{figure} \noindent We have made the plots interactive using the \pkg{dygraphs} package [\cite{dygraphs}]. The user can zoom in to a specific time periods in a plot by dragging the corresponding region with the mouse, or changing the time slider bar, which is located below the plots. To go back to the original time scale, the user can merely double click on the respective plot. Additionally, by hovering the mouse over the plot, the user can get specific values of the response variable of the plot, on the specific date. The color of the plots is based on the sign of the point-in-time P\&L. Green bars represent profits, while the red bars represent losses. \section{Conclusion} The \pkg{backtestGraphics} package provides a simple \code{Shiny} interface to visualize backtest results. Inside the package, we have provided three sample data frames. The user can use the code provided in this document to test the package with these data frames. \noindent Our interface provides the user with summary, as well as detailed statistics about the backtest result. The user can subset her data according to strategies, sub-strategies, and instruments. Additionally, we also provide interactive plots that show movement of important variables over the backtest period. \noindent These are powerful tools to uncover trends and anomalies in the data. By decomposing the data frame into different sectors and instruments, the user can gain important insights into his trading strategies over different sectors and time periods. The interactive plots provide additional flexibility to dive into the nitty-gritty details of the backtest. \noindent Going forward, the \pkg{backtestGraphics} package can easily be customized to suit the needs of a specific user. Such customization may include, among others, adding new response variables to plots, redesigning plots, or introducing additional summary statistics. We invite users to take advantage of this, and redesign the \pkg{backtestGraphics} interface according to their preferences. \pagebreak \section{Authors} \address{Yanrong Song\\ Financial Engineering\\ Columbia University \\ New York, NY, USA\\} \email{yrsong129@gmail.com} \address{Zijie Zhu\\ Computer Science and Economics \\ Williams College\\ Williamstown, MA, USA\\} \email{zijie.miller.zhu@gmail.com} \address{David Kane\\ Haravard University \\ IQSS\\ 1737 Cambridge Street\\ CGIS Knafel Building, Room 350\\ Cambridge, MA, USA\\} \email{dave.kane@gmail.com} \address{Ziqi Lu\\ Economics and Mathematics\\ Williams College \\ Williamstown, MA, USA\\} \email{ziqi.lu@williams.edu} \address{Karan Tibrewal\\ Mathematics and Computer Science \\ Williams College\\ Williamstown, MA, USA\\} \email{karan.tibrewal@williams.edu} \address{Fan Zhang\\ Economics and Statistics \\ Williams College\\ Williamstown, MA, USA\\} \email{fan.zhang@williams.edu} \bibliography{backtestGraphics} \end{article} \end{document}