[R-SIG-Finance] Simulate the stock market for back testing strategy ---R bootstrap function

Tim Hesterberg timh at insightful.com
Mon Feb 11 19:23:24 CET 2008


>...what functions in R can do bootstrap resampling while
>keeping the autocorrelation in the original data? (I
>only know of the function sample().) Would this resampled
>data do any good for back testing?

The two most common bootstrap approaches for time series are:
* block bootstrap
* fitting a model (e.g. ARIMA) and bootstrapping residuals/innovations
  from that model; use the bootstrapped innovations and fitted model
  to construct new series.
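
As a rough illustration of the first approach (not part of the original post), here is a minimal sketch of a block bootstrap in Python. It assumes the moving-block variant, where blocks may overlap and start at any position; the function name, the example returns, and the block length of 4 are all hypothetical.

```python
import random

def moving_block_bootstrap(series, block_length, rng=None):
    """Resample `series` by concatenating randomly chosen contiguous blocks."""
    rng = rng or random.Random()
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    last_start = n - block_length  # last index at which a full block still fits
    out = []
    while len(out) < n:
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        out.extend(series[start:start + block_length])
    return out[:n]  # trim the final block so the output matches the input length

# Hypothetical daily returns, made up for the example.
returns = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012,
           -0.004, 0.009, 0.001, -0.011, 0.006, 0.002]
resampled = moving_block_bootstrap(returns, block_length=4, rng=random.Random(1))
```

Within each block the original ordering, and hence the short-range dependence, is kept; only the joins between blocks are artificial.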

These can be combined:
- fit a model,
- block-bootstrap the residuals from that model.
The model could be simpler than would be necessary for a purely
model-based approach, as long as it captures the bulk of the
correlation structure; the block bootstrap would help capture the
rest.
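
The combined approach might be sketched as follows, in Python. This is only an illustration under stated assumptions: the model is a plain AR(1) fitted by least squares (chosen for brevity, not because the post recommends it), the new series is started at the first observed value, and all function names and the synthetic data are hypothetical.

```python
import random

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi, residuals)."""
    prev, curr = x[:-1], x[1:]
    m = len(prev)
    mean_prev = sum(prev) / m
    mean_curr = sum(curr) / m
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    residuals = [q - c - phi * p for p, q in zip(prev, curr)]
    return c, phi, residuals

def block_resample(values, block_length, rng):
    """Concatenate randomly chosen contiguous blocks until len(values) is reached."""
    n = len(values)
    out = []
    while len(out) < n:
        start = rng.randint(0, n - block_length)
        out.extend(values[start:start + block_length])
    return out[:n]

def rebuild(x0, c, phi, innovations):
    """Run the fitted AR(1) recursion forward from x0 using the given innovations."""
    new = [x0]
    for e in innovations:
        new.append(c + phi * new[-1] + e)
    return new

def ar1_block_bootstrap(x, block_length, rng=None):
    """Fit AR(1), block-bootstrap its residuals, and construct a new series."""
    rng = rng or random.Random()
    c, phi, residuals = fit_ar1(x)
    innovations = block_resample(residuals, block_length, rng)
    return rebuild(x[0], c, phi, innovations)  # starting at x[0] is an assumption

# Synthetic AR(1) data, made up for the example.
rng = random.Random(42)
x = [0.0]
for _ in range(199):
    x.append(0.5 * x[-1] + rng.gauss(0, 1))

c, phi, residuals = fit_ar1(x)
boot = ar1_block_bootstrap(x, block_length=14, rng=rng)
```

Feeding the original residuals back through rebuild() in their original order reproduces x, which is a handy check that the fit and the reconstruction agree.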

Another alternative is the matched-block bootstrap:
Edward Carlstein, Kim-Anh Do, Peter Hall, Tim Hesterberg, and Hans
R. Kuensch, "Matched-Block Bootstrap for Dependent Data", Bernoulli,
4(3), 1998, 305-328.
A tech report with more details is:
Hesterberg, Tim C. (1997), "Matched-Block Bootstrap for Long Memory
Processes", Technical Report No. 66, Research Department, MathSoft,
Inc. 1700 Westlake Ave. N., Suite 500, Seattle, WA 98109.
http://www.insightful.com/Hesterberg/articles/tech66-matchBlock.pdf

Dirk Eddelbuettel mentioned a block bootstrap with blocks of varying
length, 1 to 6.  In general, a block bootstrap with a fixed block
length is better than one with random lengths.  The main shortcoming
of a block bootstrap is that it loses correlations at block
boundaries.  With a fixed block length of, say, 10, you lose
1/10 of the first-order autocorrelations or other dependencies, 2/10
of the second order, etc., and all of the autocorrelations of order 10
or higher.  With random block lengths, say 1-19, you lose more
low-order autocorrelations, while saving some of the autocorrelations
of order 10-18 - but these are typically less important.
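
The counting argument for a fixed block length can be checked with a few lines of Python. This sketch only covers the fixed-length case; the function name is hypothetical. It counts, among the pairs (t, t + lag) that start inside one block, how many reach past the end of that block.

```python
from fractions import Fraction

def fraction_lost(lag, block_length):
    """Share of lag-`lag` pairs starting in a block that cross its boundary."""
    crossing = sum(1 for start in range(block_length)
                   if start + lag >= block_length)
    return Fraction(crossing, block_length)

for lag in (1, 2, 5, 10, 11):
    print(lag, fraction_lost(lag, 10))
# prints:
# 1 1/10
# 2 1/5
# 5 1/2
# 10 1
# 11 1
```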

Also, block lengths should be longer than that.  A quick rule of thumb
is sqrt(n), where n is the length of the series.  There are more
rigorous rules, but they are data-dependent.

I would be quite leery of a block bootstrap for back testing stock
trading.  Losing dependence at block boundaries results in a
constructed series that may differ substantially from reality.

You also need to beware of reproducing historical artifacts by doing
simple bootstraps.  If Ford lost value during the historical period,
and you do simple bootstraps without adjustment, then Ford will tend
to lose value in your bootstrap samples.  Other stocks that went up
historically would gain in the bootstrap samples.  This can distort
the evaluation of trading strategies.
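
The post does not say which adjustment to use. One simple possibility, assumed here purely for illustration, is to subtract each stock's historical mean return before resampling, so that the bootstrap series carry no built-in drift from the sample period. The function name and the numbers below are hypothetical.

```python
def center_returns(returns):
    """Subtract the sample mean so the resampled returns have zero average drift."""
    mean = sum(returns) / len(returns)
    return [r - mean for r in returns]

# Hypothetical returns for a stock that lost value over the historical period.
historical = [-0.03, 0.01, -0.02, -0.015, 0.005, -0.01]
centered = center_returns(historical)
```

Whether zero drift is the right target (rather than, say, a market-wide or risk-adjusted mean) depends on the strategy being evaluated.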

I discuss some of this in the bootstrap short course I teach.

Tim Hesterberg

========================================================
| Tim Hesterberg       Senior Research Scientist       |
| timh at insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download S+Resample from www.insightful.com/downloads/libraries

Advanced Programming in S-PLUS: San Antonio TX, March 26-27, 2008.
Bootstrap Methods and Permutation Tests: San Antonio, March 28, 2008.


