[R-SIG-Finance] help (regarding block bootstrap)

Tue Mar 17 20:24:39 CET 2009

Hello.

Some time ago I wrote a seminar paper on regressing the oil price on the CDAX. There I implemented a nonparametric block bootstrap by hand, because I needed to resample pairs of blocks. I worked with sample(). The code could certainly be optimized further, but it should convey the idea of block sampling:

The example:
if series is       {3, 6, 7, 2, 1, 5}
- non-overlapping: {(3,6,7), (2,1,5)}
- overlapping:     {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}

For the overlapping case you have 4 blocks of length 3. Expressed in time indices, the blocks have the following structure:
1:3
2:4
3:5
4:6

So one has to resample the starting time indices 1:4 and, for each drawn index idx, take the observations idx:(idx+2) to grab the right data:

x <- c(3,6,7,2,1,5)

x_sample  <- numeric(4*3) #4 blocks, each of length 3
mean_boot <- numeric(10000)

for (i in 1:10000)
{
  for (j in 0:3)
  {
    idx <- sample(1:4,1,replace=TRUE) #the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
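For a series of arbitrary length, the same idea can be written without the inner loop. This is only a rough sketch: the function name block_boot_overlap and its arguments l (block length) and n_blocks (number of blocks to draw) are names I made up for illustration, not part of any package.

```r
# Sketch: overlapping (moving) block bootstrap for a series of any length.
# block_boot_overlap, l and n_blocks are illustrative names (assumptions).
block_boot_overlap <- function(x, l, n_blocks = ceiling(length(x) / l)) {
  n <- length(x)
  stopifnot(l >= 1, l <= n)
  # possible start indices are 1:(n - l + 1); draw n_blocks of them
  starts <- sample(n - l + 1, n_blocks, replace = TRUE)
  # expand each start s to s:(s + l - 1); columns of the outer() matrix are blocks
  idx <- as.vector(outer(0:(l - 1), starts, "+"))
  x[idx]
}

set.seed(1)
x <- c(3, 6, 7, 2, 1, 5)
mean_boot <- replicate(10000, mean(block_boot_overlap(x, l = 3, n_blocks = 4)))
```

With l = 3 and n_blocks = 4 this draws from the same four blocks as the loop above; mean_boot again holds the 10000 bootstrapped means.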

Next, the non-overlapping example with 2 blocks. Here we have the following time structure:
1:3
4:6

So one has to resample the start indices 1 and 4 and take the observations from each start index up to start index + 2. For a longer series one would notice that the start indices follow the sequence 3*t+1 for t = 0, 1, ..., so one only has to draw t with replacement from 0:1 (or equivalently from 1:2, using 3*(t-1)+1):

x <- c(3,6,7,2,1,5)

x_sample  <- numeric(2*3) #2 blocks of length 3
mean_boot <- numeric(10000)

for (i in 1:10000)
{
  for (j in 0:1)
  {
    idx <- sample(0:1,1,replace=TRUE) #the block number t
    idx <- 3*idx+1 #the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
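Since I mentioned resampling pairs of blocks: one way to keep paired observations together is to draw the block indices once and apply them to all series at the same time. The following is a minimal sketch under my own assumptions; block_indices and its arguments are hypothetical names, and the data frame is toy data, not the oil/CDAX series.

```r
# Sketch: draw block indices once, then use them for every series so that
# paired observations (y[t], x[t]) stay together.
# block_indices, n, l and overlapping are illustrative names (assumptions).
block_indices <- function(n, l, overlapping = TRUE) {
  stopifnot(l >= 1, l <= n)
  # admissible start indices: every index (overlapping) or every l-th index
  starts_all <- if (overlapping) seq_len(n - l + 1) else seq(1, n - l + 1, by = l)
  k <- n %/% l                                   # number of blocks to draw
  pick <- sample(length(starts_all), k, replace = TRUE)
  as.vector(outer(0:(l - 1), starts_all[pick], "+"))
}

set.seed(1)
dat <- data.frame(y = c(3, 6, 7, 2, 1, 5), x = c(10, 20, 30, 40, 50, 60))
idx <- block_indices(nrow(dat), l = 3, overlapping = FALSE)
dat_boot <- dat[idx, ]   # rows move in blocks; each y keeps its own x
```

If n is not a multiple of l, this sketch simply returns k*l observations, so the resampled series is slightly shorter than the original.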

If you wonder which block length is the right one, Politis and White (2004) have the answer: http://econ.ucsd.edu/~mbacci/white/pub_files/hwcv-093.pdf
Also pay attention to the corrections to the algorithm: http://www.economics.ox.ac.uk/members/andrew.patton/SBblockCORRECTION_jan08.pdf

Most importantly, have a look at the R implementation:
http://www.math.ucsd.edu/~politis/SOFT/PPW/ppw.R

Hope it works... it has been some time since I played around with it, but maybe it is some food for thought :)

Matthias.

--- Matthieu Stigler <matthieu.stigler at gmail.com> wrote on Tue, 17 Mar 2009:

> From: Matthieu Stigler <matthieu.stigler at gmail.com>
> Subject: Re: [R-SIG-Finance] help (regarding block bootstrap)
> To: r-sig-finance at stat.math.ethz.ch
> CC: "Yana Roth" <yana.roth at yahoo.com>
> Date: Tuesday, 17 March 2009, 14:35
> Brian G. Peterson wrote:
> > Yana Roth wrote:
> >> Hello,
> >> I am trying to do block resampling to rearrange my data, but I have not succeeded in doing the random permutation and assignment.
> >> I would like to divide the original time series into subsamples and then rearrange these subsamples randomly.
> >> The function tsboot works only if I need to check a statistic; I am interested in just rearranging the data while keeping its structure.
> >> The problem is defined as follows.
> >> 1. I define the length of a block, b.
> >> 2. Divide the original time series into blocks of length b, which gives k=n/b subsamples.
> >> 3. I need to generate a random vector of integers from 1 to k.
> >> 4. Let Z*(j), for j=1,...,k, be the j-th row of a matrix with number of rows equal to the number of blocks and number of columns equal to the number of simulations.
> >> 5. Assign to each Z*(j) the blocks according to the generated random vector (each column of the matrix is a different order of permutation).
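One possible reading of steps 1-5, as a rough sketch only: split the series into k = n/b non-overlapping blocks, draw one random permutation of 1:k per simulation, and reorder the blocks accordingly. The function name permute_blocks and the arguments b and n_sim are made up for illustration, and I am assuming that "random vector of integers from 1 to k" means a permutation without replacement.

```r
# Sketch of steps 1-5 (one interpretation, not a definitive implementation).
# permute_blocks, b and n_sim are illustrative names (assumptions).
permute_blocks <- function(x, b, n_sim) {
  k <- length(x) %/% b                            # number of complete blocks
  stopifnot(b >= 1, k >= 2)
  blocks <- matrix(x[seq_len(k * b)], nrow = b)   # column j holds block j
  # k x n_sim matrix: each column is one random ordering of the block numbers
  orders <- matrix(replicate(n_sim, sample.int(k)), nrow = k)
  # rearranged series, one per column, each of length k * b
  series <- apply(orders, 2, function(o) as.vector(blocks[, o]))
  list(orders = orders, series = series)
}

set.seed(7)
res <- permute_blocks(1:12, b = 3, n_sim = 5)
res$orders   # block order used in each simulation
res$series   # the rearranged series, one per column
```

Observations beyond k*b are dropped in this sketch. Note that this is a block permutation (every block is used exactly once), not a block bootstrap with replacement.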
> > For future reference, please provide reproducible code as per the posting guidelines. It makes it easier for others to help you. Also, please use a descriptive subject, as we all get quite a lot of mail.
> >
> > Your procedure appears incorrect.
> > Your steps 3-5 look like a homework assignment, so I'm going to ignore those and focus on the block bootstrap, which has some applicability to other members of this list in financial time series analysis.
> > 
> 
> 
> Thanks Brian for these examples!
>
> Actually, even if it is homework, I would be really interested in the answer ;-) This is a question I have always wanted answered, so maybe this is the right time to ask. I looked at the source code of tsboot() but got lost.
>
> Does anyone have an idea about how to generate block resampling with the function sample()? And with overlapping and non-overlapping blocks? That is (example taken from Maddala and Li 1998, "Bootstrapping cointegrating relationships", in Journal of Econometrics 80(2), also in their book "Unit roots, cointegration and structural change", page 328), you pick blocks:
>
> if series is {3, 6, 7, 2, 1, 5}
> -non-overlapping:  {(3,6,7), (2,1,5)}
> -overlapping:  {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}
>
> and then sample those blocks with replacement. I don't have a clear idea about how to do that in R... Thanks!
> 
> a<-1:100
> boot1<-sample(a, replace=TRUE) #block length 1, i.e. single observations are resampled
> 
> > I suspect that you simply misunderstood the "statistic" parameter of tsboot(). I expect that you do indeed intend to use the bootstrapped data to calculate one or more statistics; this is what the statistic parameter is for.
> >
> > Block bootstrapping works by randomly sampling blocks of length l from your original series. The tsboot function also applies one or more statistics to the bootstrapped data, and uses the multiple samples to calculate the bias and standard error for those statistics, providing you with a sensitivity analysis for those statistics on your data.
> >
> > Using the data series "acme" included with R, you would do something like:
> > 
> > library(boot)
> > library(PerformanceAnalytics)
> > data(acme)
> > 
> > #calculate the sensitivity of standard deviation on the data:
> > tsboot(tseries=acme[,2],statistic=sd,R=1000,l=12,sim="fixed",endcorr=FALSE,n.sim=1000)
> > # use blocks of length 12 (one year) to
> > # create 1000 bootstrapped time series
> > # each of length 1000 observations
> > 
> > #Returns:
> > #Bootstrap Statistics :
> > #      original       bias    std. error
> > #t1* 0.05362889 0.0001614213 0.001925484
> > 
> > # calculate sensitivity of VaR:
> > tsboot(tseries=acme[,2],statistic=VaR.CornishFisher,R=1000,sim="fixed",l=12,endcorr=FALSE,n.sim=1000)
> > 
> > #Returns:
> > #Bootstrap Statistics :
> > #    original      bias    std. error
> > #t1* 0.227064 0.009412978 0.007284343
> > 
> > Normally, this is what you want. The random bootstrapped series itself is not useful to you, except to calculate a statistic or statistics of interest, and understand their sensitivity. If you want the bootstrapped series returned, you can modify the code of the tsboot function to do what you want. If you want to apply your steps 3-5 to the bootstrapped data, see the documentation of tsboot() for an example of defining a function to use as the statistic parameter.
> > 
> > Regards,
> > 
> >  - Brian
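As a possible shortcut to modifying tsboot(): if the function passed as statistic simply returns its input, the bootstrapped series themselves should end up in the t component of the result. A minimal sketch, assuming the toy series from the example above and small values for R and l chosen only for illustration:

```r
library(boot)  # tsboot() is part of the boot package

x <- ts(c(3, 6, 7, 2, 1, 5))   # toy series from the example in this thread
set.seed(1)
# Using the identity function as "statistic" makes tsboot() keep each
# resampled series instead of a summary of it.
bs <- tsboot(x, statistic = function(z) z, R = 5, l = 3,
             sim = "fixed", endcorr = FALSE)
series <- bs$t   # matrix with R rows, one bootstrapped series per row
```

With sim="fixed" and endcorr=FALSE the blocks are overlapping blocks of length l whose start indices are drawn from 1:(n-l+1), so this corresponds to the overlapping case discussed above.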
> 