[R] Is R the right choice for simulating first passage times of random walks?

William Dunlap wdunlap at tibco.com
Thu Jul 28 03:56:10 CEST 2011


You can replace the inner loop
  for (i in 1:length) {
    if (s[i] < 0 && s[i + 1] >= 0) {
      z[i] = z[i] + 1
    }
  }
with the faster
  z <- z + (diff(s>=0)==1)
(Using the latter forces you fix up the length of
z to be one less than you declared -- your loop never
touched the last entry in it.)

My code relies on the factor that the logicals FALSE
and TRUE are converted to integer 0 and 1, respectively,
when you do arithmetic on them.

E.g., here is your version written as a function, f1, and
2 equivalent ones, f2 and f3, without the inner loop:
f1 <- function(n = 100000, # number of simulations
               length = 100000) # length of iterated sum
{
   z = rep(0, times=length) # orig had length+1 but last entry was never used

   for (i in 1:n) {
           x = c(0, sign(rnorm(length)))
           s = cumsum(x)
           for (i in 1:length) {
                   if (s[i] < 0 && s[i + 1] >= 0) {
                           z[i] = z[i] + 1
                   }
           }
   }
   z
}

f2 <- function(n = 100000, # number of simulations
               length = 100000) # length of iterated sum
{
   z = rep(0, times=length)
   l1 <- length+1
   for (i in 1:n) {
           x = c(0, sign(rnorm(length)))
           s = cumsum(x)
           z <- z + ( s[-l1]<0 & s[-1]>=0 )
   }
   z
}

f3 <- function(n = 100000, # number of simulations
               length = 100000) # length of iterated sum
{
   z = rep(0, times=length)

   for (i in 1:n) {
           x = c(0, sign(rnorm(length)))
           s = cumsum(x)
           z <- z + (diff(s>=0)==1)
   }
   z
}

and here are some timing and correctness tests:
  > set.seed(1) ; system.time( z1 <- f1(1000, 1e5) )
     user  system elapsed
   278.10    0.25  271.44
  > set.seed(1) ; system.time( z2 <- f2(1000, 1e5) )
     user  system elapsed
    37.29    0.84   38.02
  > set.seed(1) ; system.time( z3 <- f3(1000, 1e5) )
     user  system elapsed
    40.11    0.80   40.17
  > identical(z1, z2) && identical(z1, z3)
  [1] TRUE

You might try using sample(c(-1,1), size=length, replace=TRUE)
instead of sign(rnorm(length)).  I don't know if there would
be any speed difference, but it frees you from the worry that
rnorm() might return an exact 0.0.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Paul Menzel
> Sent: Wednesday, July 27, 2011 4:36 PM
> To: r-help at r-project.org
> Subject: [R] Is R the right choice for simulating first passage times of random walks?
> 
> Dear R folks,
> 
> 
> I need to simulate first passage times for iterated partial sums. The
> related papers are for example [1][2].
> 
> As a start I want to simulate how long a simple random walk stays
> negative, which should result that it behaves like n^(-½). My code looks
> like this.
> 
> -------- 8< -------- code -------- >8 --------
> n = 100000 # number of simulations
> 
> length = 100000 # length of iterated sum
> z = rep(0, times=length + 1)
> 
> for (i in 1:n) {
> 	x = c(0, sign(rnorm(length)))
> 	s = cumsum(x)
> 	for (i in 1:length) {
> 		if (s[i] < 0 && s[i + 1] >= 0) {
> 			z[i] = z[i] + 1
> 		}
> 	}
> }
> plot(1:length(z), z/n)
> curve(x**(-0.5), add = TRUE)
> -------- 8< -------- code -------- >8 --------
> 
> This code already runs for over half an hour on my system¹.
> 
> Reading about the for loop [3] it says to try to avoid loops and I
> probably should use a matrix where every row is a sample.
> 
> Now my first problem is that there is no matrix equivalent for
> `cumsum()`. Can I use matrices to avoid the for loop?
> 
> My second question is, is R the right choice for such simulations? It
> would be great when R can also give me a confidence interval(?) and also
> try to fit a curve through the result and give me the rule of
> correspondence(?) [4]. Do you have pointers for those?
> 
> I glanced at simFrame [5] and read `?simulate` but could not understand
> it right away and think that this might be overkill.
> 
> Do you have any suggestions?
> 
> 
> Thanks,
> 
> Paul
> 
> 
> ¹ AMD Athlon(tm) X2 Dual Core Processor BE-2350, 2,1 GHz
> 
> 
> [1] http://www-stat.stanford.edu/~amir/preprints/irw.ps
> [2] http://arxiv.org/abs/0911.5456
> [3] http://cran.r-project.org/doc/manuals/R-intro.html#Repetitive-execution
> [4] https://secure.wikimedia.org/wikipedia/en/wiki/Function_(mathematics)
> [5] http://finzi.psych.upenn.edu/R/library/simFrame/html/runSimulation.html


More information about the R-help mailing list