[R] for loop optimization help

Mon Aug 27 16:25:57 CEST 2012

On Aug 27, 2012, at 1:53 AM, Jinsong Zhao wrote:

> On 2012-08-27 9:35, David Winsemius wrote:
>>
>> On Aug 26, 2012, at 5:06 PM, Jinsong Zhao wrote:
>>
>>> Hi there,
>>>
>>> In my code, there is a for loop like the following:
>>>
>>>  pmatrix <- matrix(NA, nrow = 99, ncol = 10000)
>>>  qmatrix <- matrix(NA, nrow = 99, ncol = 3)
>>>  paf <- seq(0.01, 0.99, 0.01)
>>>  for (i in 1:10000) {
>>>      p.r.1 <- rnorm(1000, 1, 0.5)
>>>      p.r.2 <- rnorm(1000, 2, 1.5)
>>>      p.r.3 <- rnorm(1000, 3, 1)
>>>      pmatrix[,i] <- quantile(c(p.r.1, p.r.2, p.r.3), paf)
>>>  }
>>>  for (i in 1:99) {
>>>     qmatrix[i,] <- quantile(pmatrix[i,], c(0.05, 0.5, 0.95))
>>>  }
>>>
>>> Because of the number of loop is very large, e.g., 10000 here, the
>>> code is very slow.
>>
>> I would think that picking the seq(0.01, 0.99, 0.01) items in the  
>> first
>> case and the 500th, 5000th and the 9500th in the second case,  rather
>> than asking for what `quantile` would calculate,  would surely be  
>> more
>> "statistical", in the sense of choose order statistics anyway. Likely
>> much faster.
>>

Also possible that drawing the normals as 10,000 by 1,000 matrices  
(all at once, outside the loop) would save time (at the cost of added  
space requirements.

>
> Yes, you are right. Following your suggestions, the execution time  
> of `sort' is much shorter than `quantile' in the following code:
>
>  pmatrix <- matrix(NA, nrow = 99, ncol = 10000)
>  qmatrix <- matrix(NA, nrow = 99, ncol = 3)
>  paf <- seq(0.01, 0.99, 0.01)
>  for (i in 1:10000) {
>      p.r.1 <- rnorm(1000, 1, 0.5)
>      p.r.2 <- rnorm(1000, 2, 1.5)
>      p.r.3 <- rnorm(1000, 3, 1)
>      pmatrix[,i] <- sort(c(p.r.1, p.r.2, p.r.3))[paf*3000]
>  }
>  qmatrix <- pmatrix[,c(0.05, 0.5, 0.95)*10000]
>
> Thanks again.
>
> Regards,
> Jinsong

David Winsemius, MD
Alameda, CA, USA