[R] Managing output
Phil Spector
spector at stat.berkeley.edu
Thu Aug 27 00:45:17 CEST 2009
Noah -
Just allocate the maximum length that you'd ever need, and
then change the length of the vector at the end of the program.
By the way, here's a little demonstration of what a difference
pre-allocation makes:
> system.time({x <- NULL;for(i in 1:10000)x <- c(x,rnorm(1))})
   user  system elapsed
  0.584   0.000   0.588
> system.time({x <- numeric(10000);for(i in 1:10000)x[i] <- rnorm(1)})
   user  system elapsed
  0.120   0.000   0.122
The difference will be greater if you actually do something inside of
the loop.
To clarify my first point, use something like this:
> x = numeric(10000)
> j = 0
> for(i in 1:10000){
+    r = rnorm(1)
+    if(r < .1){
+       j = j + 1
+       x[j] = r
+    }
+ }
> length(x) = j
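If you want to convince yourself that the truncation keeps exactly the values you stored, here is a minimal sketch of the same pattern wrapped in a function. The name collect_below and its cutoff argument are made up for this illustration, and the draws are generated up front (rather than inside the loop as above) only so that the result can be compared with a vectorized filter.

```r
# Hypothetical helper: same preallocate-then-truncate pattern as above.
collect_below <- function(draws, cutoff = 0.1) {
  x <- numeric(length(draws))  # allocate the maximum length we could need
  j <- 0                       # number of values kept so far
  for (r in draws) {
    if (r < cutoff) {
      j <- j + 1
      x[j] <- r                # fill by subscript, never by c()
    }
  }
  length(x) <- j               # drop the unused tail
  x
}

set.seed(1)                    # arbitrary seed, only for reproducibility
draws <- rnorm(10000)
kept <- collect_below(draws)

# The loop result should match the vectorized filter, in the same order.
stopifnot(identical(kept, draws[draws < 0.1]))
```

When the whole computation can be written like the last line, the vectorized form is simpler; the loop version is for cases where each value comes out of more involved processing.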
The overallocation doesn't actually slow things down:
> system.time({x <- numeric(10000);for(i in 1:10000)x[i] <- rnorm(1)})
   user  system elapsed
  0.120   0.000   0.122
> system.time({x <- numeric(100000);for(i in 1:10000)x[i] <- rnorm(1);length(x) <- 10000})
   user  system elapsed
  0.128   0.000   0.126
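If you truly cannot put an upper bound on the number of results, one common workaround (an assumption on my part, not something timed above) is to start with a small vector and double its length whenever it fills up, then truncate at the end as before. The names grow_collect, keep and capacity below are hypothetical, and values merely stands in for whatever your loop produces.

```r
# Hypothetical helper: grow by doubling when no maximum length is known.
grow_collect <- function(values, keep, capacity = 16) {
  x <- numeric(capacity)       # small initial allocation
  j <- 0                       # number of values kept so far
  for (v in values) {
    if (keep(v)) {
      if (j == length(x)) {
        length(x) <- 2 * length(x)  # double the size; new slots hold NA
      }
      j <- j + 1
      x[j] <- v
    }
  }
  length(x) <- j               # drop the unused (NA) tail
  x
}

set.seed(2)                    # arbitrary seed, only for reproducibility
v <- rnorm(5000)
out <- grow_collect(v, function(r) r < 0.1)
stopifnot(identical(out, v[v < 0.1]))
```

The vector is then resized only a handful of times instead of once per stored value, which is the cost that c(x, new) inside the loop incurs.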
- Phil
On Wed, 26 Aug 2009, Noah Silverman wrote:
> Phil,
>
> Pre-allocation makes sense. However, I don't know the size of my resulting
> vector when starting. In my loop, I only pull off results that meet a
> certain threshold.
>
> -N
>
> On 8/26/09 2:07 PM, Phil Spector wrote:
>> Noah -
>> I would strongly advise you to preallocate the result vector
>> using numeric() or rep(), and then enter the values based on subscripts.
>> Allowing objects to grow inside of loops is one of
>> the biggest mistakes an R programmer can make.
>>
>> - Phil Spector
>> Statistical Computing Facility
>> Department of Statistics
>> UC Berkeley
>> spector at stat.berkeley.edu
>>
>>
>> On Wed, 26 Aug 2009, Noah Silverman wrote:
>>
>>> The actual process is REALLY complicated; I just gave a simple example
>>> for the list.
>>>
>>> I have a lot of steps to process the data before I get a final
>>> "score". (nested loops, conditional statements, etc.)
>>>
>>> Right now, I'm just printing the scores to the screen. I'd like to
>>> accumulate them in some kind of data structure so I can either write
>>> them to disk or graph them.
>>>
>>> -N
>>>
>>> On 8/26/09 12:27 PM, Erik Iverson wrote:
>>>> How about ?append, but R is vectorized, so why not just
>>>>
>>>> result_list<- 2*item^2 , or for more complicated tasks, the
>>>> apply/sapply/lapply/mapply family of functions?
>>>>
>>>> In general, the "for" loop construct can be avoided so you don't have to
>>>> think about messy indexing. What exactly are you trying to do?
>>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>> On Behalf Of Noah Silverman
>>>> Sent: Wednesday, August 26, 2009 2:20 PM
>>>> To: r help
>>>> Subject: [R] Managing output
>>>>
>>>> Hi,
>>>>
>>>>
>>>> Is there a way to build up a vector, item by item? In Perl, we can
>>>> "push" an item onto an array. How can we do this in R?
>>>> I have a loop that generates values as it goes. I want to end up with a
>>>> vector of all the loop results.
>>>>
>>>> In Perl it would be:
>>>>
>>>> foreach my $item (@list) {
>>>>     my $result = 2 * $item**2;    # or whatever formula; this is just a pseudo example
>>>>     push(@result_list, $result);  # this is the step I can't do in R
>>>> }
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>