[R] need technique for speeding up R dataframe individual element insertion (no deletion though)
jim holtman
jholtman at gmail.com
Thu Aug 13 14:25:01 CEST 2009
First of all, do the strptime conversions once, outside the loop;
strptime is vectorised, so it can convert the whole column in a single
call. I would guess that if you ran Rprof on the code, most of the
time is in strptime -- did you run Rprof?
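
A minimal sketch of the one-time conversion, assuming a small made-up
data frame in place of the real 'cam' (the values are illustrative
only). It uses as.Date() rather than strptime(); that is a substitution
made here because a Date vector is simple to index, and calling
strptime() on the whole column would serve the same purpose. The file
name "loop.out" in the profiling comment is hypothetical.

```r
# Made-up stand-in for 'cam'; the real data has ~20K rows and 14 columns.
cam <- data.frame(
  end_date = c("01/08/2009", "02/08/2009", "02/08/2009", NA, "05/08/2009"),
  vol      = c(10, 10, 12, 12, 12),
  stringsAsFactors = FALSE
)

# Convert the whole column in one call, before any loop starts.
# Unparseable or missing entries come back as NA.
end_dates <- as.Date(cam$end_date, format = "%d/%m/%Y")

# Inside a loop, end_dates[i] and end_dates[i + 1] can then be compared
# directly, with no further parsing per iteration.

# To check where the time goes, wrap the slow section like this:
#   Rprof("loop.out")          # start profiling (hypothetical file name)
#   ... run the loop ...
#   Rprof(NULL)                # stop profiling
#   summaryRprof("loop.out")   # time spent per function
```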
Also, you are going through the loop one too many times: your ending
value is 'length(cam$end_date)', but the body indexes one greater
than that in 'x2=strptime(cam$end_date[i+1], "%d/%m/%Y");', so the
last iteration reads past the end of the column.
FYI -- you don't need the semicolons at the end of the statements.
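
A sketch of the loop with the end value corrected to n - 1, followed by
a vectorised equivalent that compares each row with the next in one
step. The data frame is made up for illustration, the 'levels' column
is assumed to exist already with a default of 0, and the names d, v,
lev, lev_vec and hit are hypothetical. Writing into a plain vector and
assigning it back once, instead of updating cap$levels[i] on every
iteration, is an extra assumption here about where time is lost; Rprof
would confirm or refute it.

```r
# Made-up stand-in for 'cam'.
cam <- data.frame(
  end_date = c("01/08/2009", "02/08/2009", "02/08/2009", NA, "05/08/2009"),
  vol      = c(10, 10, 12, 12, 12),
  stringsAsFactors = FALSE
)
cam$levels <- 0  # assumption: the real data already has this column

d <- as.Date(cam$end_date, format = "%d/%m/%Y")  # converted once
v <- cam$vol
n <- nrow(cam)

# Loop version: stop at n - 1 so that i + 1 never runs past the last row.
lev <- cam$levels
for (i in seq_len(n - 1)) {
  ok <- !is.na(d[i]) && !is.na(d[i + 1]) && !is.na(v[i]) && !is.na(v[i + 1])
  if (ok && d[i + 1] >= d[i] && v[i] == v[i + 1]) {
    lev[i] <- 1
    lev[i + 1] <- 1
  }
}

# Vectorised version: the same test applied to all adjacent pairs at once.
i <- seq_len(n - 1)
hit <- which(
  !is.na(d[i]) & !is.na(d[i + 1]) & !is.na(v[i]) & !is.na(v[i + 1]) &
    d[i + 1] >= d[i] & v[i] == v[i + 1]
)                                # positions i where the pair (i, i + 1) matches
lev_vec <- cam$levels
lev_vec[c(hit, hit + 1)] <- 1    # mark both rows of every matching pair

# Assign the result back to the copy in a single step.
cap <- cam
cap$levels <- lev_vec
```

Both versions should mark the same rows, which can be checked with
identical(lev, lev_vec).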
On Thu, Aug 13, 2009 at 8:07 AM, Ishwor<ishwor.gurung at gmail.com> wrote:
> Hi fellas,
>
> I am working on a dataframe cam and it involves comparison within the
> 2 columns - t1 and t2 on about 20K rows and 14 columns.
>
> ###
> cap = cam; # this doesn't take long. ~1 secs.
>
>
> for( i in 1:length(cam$end_date))
> {
>   x1=strptime(cam$end_date[i], "%d/%m/%Y");
>   x2=strptime(cam$end_date[i+1], "%d/%m/%Y");
>
>   t1= cam$vol[i];
>   t2= cam$vol[i+1];
>
>   if(!is.na(x2) && !is.na(x1) && !is.na(t1) && !is.na(t2))
>   {
>     if( (x2>=x1) && (t1==t2) ) # date and vol
>     {
>       cap$levels[i]=1; #make change to specific dataframe cell
>       cap$levels[i+1]=1;
>     }
>   }
> }
> ###
>
> Having coded that, I ran a timing profile on this section, and each
> block of 1000 row comparisons is taking ~1.1 minutes on a 2.8 GHz
> dual-core box (which is a test box we use).
> This obviously computes to ~21 minutes for 20k, which is definitely not
> where we want it headed. I believe optimisation (or even a different way
> to address indexing inside the dataframe) can be had inside the innermost
> `if' and specifically in `cap$levels[i]=1;', but I am a bit at a loss,
> having scoured the documentation and failed to find anything of value.
> So, my question remains: are there any general/specific changes I can
> make to speed up the code execution dramatically?
>
> Thanks folks.
>
> --
> Regards,
> Ishwor Gurung
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?