[R] need technique for speeding up R dataframe individual element insertion (no deletion though)
Bill.Venables at csiro.au
Thu Aug 13 14:44:58 CEST 2009
Why do you need an explicit loop at all?
(Also, your loop goes over i in 1:length(cam$end_date) but your code refers to cam$end_date[i+1], which runs one past the end of the column on the last iteration!!)
Here is a suggestion. You want to identify places where the date increases but the volume does not change. OK, where?
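ind <- with(cam, {
    ## change in date between consecutive rows (length nrow - 1)
    dx <- as.numeric(diff(strptime(end_date, "%d/%m/%Y")))
    ## change in volume between consecutive rows
    dt <- diff(vol)
    ## rows i where the date goes up and the volume stays the same;
    ## use dx >= 0 instead to match the x2 >= x1 test in your loop
    which(dx > 0 & dt == 0)
})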
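To see what ind picks out, here is a small sketch on made-up data. The toy values are assumptions for illustration only; just the column names end_date and vol come from the original post.

```r
# Toy data, made up for illustration; column names follow the original post.
cam <- data.frame(
  end_date = c("01/08/2009", "02/08/2009", "03/08/2009",
               "03/08/2009", "05/08/2009"),
  vol      = c(10, 10, 12, 12, 12),
  stringsAsFactors = FALSE
)

ind <- with(cam, {
  # Date change between row i and row i + 1.
  dx <- as.numeric(diff(strptime(end_date, "%d/%m/%Y")))
  # Volume change between row i and row i + 1.
  dt <- diff(vol)
  # Keep rows where the date strictly increases and the volume is unchanged.
  which(dx > 0 & dt == 0)
})

print(ind)
```

Each entry of ind is the first row of a matching pair, so the pair itself is rows ind and ind + 1. Rows 3 and 4 share a date here, so they are skipped by dx > 0 but would be picked up by dx >= 0.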
Now adjust the new data frame:
cap <- within(cam, {
    levels[ind] <- 1
    levels[ind+1] <- 1
})
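One way to check the idea is to run it against the explicit loop on a few rows. The sketch below uses made-up data, assumes a levels column already exists in cam, and uses dx >= 0 so that it matches the x2 >= x1 test in the original loop. The function names loop_version and vectorised_version are hypothetical, introduced only for this comparison.

```r
# Toy data, made up for illustration; a levels column is assumed to exist.
cam <- data.frame(
  end_date = c("01/08/2009", "02/08/2009", "03/08/2009",
               "03/08/2009", "05/08/2009", NA),
  vol      = c(10, 10, 12, 13, 13, 13),
  levels   = 0,
  stringsAsFactors = FALSE
)

# Reference: the explicit loop from the original post, stopping at n - 1.
loop_version <- function(cam) {
  cap <- cam
  for (i in seq_len(nrow(cam) - 1)) {
    x1 <- strptime(cam$end_date[i], "%d/%m/%Y")
    x2 <- strptime(cam$end_date[i + 1], "%d/%m/%Y")
    t1 <- cam$vol[i]
    t2 <- cam$vol[i + 1]
    if (!is.na(x1) && !is.na(x2) && !is.na(t1) && !is.na(t2)) {
      if (x2 >= x1 && t1 == t2) {
        cap$levels[i] <- 1
        cap$levels[i + 1] <- 1
      }
    }
  }
  cap
}

# Vectorised: one pass with diff(), then a single indexed assignment.
vectorised_version <- function(cam) {
  ind <- with(cam, {
    dx <- as.numeric(diff(strptime(end_date, "%d/%m/%Y")))
    dt <- diff(vol)
    which(dx >= 0 & dt == 0)  # which() drops comparisons that are NA
  })
  within(cam, {
    levels[ind] <- 1
    levels[ind + 1] <- 1
  })
}

res_loop <- loop_version(cam)
res_vec  <- vectorised_version(cam)
print(res_vec$levels)
stopifnot(identical(res_loop$levels, res_vec$levels))
```

The loop parses two dates and rewrites the data frame on every iteration, while the vectorised version parses the whole column once and assigns once, which is where the saving should come from on 20K rows.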
Of course this is untested code, so caveat emptor!
Bill Venables.
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Ishwor [ishwor.gurung at gmail.com]
Sent: 13 August 2009 22:07
To: r-help at r-project.org
Subject: [R] need technique for speeding up R dataframe individual element insertion (no deletion though)
Hi fellas,
I am working on a data frame, cam, of about 20K rows and 14 columns.
It involves comparing consecutive rows on 2 columns, end_date and vol
(x1, x2 and t1, t2 in the code below).
###
cap = cam; # this doesn't take long. ~1 sec.
for( i in 1:length(cam$end_date))
{
    x1=strptime(cam$end_date[i], "%d/%m/%Y");
    x2=strptime(cam$end_date[i+1], "%d/%m/%Y");
    t1= cam$vol[i];
    t2= cam$vol[i+1];
    if(!is.na(x2) && !is.na(x1) && !is.na(t1) && !is.na(t2))
    {
        if( (x2>=x1) && (t1==t2) ) # date and vol
        {
            cap$levels[i]=1; #make change to specific dataframe cell
            cap$levels[i+1]=1;
        }
    }
}
###
Having coded that, I ran a timing profile on this section, and every
1000 row comparisons take ~1.1 minutes on a 2.8 GHz dual-core
box (which is a test box we use).
This works out to ~21 minutes for 20K rows, which is definitely not
where we want it headed. I believe an optimisation (or even a different
way to address indexing inside a data frame) can be had inside the
innermost `if', specifically in `cap$levels[i]=1;', but I am a bit at a
loss, having scoured the documentation without finding anything of value.
So my question remains: are there any general or specific changes I can
make to speed up the code execution dramatically?
Thanks folks.
--
Regards,
Ishwor Gurung
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.