[R] code optimization problem ... using or not using "which" function

Juan Carlos Laguardia brassman785 at gmail.com
Sat May 30 01:17:23 CEST 2009


Hello all,

I have two data sets that share certain fields of interest (facility,
unit, date). I want to match rows on these fields, then extract
information from one data set and store it in the other.

My first idea (which I know is bad) goes like this:

## capacity and new_trayloc are the two data sets in this example code:

for (i in seq_len(nrow(new_trayloc))) {

  ## rows of capacity on the admit date with the same unit and facility
  theshifts <- which(as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i] &
        as.character(capacity$unit) == as.character(new_trayloc$UNIT_1[i]) &
        as.character(capacity$fac_id) == as.character(new_trayloc$ORIG_FAC_ID[i]))

  ## same match, but for the day before the admit date
  thenightshifts <- which(as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i] - 1 &
        as.character(capacity$unit) == as.character(new_trayloc$UNIT_1[i]) &
        as.character(capacity$fac_id) == as.character(new_trayloc$ORIG_FAC_ID[i]))

  ## ..... obtain information by using theshifts and thenightshifts objects
  ## and store in new_trayloc

}

Running system.time on the entire for loop for 5 iterations, I get a
time of
 user  system elapsed
  25.66    1.04   26.72

That seems really bad, and on top of that I need to run it for over 100,000 iterations.
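
To put that in perspective, here is a rough extrapolation from the timing
above. It assumes the cost per iteration stays constant (an assumption, not
something that was measured), and it uses exactly 100,000 iterations as a
lower bound:

```r
# Elapsed time reported by system.time for 5 iterations (from the output above).
elapsed_5 <- 26.72

# Assumption: every iteration costs about the same, so time scales linearly.
per_iter <- elapsed_5 / 5            # seconds per iteration

n_iter     <- 100000                 # lower bound on the number of iterations
total_sec  <- per_iter * n_iter      # projected total, in seconds
total_days <- total_sec / (60 * 60 * 24)

cat(sprintf("about %.1f s per iteration, roughly %.1f days in total\n",
            per_iter, total_days))
```

That works out to a little over 5 seconds per iteration and on the order of
six days for the full run.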

Any suggestions on either the way I match the fields or my overall
approach to the problem?
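
For concreteness, here is a minimal sketch of one possible direction, on
made-up toy data. Only the column names come from the loop above; the values,
the helper make_key, and the "|" separator are assumptions for illustration
(the separator must not occur in the unit or facility values). The idea is to
do the type conversions once, build one key per (date, unit, facility)
combination, and look rows up by key instead of scanning capacity on every
iteration:

```r
# Toy data, invented for illustration; only the column names come from the post.
capacity <- data.frame(
  shift_dt = c("2009-05-01", "2009-05-01", "2009-04-30", "2009-05-02"),
  unit     = c("A", "B", "A", "A"),
  fac_id   = c("10", "10", "10", "20"),
  stringsAsFactors = FALSE
)
new_trayloc <- data.frame(
  admit_dt    = as.Date(c("2009-05-01", "2009-05-02")),
  UNIT_1      = c("A", "A"),
  ORIG_FAC_ID = c("10", "20"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one string key per (date, unit, facility) combination.
make_key <- function(date, unit, fac) {
  paste(format(date), as.character(unit), as.character(fac), sep = "|")
}

# Convert and key the capacity rows once, outside any loop.
cap_key <- make_key(as.Date(capacity$shift_dt), capacity$unit, capacity$fac_id)

# Row numbers of capacity grouped by key (a named list), built once.
rows_by_key <- split(seq_len(nrow(capacity)), cap_key)

# Keys for the admit date and for the day before it.
day_keys   <- make_key(new_trayloc$admit_dt,
                       new_trayloc$UNIT_1, new_trayloc$ORIG_FAC_ID)
night_keys <- make_key(new_trayloc$admit_dt - 1,
                       new_trayloc$UNIT_1, new_trayloc$ORIG_FAC_ID)

# One list element per row of new_trayloc; NULL where nothing matches.
theshifts      <- unname(rows_by_key[day_keys])
thenightshifts <- unname(rows_by_key[night_keys])
```

With this layout, theshifts[[i]] and thenightshifts[[i]] would hold the same
row numbers that the which() calls produce for row i, so the rest of the loop
body could stay as it is. I have not timed this on the real data.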


Cheers,
Juan Carlos



