[R] code optimization problem ... using or not using "which" function

krzysztof.sakrejda at gmail.com krzysztof.sakrejda at gmail.com
Sat May 30 05:03:49 CEST 2009


Why not use the 'merge' function?
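
A minimal sketch of what that could look like, using small made-up data
frames. Only the key column names (shift_dt, unit, fac_id, admit_dt, UNIT_1,
ORIG_FAC_ID) come from the original post; the 'staff' column and all values
are hypothetical.

```r
# Toy data; the 'staff' column and all values are invented for illustration.
capacity <- data.frame(
  shift_dt = as.Date(c("2009-05-01", "2009-05-02", "2009-05-02")),
  unit     = c("A", "A", "B"),
  fac_id   = c("F1", "F1", "F2"),
  staff    = c(3, 4, 5),
  stringsAsFactors = FALSE
)
new_trayloc <- data.frame(
  admit_dt    = as.Date(c("2009-05-02", "2009-05-02", "2009-05-03")),
  UNIT_1      = c("A", "B", "A"),
  ORIG_FAC_ID = c("F1", "F2", "F1"),
  stringsAsFactors = FALSE
)

# Left join on the three shared fields: every row of new_trayloc is kept,
# and rows without a partner in capacity get NA in the capacity columns.
merged <- merge(new_trayloc, capacity,
                by.x = c("admit_dt", "UNIT_1", "ORIG_FAC_ID"),
                by.y = c("shift_dt", "unit", "fac_id"),
                all.x = TRUE)
```

Note that merge may reorder rows, and it returns one output row per matching
pair, so a row of new_trayloc with several partners in capacity appears
several times.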

Krzysztof

-----Original Message-----
From: jim holtman <jholtman at gmail.com>

Date: Fri, 29 May 2009 20:55:12 
To: Juan Carlos Laguardia<brassman785 at gmail.com>
Cc: <r-help at r-project.org>
Subject: Re: [R] code optimization problem ... using or not using "which"
	function


For a start, do all your conversions to character and Date once outside the
loop so you are not doing them for each iteration.  Not exactly sure what
you are doing, but it looks like with the 'and's you are only checking for
the rows that are the same.  You might want to use a 'match' function like:

x <- match(capacity$shift_dt, new_trayloc$admit_dt)

to get where each of the items matches. Once you have done that for all
three conditions, look for the positions where the three results give the
same row number, which indicates that all conditions match for that row.
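
As a rough sketch of both points, the conversions can be hoisted out of the
loop and the lookup vectorized. Note that this swaps in a single composite
key built with paste() rather than three separate match() calls, and that
the 'staff' column and all data values are hypothetical; only the key column
names come from the original post.

```r
# Toy data; 'staff' and all values are invented for illustration.
capacity <- data.frame(
  shift_dt = c("2009-05-01", "2009-05-02", "2009-05-02"),
  unit     = factor(c("A", "A", "B")),
  fac_id   = factor(c("F1", "F1", "F2")),
  staff    = c(3, 4, 5)
)
new_trayloc <- data.frame(
  admit_dt    = as.Date(c("2009-05-02", "2009-05-02", "2009-05-03")),
  UNIT_1      = factor(c("A", "B", "A")),
  ORIG_FAC_ID = factor(c("F1", "F2", "F1"))
)

# Do every conversion exactly once, outside any loop.
cap_date <- as.Date(capacity$shift_dt)
cap_unit <- as.character(capacity$unit)
cap_fac  <- as.character(capacity$fac_id)
new_unit <- as.character(new_trayloc$UNIT_1)
new_fac  <- as.character(new_trayloc$ORIG_FAC_ID)

# One key per row that combines all three fields.
cap_key   <- paste(cap_date, cap_unit, cap_fac, sep = "|")
day_key   <- paste(new_trayloc$admit_dt, new_unit, new_fac, sep = "|")
night_key <- paste(new_trayloc$admit_dt - 1, new_unit, new_fac, sep = "|")

# match() gives, for each row of new_trayloc, the first matching row of
# capacity, or NA when there is none.
theshifts      <- match(day_key, cap_key)
thenightshifts <- match(night_key, cap_key)

# Pull the wanted information across in one vectorized step.
new_trayloc$day_staff   <- capacity$staff[theshifts]
new_trayloc$night_staff <- capacity$staff[thenightshifts]
```

Unlike which(), match() reports only the first matching row, so this sketch
assumes each (date, unit, facility) combination occurs at most once in
capacity.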

On Fri, May 29, 2009 at 7:17 PM, Juan Carlos Laguardia <
brassman785 at gmail.com> wrote:

> Hello all,
>
> I have two data sets that share certain fields of interest
> (facility, unit, date) which I want to match up, and from this extract
> information from one dataset and store it in the other.
>
> My first idea (which I know is bad) goes like this:
>
> ##  capacity  and new_trayloc are datasets in example code:
>
> for (i in 1:nrow(new_trayloc)) {
>
>   ## rows of capacity matching date, unit and facility of row i
>   theshifts <- which(
>     as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i] &
>     as.character(capacity$unit) == as.character(new_trayloc$UNIT_1[i]) &
>     as.character(capacity$fac_id) == as.character(new_trayloc$ORIG_FAC_ID[i]))
>
>   ## same, but for the day before the admit date
>   thenightshifts <- which(
>     as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i] - 1 &
>     as.character(capacity$unit) == as.character(new_trayloc$UNIT_1[i]) &
>     as.character(capacity$fac_id) == as.character(new_trayloc$ORIG_FAC_ID[i]))
>
>   ## ..... obtain information by using theshifts and thenightshifts objects
>   ## and store in new_trayloc
>
> }
>
> Running system.time on the entire for loop for 5 iterations, I
> get a time of
>  user  system elapsed
>  25.66    1.04   26.72
>
> That seems really bad, and I need to run it for over 100,000
> iterations.
>
> Any suggestions in either the way I match the fields, or my approach
> to my problem?
>
>
> Cheers,
> Juan Carlos
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

