[R] vectorization condition counting
William Dunlap
wdunlap at tibco.com
Sat Aug 11 02:02:28 CEST 2012
Your sum(tag_id==tag_id[i])==1, meaning tag_id[i] is the only entry with its
value, may be vectorized by the sneaky idiom
    !(duplicated(tag_id, fromLast=FALSE) | duplicated(tag_id, fromLast=TRUE))
Hence f0() (with your code in a loop) and f1() are equivalent:
f0 <- function (tags) {
    for (i in seq_len(nrow(tags))) {
        if (sum(tags$tag_id == tags$tag_id[i]) == 1 & tags$lgth[i] < 300) {
            tags$stage[i] <- "J"
        }
    }
    tags
}
f1 <- function (tags) {
    needsChanging <- with(tags, !(duplicated(tag_id, fromLast = FALSE) |
        duplicated(tag_id, fromLast = TRUE)) & lgth < 300)
    tags$stage[needsChanging] <- "J"
    tags
}
E.g.,
> someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
+     stage = factor(rep(".", 8), levels = c(".", "J")))
> all.equal(f0(someTags), f1(someTags))
[1] TRUE
> f1(someTags)
  tag_id lgth stage
1      1   50     J
2      2  100     .
3      2  150     .
4      3  200     J
5      4  250     J
6      5  300     .
7      6  350     .
8      6  400     .
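
To see why the idiom works, it may help to look at the two duplicated() scans separately. The following is only a sketch on a made-up vector (the same tag_id values as in someTags); the intermediate names dupForward, dupBackward, isUnique and isUniqueByCount are introduced here for illustration. It also cross-checks the result against a per-value count computed with ave(), which is an alternative way to express the same condition, not something from the original code.

```r
# Illustrative vector: same tag_id values as in someTags above.
tag_id <- c(1, 2, 2, 3, 4, 5, 6, 6)

# TRUE for the second and later occurrences of a value (scan left to right).
dupForward <- duplicated(tag_id, fromLast = FALSE)
# TRUE for every occurrence except the last one (scan right to left).
dupBackward <- duplicated(tag_id, fromLast = TRUE)

# A value that occurs exactly once is flagged by neither scan.
isUnique <- !(dupForward | dupBackward)

# Alternative sketch: count the occurrences of each value with ave()
# and keep the entries whose value occurs exactly once.
counts <- ave(seq_along(tag_id), tag_id, FUN = length)
isUniqueByCount <- counts == 1

# Both approaches should agree.
stopifnot(identical(isUnique, isUniqueByCount))
print(isUnique)
```

Only the entries with tag_id 1, 3, 4 and 5 come out TRUE here; combining that with lgth < 300 gives the rows that f1() sets to "J".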
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Guillaume2883
> Sent: Friday, August 10, 2012 3:47 PM
> To: r-help at r-project.org
> Subject: [R] vectorization condition counting
>
> Hi all,
>
> I am working on a really big dataset and I would like to vectorize a
> condition in an if statement inside a loop to improve speed.
>
> The original loop with the condition is currently written as follows:
>
> if(sum(as.integer(tags$tag_id==tags$tag_id[i]))==1&tags$lgth[i]<300){
>
> tags$stage[i]<-"J"
>
> }
>
> Do you have any ideas? I was unable to do it correctly.
> Thanking you in advance for your help,
>
> Guillaume
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/vectorization-condition-counting-tp4639992.html
> Sent from the R help mailing list archive at Nabble.com.