[R] how to get values within a threshold

arun smartpink111 at yahoo.com
Fri Sep 13 15:50:30 CEST 2013


Hi,

Here are some speed comparisons of the different approaches (timings will vary by machine):


set.seed(434)
val1<- rnorm(1e5)
set.seed(28)
thresh1<- sample(1:20,1e2,replace=TRUE)

# matrix with one row per threshold, built with replicate()
system.time(res<- rowSums(t(replicate(length(thresh1),val1))<= thresh1))
#  user  system elapsed 
#  0.320   0.064   0.382 

# sapply() over the thresholds
system.time(res2<- sapply(thresh1,function(x) {sum(val1<x)}))
#  user  system elapsed 
#  0.088   0.004   0.093 

# matrix with one row per threshold, built with rep()
system.time(res3<- rowSums(matrix(rep(val1,length(thresh1)),nrow=length(thresh1),byrow=TRUE)<=thresh1))
#  user  system elapsed 
#  0.228   0.048   0.275 

# the approach from the original post
system.time(res4<- sapply(1:length(thresh1),function(x){length(val1[val1<thresh1[x]])}))
#  user  system elapsed 
#  0.300   0.044   0.345 

# same as res3, but with the matrix built outside the timed call
mat1<- matrix(rep(val1,length(thresh1)),nrow=length(thresh1),byrow=TRUE)
system.time(res5<- rowSums(mat1<=thresh1))
#  user  system elapsed 
#  0.104   0.000   0.103 

# lapply() over the thresholds
system.time(res6<- unlist(lapply(thresh1,function(x) {sum(val1<x)})))
#  user  system elapsed 
#  0.088   0.000   0.088 



identical(res,as.numeric(res2))
#[1] TRUE
identical(res,res3)
#[1] TRUE
identical(res,as.numeric(res4))
#[1] TRUE
# res5 is the same computation as res3
identical(res,res5)
#[1] TRUE
identical(res,as.numeric(res6))
#[1] TRUE
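
One caveat about the checks above: the matrix versions compare with <= while the sapply()/lapply() versions compare with <, so the results only agree because no value in val1 is exactly equal to a threshold. A small sketch with made-up numbers (toy data, not taken from the example above) shows how the two differ on ties:

```r
# Hypothetical toy data, chosen so that some values equal a threshold
val    <- c(1, 2, 3, 4)
thresh <- c(1, 3)

# count of values strictly below each threshold
strict    <- sapply(thresh, function(x) sum(val <  x))
# count of values at or below each threshold
nonstrict <- sapply(thresh, function(x) sum(val <= x))

strict
# [1] 0 2
nonstrict
# [1] 1 3
```

Since the original question asks for values below the threshold, < is probably the comparison to use if ties can occur.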

A.K.

----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: Zhang Weiwu <zhangweiwu at realss.com>
Cc: R help <r-help at r-project.org>
Sent: Friday, September 13, 2013 9:27 AM
Subject: Re: [R] how to get values within a threshold

Hi,
You could try:
val1<- c(0.854400, 1.648465, 1.829830, 1.874704, 7.670915, 7.673585, 7.722619)
thresh1<- c(1,3,5,7,9)
rowSums(t(replicate(length(thresh1),val1))<= thresh1)
#[1] 1 4 4 4 7

#this could be shortened with sapply() (see ?sapply)
sapply(thresh1,function(x) {sum(val1<x)})
#[1] 1 4 4 4 7
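
Another option, offered only as a sketch: base R's findInterval() returns, for each threshold, the number of sorted values that are at or below it. Here that is the same count as above. It needs the values in non-decreasing order, so the sort() call is an assumption for the general case; your example is already sorted. Note that it counts ties (<=), unlike the sapply() version with <.

```r
val1    <- c(0.854400, 1.648465, 1.829830, 1.874704, 7.670915, 7.673585, 7.722619)
thresh1 <- c(1, 3, 5, 7, 9)

# findInterval() requires its second argument to be sorted in
# non-decreasing order; sort() changes nothing here because val1
# is already sorted.
res_fi <- findInterval(thresh1, sort(val1))
res_fi
# [1] 1 4 4 4 7
```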


A.K.




----- Original Message -----
From: Zhang Weiwu <zhangweiwu at realss.com>
To: r-help at r-project.org
Cc: 
Sent: Friday, September 13, 2013 6:13 AM
Subject: [R] how to get values within a threshold


input:

    > values
    [1] 0.854400 1.648465 1.829830 1.874704 7.670915 7.673585 7.722619

    > thresholds
    [1] 1 3 5 7 9

expected output:

    [1] 1 4 4 4 7

That is, I need a vector of the indexes of the largest value below each threshold.

e.g.
The first element is "1", because values[1] is the largest value below threshold "1".
The second element is "4", because values[4] is the largest value below threshold "3".

The way I do it is:

> sapply(1:length(thresholds), function(x) { length(values[values < thresholds[x]])})
[1] 1 4 4 4 7

It just seems to me too long and clumsy to be idiomatic R. Is this already the best way?

Somehow I feel which() was designed for a purpose like this, but I couldn't 
figure out a way to apply which() here.
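
For reference, one way which() could be applied here, as a minimal sketch. It assumes that values is sorted in ascending order and that every threshold has at least one value below it; otherwise max() is called on an empty vector and returns -Inf with a warning.

```r
values     <- c(0.854400, 1.648465, 1.829830, 1.874704, 7.670915, 7.673585, 7.722619)
thresholds <- c(1, 3, 5, 7, 9)

# which() gives the positions of all values below the threshold;
# max() picks the last such position, which for sorted values is the
# index of the largest value below the threshold.
idx <- sapply(thresholds, function(th) max(which(values < th)))
idx
# [1] 1 4 4 4 7
```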




