[R] efficient rolling rank

Whit Armstrong armstrong.whit at gmail.com
Sun Apr 18 02:27:14 CEST 2010


> library(fts)
> x <- fts(data=rnorm(1e6))
> system.time(xrnk <- moving.rank(x,500))
   user  system elapsed
   0.68    0.00    0.68


You will have to disguise your data as a time series to use fts.

See below for the exact implementation of rank that is used.

-Whit

  template<typename ReturnType>
  class Rank {
  public:
    template<typename T>
    static inline ReturnType apply(T beg, T end) {
      ReturnType ans = 1;
      typename std::iterator_traits<T>::value_type rank_value = *(end - 1);
      while(beg != (end - 1)) {
        if(numeric_traits<typename std::iterator_traits<T>::value_type>::ISNA(*beg)) {
          return numeric_traits<ReturnType>::NA();
        }

        // if the window's last value exceeds this element, increment its rank
        ans += (rank_value > *beg ? 1 : 0);
        ++beg;
      }
      return ans;
    }
  };
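
For anyone who wants to try the class outside the package, here is a minimal standalone sketch of how it could be driven over a sliding window. The numeric_traits specialization (NaN standing in for NA) and the rolling_rank helper are my own hypothetical stand-ins, not the library's code, and the helper returns only the complete windows, which may differ from how moving.rank pads its result.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <limits>
#include <vector>

// Hypothetical stand-in for the library's numeric_traits (double only):
// NaN plays the role of NA in this sketch.
template<typename T>
struct numeric_traits;

template<>
struct numeric_traits<double> {
  static bool ISNA(double x) { return std::isnan(x); }
  static double NA() { return std::numeric_limits<double>::quiet_NaN(); }
};

// Same logic as the class posted above: rank of the last element
// of [beg, end) among the elements of that window.
template<typename ReturnType>
class Rank {
public:
  template<typename T>
  static inline ReturnType apply(T beg, T end) {
    ReturnType ans = 1;
    typename std::iterator_traits<T>::value_type rank_value = *(end - 1);
    while(beg != (end - 1)) {
      if(numeric_traits<typename std::iterator_traits<T>::value_type>::ISNA(*beg)) {
        return numeric_traits<ReturnType>::NA();
      }
      // count the earlier elements strictly smaller than the last value
      ans += (rank_value > *beg ? 1 : 0);
      ++beg;
    }
    return ans;
  }
};

// Hypothetical driver: one rank per complete window, so the result
// has x.size() - window + 1 elements (empty if x is shorter than window).
std::vector<double> rolling_rank(const std::vector<double>& x, std::size_t window) {
  assert(window > 0);
  std::vector<double> ans;
  if(x.size() < window) return ans;
  ans.reserve(x.size() - window + 1);
  for(std::size_t i = window; i <= x.size(); ++i) {
    const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(i - window);
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(i);
    ans.push_back(Rank<double>::apply(x.begin() + first, x.begin() + last));
  }
  return ans;
}
```

Each window costs O(window) comparisons, so the whole pass is O(n * window) with no allocation per window. Note that the strict `>` gives a tied value the lowest rank in its tie group, whereas R's rank() averages ties by default.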



On Sat, Apr 17, 2010 at 12:00 AM, Charles C. Berry <cberry at tajo.ucsd.edu> wrote:
> On Fri, 16 Apr 2010, zerdna wrote:
>
>>
>> Could someone give me an idea on how to do rolling ranking, i.e. rank in
>> the moving window of the last 100 numbers in a long vector? I tried a
>> naive solution like
>>
>> roll.rank<-function(v, len){
>>   r<-numeric(length(v)-len+1)
>>   for(i in len:length(v))
>>       r[i-len+1]<-rank(v[(i-len+1):i])[len]
>>   r
>>
>> }
>>
>> However, it turns out to be pretty slow even on my rather able Linux box.
>> For example, roll.rank(rnorm(50000), 100) takes 5 seconds, so for the
>> typical data I work with, matrices of size 1000 x 50000, I will need to
>> wait 1.5 hours for one calculation. Does someone know a trick to do it
>> properly and quicker?
>
> Vectorize it with embed:
>
>> x <- rnorm(50000)
>> system.time(x.rank <- rowSums(x[ -(1:99) ] >= embed(x,100) ))
>
>   user  system elapsed
>  0.295   0.131   0.424
>>
>> system.time(x.rank.2 <- roll.rank(x,100))
>
>   user  system elapsed
>  6.907   0.033   6.940
>>
>> all.equal(x.rank,x.rank.2)
>
> [1] TRUE
>>
>
> N.B., if there are ties, you may want to adjust when
>
>        rowSums(x[ -(1:99) ] == embed(x,100) )
>
> is greater than 1.
>
> If you want much faster, package inline would enable you to write something
> equivalent in C.
>
> HTH,
>
> Chuck
>
>> --
>> View this message in context:
>> http://n4.nabble.com/efficient-rolling-rank-tp2013535p2013535.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu               UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>