[Rd] Suggestion: Adding quick rowMin and rowMax functions to base package

TakeoKatsuki takeo.katsuki at gmail.com
Sun Feb 13 19:18:31 CET 2011


Hi Henrik,

It would be nice if the functions in the matrixStats package could handle
array data.
For example, rowSums() in the base package sums along the third axis of a
three-dimensional array via rowSums(x, dims=2).
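
A minimal sketch of what I mean, using only base R. The array `a` is made up
for illustration, and the apply() lines are just one slow way to get the same
kind of reduction for min:

```r
# A small 2 x 3 x 4 array, made up for illustration.
a <- array(1:24, dim = c(2, 3, 4))

# dims = 2 keeps the first two dimensions and sums over the remaining one,
# so the result is a 2 x 3 matrix of sums along the third axis.
s <- rowSums(a, dims = 2)

# The same reduction written with apply(); this also works for min/max,
# but it calls the R function once per cell and is slow for large arrays.
s_apply <- apply(a, c(1, 2), sum)
m_apply <- apply(a, c(1, 2), min)
```

An array-aware rowMins() would ideally return the same values as `m_apply`
here without going through apply().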
Thanks.

Takeo


Henrik Bengtsson wrote:
> 
> See rowMins(), rowMaxs() and rowRanges() in matrixStats (on CRAN).
> 
> The matrixStats package was created for the purpose of providing such
> row*/col*() methods.  First the functionality is provided, then the
> methods are optimized for speed and memory, e.g. vectorizing,
> implementing in native code, and utilizing other fast existing
> functions.  Some methods have already been optimized this way.  When
> mature, these may be suggested to be part of the default R
> distribution.
> 
> Benchmarking reports, and contributions of code and redundancy are
> welcome.  Testing the code under many different conditions is
> critical, e.g. missing values or not, infinite values or not, zero,
> one or many columns/rows, ...
> 
> /Henrik
> 
> PS. rowMaxs() etc. do not utilize pmax(); I didn't know of it.
> 
> 
> On Mon, Mar 29, 2010 at 9:34 PM, Sebastian Kranz <skranz at uni-bonn.de>
> wrote:
>> Hi,
>>
>> I wonder whether, similarly to the very quick rowSums and colSums functions
>> in the base package, one could add quick functions that calculate the min or
>> max over rows / cols in a matrix. While apply(x,1,min) works, I found out by
>> profiling a program of mine that it is rather slow for matrices with a very
>> large number of rows. Quick functionality seems to be there already in the
>> functions pmax and pmin, but it is rather cumbersome to apply them to all
>> columns of a matrix (if one does not know how many columns the matrix has).
>> Below, I have some code that shows a very inelegant implementation that
>> illustrates the possible speed gains if apply could be avoided:
>>
>> rowMin = function(x) {
>>   # Construct a call pmin(x[,1],x[,2],...,x[,NCOL(x)]) as text
>>   code = paste("x[,",1:(NCOL(x)),"]",sep="",collapse=",")
>>   code = paste("pmin(",code,")")
>>   # Parse and evaluate the constructed call
>>   return(eval(parse(text=code)))
>> }
>>
>> # Speed comparison: Taking rowMin of a 1,000,000 x 10 matrix
>> x = matrix(rnorm(1e7),1e6,10)
>>
>> # The traditional apply method
>> y=apply(x,1,min) # Runtime ca. 12 seconds
>>
>> # My inelegant rowMin function
>> z=rowMin(x) # Runtime ca 0.5 seconds
>>
>> Of course, the way the function rowMin is constructed is highly inefficient
>> if the matrix x has many columns, but maybe there is a simple way to adapt
>> the code from pmin and pmax to create quick rowMin, rowMax, ... functions. I
>> don't know whether it is worth the effort, but I guess taking minima and
>> maxima over rows is a common task.
>>
>> Best wishes,
>> Sebastian
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 

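As a side note on the pmin() idea quoted above: the call can probably be built
with do.call() instead of pasting and parsing text. A minimal sketch, assuming
a numeric matrix with at least one column; rowMinPmin is a made-up name, not a
function from base R or matrixStats:

```r
# Hypothetical helper (name made up for illustration): row-wise minimum
# of a matrix, computed by passing every column to pmin() in one call.
rowMinPmin <- function(x) {
  x <- as.matrix(x)
  # Assumption of this sketch: pmin() needs at least one argument,
  # so zero-column input is rejected rather than handled.
  stopifnot(ncol(x) >= 1L)
  # One list element per column, each a plain vector of length nrow(x).
  cols <- lapply(seq_len(ncol(x)), function(j) x[, j])
  do.call(pmin, cols)
}

# Small test matrix with a missing and an infinite value.
x <- matrix(c(3, 1, NA,
              5, 2, 4,
              -Inf, 0, 7), nrow = 3)
r_pmin  <- rowMinPmin(x)
r_apply <- apply(x, 1, min)
```

With the default na.rm = FALSE, both pmin() and min() return NA for a row that
contains a missing value, so the two results should agree on inputs like this.
Zero-column matrices and other edge cases would still need separate checks.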