[R] Matrix max by row

Bert Gunter gunter.berton at gene.com
Mon Mar 30 17:24:36 CEST 2009


 
Serves me right, I suppose. Timing also seems to depend heavily on the
dimensions of the matrix. Here's what I got with my inadequate test:

> x <- matrix(rnorm(3e5),ncol=3)
## via apply
> system.time(apply(x,1,max))
   user  system elapsed 
   2.09    0.02    2.10

## via pmax
> system.time(do.call(pmax,data.frame(x)))
   user  system elapsed
   0.10    0.02    0.11

Draw your own conclusions!
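
For anyone who wants to check the dependence on shape themselves, here is a
minimal sketch that times both approaches on a tall, narrow matrix and on a
wide one with the same number of elements. The helper name time_rowmax and
the chosen dimensions are illustrative assumptions, not something from this
thread. Timings vary with machine and R version, so none are shown.

```r
# Hypothetical helper (not from the thread): time both row-max methods
# on one matrix and confirm that they return the same values.
time_rowmax <- function(x) {
  t_apply <- system.time(r_apply <- apply(x, 1, max))[["elapsed"]]
  t_pmax  <- system.time(r_pmax  <- do.call(pmax, data.frame(x)))[["elapsed"]]
  # as.vector() drops any attributes so only the values are compared
  stopifnot(isTRUE(all.equal(as.vector(r_apply), as.vector(r_pmax))))
  c(apply = t_apply, pmax = t_pmax)
}

set.seed(1)
tall <- matrix(rnorm(3e5), ncol = 3)     # 100000 rows, 3 columns
wide <- matrix(rnorm(3e5), nrow = 300)   # 300 rows, 1000 columns

# one row of elapsed times (in seconds) per matrix shape
timings <- rbind(tall = time_rowmax(tall), wide = time_rowmax(wide))
print(timings)
```

A single run like this is noisy. Repeating each call several times, as the
rbenchmark runs quoted below do, gives steadier numbers.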

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374

-----Original Message-----
From: Wacek Kusnierczyk [mailto:Waclaw.Marcin.Kusnierczyk at idi.ntnu.no] 
Sent: Monday, March 30, 2009 2:33 AM
To: Rolf Turner
Cc: Bert Gunter; 'Ana M Aparicio Carrasco'; r-help at r-project.org
Subject: Re: [R] Matrix max by row

Rolf Turner wrote:
> I tried the following:
>
> m <- matrix(runif(100000),1000,100)
> junk <- gc()
> print(system.time(for(i in 1:100) X1 <- do.call(pmax,data.frame(m))))
> junk <- gc()
> print(system.time(for(i in 1:100) X2 <- apply(m,1,max)))
>
> and got
>
>    user  system elapsed
>   2.704   0.110   2.819
>    user  system elapsed
>   1.938   0.098   2.040
>
> so unless there's something that I am misunderstanding (always a serious
> consideration) Wacek's apply method looks to be about 1.4 times *faster*
> than the do.call/pmax method.


hmm, since i was called by name (i'm grateful, rolf), i feel obliged to
check the matters myself:

    # dummy data, presumably a 'large matrix'?
    n = 5e3
    m = matrix(rnorm(n^2), n, n)

    # what is to be benchmarked...
    waku = expression(matrix(apply(m, 1, max), nrow(m)))
    bert = expression(do.call(pmax,data.frame(m)))

    # to be benchmarked
    library(rbenchmark)
    benchmark(replications=10, order='elapsed',
       columns=c('test', 'elapsed'),
       waku=matrix(apply(m, 1, max), nrow(m)),
       bert=do.call(pmax,data.frame(m)))

takes quite a while, but here you go:

    #   test elapsed
    # 1 waku  11.838
    # 2 bert  20.833

where bert's solution seems to require a wonder to 'be considerably
faster for large matrices'.  to have it fair, i also did

    # to be benchmarked
    library(rbenchmark)
    benchmark(replications=10, order='elapsed',
       columns=c('test', 'elapsed'),
       bert=do.call(pmax,data.frame(m)),
       waku=matrix(apply(m, 1, max), nrow(m)))

    #  test elapsed
    # 2 waku  11.695
    # 1 bert  20.912
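
for the record, a small sketch on the 3 x 3 example from the original
question. it checks that the two benchmarked expressions agree, and it shows
why the do.call/pmax variant works at all: a data frame is a list of columns,
so do.call hands each column to pmax as a separate argument. the names
by_apply and by_pmax are made up for illustration.

```r
# the 3 x 3 matrix from the original question
m <- matrix(c(2, 5, 3,
              8, 7, 2,
              1, 8, 4), nrow = 3, byrow = TRUE)

# apply() walks over the rows; matrix() reshapes the result into one column
by_apply <- matrix(apply(m, 1, max), nrow(m))

# data.frame(m) is a list of three columns, so this call is equivalent to
# pmax(m[, 1], m[, 2], m[, 3]), an element-wise maximum across the columns
by_pmax <- do.call(pmax, data.frame(m))

print(by_apply)
print(by_pmax)

# same values (5, 8, 8); only the shape differs: 3 x 1 matrix vs. vector
stopifnot(all(as.vector(by_apply) == as.vector(by_pmax)))
```

so the two differ in what they iterate over: apply makes one call to max per
row, while pmax is called once with one argument per column.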
   
take home point: a good product sells itself, a bad product may not sell
despite aggressive marketing.

rolf, thanks for pointing this out.

cheers,
vQ


>     cheers,
>
>         Rolf Turner
>
>
> On 30/03/2009, at 3:55 PM, Bert Gunter wrote:
>
>> If speed is a consideration, availing yourself of the built-in pmax()
>> function via
>>
>> do.call(pmax,data.frame(yourMatrix))
>>
>> will be considerably faster for large matrices.
>>
>> If you are puzzled by why this works, it is a useful exercise in R to
>> figure it out.
>>
>> Hint: The man page for ?data.frame says:
>> "A data frame is a list of variables of the same length with unique row
>> names, given class 'data.frame'."
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>> Genentech Nonclinical Statistics
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Wacek Kusnierczyk
>> Sent: Saturday, March 28, 2009 5:22 PM
>> To: Ana M Aparicio Carrasco
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Matrix max by row
>>
>> Ana M Aparicio Carrasco wrote:
>>> I need help about how to obtain the max by row in a matrix.
>>> For example if I have the following matrix:
>>> 2 5 3
>>> 8 7 2
>>> 1 8 4
>>>
>>> The max by row will be:
>>> 5
>>> 8
>>> 8
>>>
>>
>> matrix(apply(m, 1, max), nrow(m))
>>
>> vQ
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



