[R] averaging a list of matrices element wise

Bert Gunter gunter.berton at gene.com
Mon Nov 5 18:30:51 CET 2012


Gents:

Well... (munch munch) I do think I must eat my words:

> z <- lapply(seq_len(1e6), function(x) matrix(runif(100), nrow = 10))

> system.time(Reduce("+",z)/length(z))
   user  system elapsed
   3.48    0.05    3.52
> system.time(rowMeans(array(unlist(z),dim=c(10,10,length(z))),dims=2))
   user  system elapsed
   5.09    0.53    5.65
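
Before reading much into the timings, it is worth confirming that the two
expressions return the same matrix. A minimal sketch on a smaller list; the
list size of 1e4, the seed, and the names z_small, via_reduce and
via_rowmeans are arbitrary choices for illustration, not part of the runs
above:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
z_small <- lapply(seq_len(1e4), function(x) matrix(runif(100), nrow = 10))

# Approach 1: sum the matrices pairwise, then divide by the count.
via_reduce <- Reduce("+", z_small) / length(z_small)

# Approach 2: stack into a 10 x 10 x n array, then average over the 3rd dimension.
via_rowmeans <- rowMeans(
  array(unlist(z_small), dim = c(10, 10, length(z_small))),
  dims = 2
)

# Compare up to floating-point tolerance, since the summation order differs.
all.equal(via_reduce, via_rowmeans)
```

all.equal() is used rather than identical() because the two approaches add
the numbers in a different order, so tiny rounding differences are expected.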

At the very least the two are competitive, and for this sort of data
(a large list of small matrices) Reduce() may be slightly faster.

I leave it to those who are curious to check out other scenarios.
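
One such scenario is a list containing NAs and dimnames, where the two
approaches differ in behaviour rather than speed. The sketch below uses
made-up toy data (three 2 x 3 matrices with hypothetical row and column
names) and assumes that averaging over the non-missing values is what is
wanted:

```r
# Toy data, invented for illustration: matrix i is filled with the value i.
raw <- lapply(1:3, function(i) {
  matrix(as.numeric(i), nrow = 2, ncol = 3,
         dimnames = list(c("a", "b"), c("x", "y", "z")))
})
raw[[2]][1, 1] <- NA  # introduce a single missing value

# rowMeans() route: stack into an array, drop NAs, then restore the dimnames.
stacked <- array(unlist(raw), dim = c(dim(raw[[1]]), length(raw)))
avg <- rowMeans(stacked, dims = 2, na.rm = TRUE)
dimnames(avg) <- dimnames(raw[[1]])

# Reduce() route: the NA propagates into the affected cell.
avg_reduce <- Reduce("+", raw) / length(raw)

avg
avg_reduce
```

With na.rm = TRUE the cell with the missing value is averaged over the two
remaining matrices, whereas the Reduce() result is NA in that cell.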

Cheers,
Bert


On Mon, Nov 5, 2012 at 7:12 AM, Bert Gunter <bgunter at gene.com> wrote:
> Gents:
>
> Although it is difficult to say what may be faster, as it typically
> depends on the data, and it is even more difficult to say what is
> fast enough, I suspect that
>
> ?rowMeans ## specifically written for speed
>
> would be considerably faster than a Reduce() approach (or an apply()
> on the array), but I have **not** checked. I am of course prepared to eat
> my words in the face of data to the contrary.
>
> The call would be:
>
> result <- rowMeans(array(unlist(raw), dim = c(r, s, length(raw))), dims = 2)
>
> Note that rowMeans() has an na.rm argument to handle NAs. See the
> help file for details.
> Note also the tradeoff to memory, as copies of raw probably are made
> during evaluation.
> Finally note that dimnames are lost in the final result, so the above
> would have to be followed by
>
> dimnames(result) <- dimnames(raw[[1]])
>
> to get them back.
>
> -- Bert
>
>
> On Mon, Nov 5, 2012 at 2:43 AM, D. Rizopoulos <d.rizopoulos at erasmusmc.nl> wrote:
>> If you don't have any NAs, then one way is:
>>
>> n <- 3
>> r <- 5
>> s <- 6
>> raw <- lapply(seq_len(n), function(i){
>>    matrix(rnorm(r * s), nrow = r)
>> })
>>
>> Reduce("+", raw) / length(raw)
>>
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>>
>> On 11/5/2012 11:32 AM, ONKELINX, Thierry wrote:
>>> Dear all,
>>>
>>> I have a list of n matrices which all have the same dimension (r x s). What would be a fast/elegant way to calculate the element wise average? So result[1, 1] <- mean(c(raw[[1]][1, 1] , raw[[2]][1, 1], raw[[...]][1, 1], raw[[n]][1, 1]))
>>>
>>> Here is my attempt.
>>>
>>> #create a dummy dataset
>>> n <- 3
>>> r <- 5
>>> s <- 6
>>> raw <- lapply(seq_len(n), function(i){
>>>    matrix(rnorm(r * s), nrow = r)
>>> })
>>>
>>> #do the calculation
>>> result <- array(dim = c(dim(raw[[1]]), length(raw)))
>>> for(i in seq_along(raw)){
>>>    result[,,i] <- raw[[i]]
>>> }
>>> result <- apply(result, 1:2, mean)
>>>
>>>
>>> Best regards,
>>>
>>> Thierry
>>>
>>> ir. Thierry Onkelinx
>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>> Kliniekstraat 25
>>> 1070 Anderlecht
>>> Belgium
>>> + 32 2 525 02 51
>>> + 32 54 43 61 85
>>> Thierry.Onkelinx at inbo.be
>>> www.inbo.be
>>>
>>> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
>>> ~ Sir Ronald Aylmer Fisher
>>>
>>> The plural of anecdote is not data.
>>> ~ Roger Brinner
>>>
>>> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
>>> ~ John Tukey
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Dimitris Rizopoulos
>> Assistant Professor
>> Department of Biostatistics
>> Erasmus University Medical Center
>>
>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>> Tel: +31/(0)10/7043478
>> Fax: +31/(0)10/7043014
>> Web: http://www.erasmusmc.nl/biostatistiek/
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm






