[R] Aggregate matrix in a 2 by 2 manor
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sat Jul 30 23:02:37 CEST 2016
For the record, the array.apply code can be fixed as below, but then it is slower than the expand.grid version.
aggregate.nx.ny.array.apply <- function(dta,nx=2,ny=2, FUN=mean,...)
{
a <- array(dta, dim = c(ny, nrow( dta ) %/% ny, nx, ncol( dta ) %/% nx))
apply( a, c(2, 4), FUN, ... )
}
--
Sent from my phone. Please excuse my brevity.
On July 30, 2016 11:06:16 AM PDT, "Anthoni, Peter (IMK)" <peter.anthoni at kit.edu> wrote:
>Hi all,
>
>thanks for the suggestions, I did some timing tests, see below.
>Unfortunately the aggregate.nx.ny.array.apply, does not produce the
>expected result.
>So the fastest seems to be the aggregate.nx.ny.expand.grid, though the
>double for loop is not that much slower.
>
>many thanks
>Peter
>
>> tst=matrix(1:(1440*360),ncol=1440,nrow=360)
>> system.time( {for(i in 1:10)
>tst_2x2=aggregate.nx.ny.forloop(tst,2,2,mean,na.rm=T)})
> user system elapsed
> 11.227 0.073 11.371
>> system.time( {for(i in 1:10)
>tst_2x2=aggregate.nx.ny.interaction(tst,2,2,mean,na.rm=T)})
> user system elapsed
> 26.354 0.475 26.880
>> system.time( {for(i in 1:10)
>tst_2x2=aggregate.nx.ny.expand.grid(tst,2,2,mean,na.rm=T)})
> user system elapsed
> 9.683 0.055 9.763
>> system.time( {for(i in 1:10)
>tst_2x2=aggregate.nx.ny.array.apply(tst,2,2,mean,na.rm=T)})
> user system elapsed
> 7.693 0.055 7.800
>
>> tst.small=matrix(1:(8*4),ncol=8,nrow=4)
>> aggregate.nx.ny.forloop = function(data,nx=2,ny=2, FUN=mean,...)
>+ {
>+ nlon=nrow(data)
>+ nlat=ncol(data)
>+ newdata=matrix(NA,nrow=nlon/nx,ncol=nlat/ny)
>+ dim(newdata)
>+ for(ilon in seq(1,nlon,nx)) {
>+ for(ilat in seq(1,nlat,ny)) {
>+ ilon_new=1+(ilon-1)/nx
>+ ilat_new=1+(ilat-1)/ny
>+ newdata[ilon_new,ilat_new] = FUN(data[ilon+0:1,ilat+0:1],...)
>+ }
>+ }
>+ newdata
>+ }
>> aggregate.nx.ny.forloop(tst.small)
> [,1] [,2] [,3] [,4]
>[1,] 3.5 11.5 19.5 27.5
>[2,] 5.5 13.5 21.5 29.5
>>
>> aggregate.nx.ny.interaction = function(data,nx=2,ny=2, FUN=mean,...)
>+ {
>+
>+ nlon=nrow(data)
>+ nlat=ncol(data)
>+ newdata=matrix(NA,nrow=nlon/nx,ncol=nlat/ny)
>+ newdata[] <- tapply( data, interaction( (row(data)+1) %/% 2,
>(col(data)+1) %/% 2 ), FUN, ...)
>+ newdata
>+ }
>> aggregate.nx.ny.interaction(tst.small)
> [,1] [,2] [,3] [,4]
>[1,] 3.5 11.5 19.5 27.5
>[2,] 5.5 13.5 21.5 29.5
>>
>> aggregate.nx.ny.expand.grid = function(data,nx=2,ny=2, FUN=mean,...)
>+ {
>+ ilon <- seq(1,ncol(data),nx)
>+ ilat <- seq(1,nrow(data),ny)
>+ cells <- as.matrix(expand.grid(ilat, ilon))
>+ blocks <- apply(cells, 1, function(x)
>data[x[1]:(x[1]+1),x[2]:(x[2]+1)])
>+ block.means <- colMeans(blocks)
>+ matrix(block.means, nrow(data)/ny, ncol(data)/nx)
>+ }
>> aggregate.nx.ny.expand.grid(tst.small)
> [,1] [,2] [,3] [,4]
>[1,] 3.5 11.5 19.5 27.5
>[2,] 5.5 13.5 21.5 29.5
>>
>> aggregate.nx.ny.array.apply = function(data,nx=2,ny=2, FUN=mean,...)
>{
>+ a <- array(data, dim = c(ny, nrow( data ) %/% ny, ncol( data ) %/%
>nx))
>+ apply( a, c(2, 3), FUN, ... )
>+ }
>> aggregate.nx.ny.array.apply(tst.small)
> [,1] [,2] [,3] [,4]
>[1,] 1.5 5.5 9.5 13.5
>[2,] 3.5 7.5 11.5 15.5
>
>
>
>> On 28 Jul 2016, at 00:26, David Winsemius <dwinsemius at comcast.net>
>wrote:
>>
>>
>>> On Jul 27, 2016, at 12:02 PM, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
>>>
>>> An alternative (more compact, not necessarily faster, because apply
>is still a for loop inside):
>>>
>>> f <- function( m, nx, ny ) {
>>> # redefine the dimensions of my
>>> a <- array( m
>>> , dim = c( ny
>>> , nrow( m ) %/% ny
>>> , ncol( m ) %/% nx )
>>> )
>>> # apply mean over dim 1
>>> apply( a, c( 2, 3 ), FUN=mean )
>>> }
>>> f( tst, nx, ny )
>>
>> Here's an apparently loopless strategy, although I suspect the code
>for interaction (and maybe tapply as well?) uses a loop.
>>
>>
>> tst_2X2 <- matrix(NA, ,ncol=4,nrow=2)
>>
>> tst_2x2[] <- tapply( tst, interaction( (row(tst)+1) %/% 2,
>(col(tst)+1) %/% 2 ), mean)
>>
>> tst_2x2
>>
>> [,1] [,2] [,3] [,4]
>> [1,] 3.5 11.5 19.5 27.5
>> [2,] 5.5 13.5 21.5 29.5
>>
>> --
>> David.
>>
>>
>>>
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On July 27, 2016 9:08:32 AM PDT, David L Carlson <dcarlson at tamu.edu>
>wrote:
>>>> This should be faster. It uses apply() across the blocks.
>>>>
>>>>> ilon <- seq(1,8,nx)
>>>>> ilat <- seq(1,4,ny)
>>>>> cells <- as.matrix(expand.grid(ilat, ilon))
>>>>> blocks <- apply(cells, 1, function(x) tst[x[1]:(x[1]+1),
>>>> x[2]:(x[2]+1)])
>>>>> block.means <- colMeans(blocks)
>>>>> tst_2x2 <- matrix(block.means, 2, 4)
>>>>> tst_2x2
>>>> [,1] [,2] [,3] [,4]
>>>> [1,] 3.5 11.5 19.5 27.5
>>>> [2,] 5.5 13.5 21.5 29.5
>>>>
>>>> -------------------------------------
>>>> David L Carlson
>>>> Department of Anthropology
>>>> Texas A&M University
>>>> College Station, TX 77840-4352
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: R-help [mailto:r-help-bounces at r-poject.org] On Behalf Of
>Anthoni,
>>>> Peter (IMK)
>>>> Sent: Wednesday, July 27, 2016 6:14 AM
>>>> To: r-help at r-project.org
>>>> Subject: [R] Aggregate matrix in a 2 by 2 manor
>>>>
>>>> Hi all,
>>>>
>>>> I need to aggregate some matrix data (1440x720) to a lower
>dimension
>>>> (720x360) for lots of years and variables
>>>>
>>>> I can do double for loop, but that will be slow. Anybody know a
>quicker
>>>> way?
>>>>
>>>> here an example with a smaller matrix size:
>>>>
>>>> tst=matrix(1:(8*4),ncol=8,nrow=4)
>>>> tst_2x2=matrix(NA,ncol=4,nrow=2)
>>>> nx=2
>>>> ny=2
>>>> for(ilon in seq(1,8,nx)) {
>>>> for (ilat in seq(1,4,ny)) {
>>>> ilon_2x2=1+(ilon-1)/nx
>>>> ilat_2x2=1+(ilat-1)/ny
>>>> tst_2x2[ilat_2x2,ilon_2x2] = mean(tst[ilat+0:1,ilon+0:1])
>>>> }
>>>> }
>>>>
>>>> tst
>>>> tst_2x2
>>>>
>>>>> tst
>>>> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
>>>> [1,] 1 5 9 13 17 21 25 29
>>>> [2,] 2 6 10 14 18 22 26 30
>>>> [3,] 3 7 11 15 19 23 27 31
>>>> [4,] 4 8 12 16 20 24 28 32
>>>>
>>>>> tst_2x2
>>>> [,1] [,2] [,3] [,4]
>>>> [1,] 3.5 11.5 19.5 27.5
>>>> [2,] 5.5 13.5 21.5 29.5
>>>>
>>>>
>>>> I though a cast to 3d-array might do the trick and apply over the
>new
>>>> dimension, but that does not work, since it casts the data along
>the
>>>> row.
>>>>> matrix(apply(array(tst,dim=c(nx,ny,8)),3,mean),nrow=nrow(tst)/ny)
>>>> [,1] [,2] [,3] [,4]
>>>> [1,] 2.5 10.5 18.5 26.5
>>>> [2,] 6.5 14.5 22.5 30.5
>>>>
>>>>
>>>> cheers
>>>> Peter
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>>
More information about the R-help
mailing list