[R] replace values in array with means of certain values in array
Petr Savicky
savicky at praha1.ff.cuni.cz
Mon Jan 10 18:07:15 CET 2011
On Mon, Jan 10, 2011 at 03:21:18PM +0100, joke R wrote:
> Dear all,
>
> I have a problem with arrays.
> Simplified, I have two arrays:
>
> A =
> [,,1]
> 1 2 3
> 4 5 6
> 7 8 9
>
> [,,2]
> 10 11 12
> 13 14 15
> 16 17 18
>
> B=
> 1 1 2
> 1 1 2
> 3 3 2
>
> Basically, B declares to which cluster the values of A belong to. Now I
> want an array C where for each cluster of B, the mean of A is shown (the
> mean on the first 2 dimensions):
>
> C=
> [,,1]
> 3 3 6
> 3 3 6
> 7.5 7.5 6
> [,,2]
> 12 12 15
> 12 12 15
> 16.5 16.5 15
>
> The following short solution works:
>
> A <-
> aperm(array(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18),dim=c(3,3,2)),c(2,1,3))
> B <- matrix(c(1,1,2,1,1,2,3,3,2),nrow=3,byrow=TRUE)
> C <- array(NA,dim=c(3,3,2))
>
> for(a in 1:3){
> TS <- matrix(A[B==a],nrow=2,byrow=TRUE)
> C[B==a] <- rep(rowMeans(TS),each=sum(B==a))
> }
>
> But unfortunately, my dataset is a 64x64x64x120 array, and I have
> approximately 29000 clusters, and I'd have to repeat the process on 500
> simulated datasets.
> If I try the code above, it takes about 2 hours...
>
> Strange enough, it is the first line in the loop that takes so long:
> reconstructing a dataset with only the datapoints belonging to a certain
> cluster in it...
Let me suggest to consider the following approach, which avoids the
first line of the above code, computes the means in a different way and
puts them back using the second line of the above code.
First, transform A into a matrix, where the first two dimensions
represent the rows and the remaining dimensions represent the columns.
This does not reorder the elements of the array, only changes the
dimension information.
A1 <- matrix(A, nrow=3*3, ncol=2)
Compute the means
B1 <- c(B)
C1 <- rowsum(A1, group=B1)/c(table(B1))
Put the means to the required positions
for(a in 1:3){
C[B==a] <- rep(C1[a, ], each=sum(B==a))
}
I hope this can help.
Petr Savicky.
More information about the R-help
mailing list