[Rd] arraytake for extracting subarrays from multidimensional arrays
Robin Hankin
r.hankin at noc.soton.ac.uk
Thu Oct 19 15:40:59 CEST 2006
On 19 Oct 2006, at 14:26, Gabor Grothendieck wrote:
> Note that it can also be done with do.call:
>
> a <- array(1:24, 2:4)
> L <- list(TRUE, 1:3, c(4, 2))
> do.call("[", c(list(a), L))
>
aargggh, you beat me to it. I didn't think to pass TRUE to "[".
I'll stick it in the package with joint attribution to Gabor and Balaji
and document it with apltake() and apldrop().
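
For the archive, a minimal sketch of how the do.call idiom could be wrapped so that it accepts the same NULL-means-everything convention as Balaji's arraytake(). The function name arraytake_docall and the drop argument are made up here for illustration; this is not the code that goes into any package.

```r
# Hypothetical wrapper around Gabor's do.call("[", ...) idiom.
# The name arraytake_docall and the drop argument are illustrative assumptions.
arraytake_docall <- function(x, indlist, drop = TRUE) {
  # One index vector (or NULL) is expected per dimension of x.
  stopifnot(length(dim(x)) == length(indlist))

  # NULL means "keep every element along this dimension";
  # a single TRUE index is recycled by "[" and selects everything.
  indlist <- lapply(indlist, function(ind) if (is.null(ind)) TRUE else ind)

  # Equivalent to x[indlist[[1]], indlist[[2]], ..., drop = drop],
  # without hard coding the number of commas.
  do.call("[", c(list(x), indlist, list(drop = drop)))
}

a <- array(1:24, 2:4)
res <- arraytake_docall(a, list(NULL, 1:3, c(4, 2)))
identical(res, a[, 1:3, c(4, 2)])
```

Because no code string is built and parsed, the index vectors are passed through as they are, so logical, negative, or character indices should behave as they do in an ordinary call to "[".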
best wishes all
rksh
> On 10/19/06, Balaji S. Srinivasan <balajis at stanford.edu> wrote:
>> Hi,
>>
>> I recently encountered a problem with array subsetting and came up with a
>> fix. Given an array of arbitrary dimensions, in which the number of
>> dimensions is only known at runtime, I wanted to extract a subarray. The
>> main issue with doing this is that in order to extract a subarray from an
>> array of (say) 4 dimensions you usually specify something like this
>>
>> a.subarray <- a[,c(4,2),1:5,]
>>
>> However, if your code needs to handle an array with an arbitrary number of
>> dimensions then you can't hard code the number of commas while writing the
>> code. (Regarding motivation, the reason this came up is because I wanted to
>> do some toy problems involving conditioning on multiple variables in a
>> multidimensional joint pmf.)
>>
>> I looked through commands like slice.index and so on, but they seemed to
>> require reshaping and big logical matrix intermediates, which were not
>> memory efficient enough for me. apltake in the magic package was the closest
>> but it only allowed subsetting of contiguous indices from either the first
>> or last element in any given dimension. It was certainly possible to call
>> apltake multiple times to extract arbitrary subarrays via combinations of
>> index intervals for each dimension, and then combine them with abind as
>> necessary, but this did not seem elegant.
>>
>> Anyway, I then decided to simply generate code with parse and eval. I found
>> this post by Henrik Bengtsson which had the same idea:
>>
>> http://tolstoy.newcastle.edu.au/R/devel/05/11/3266.html
>>
>> I just took that code one step further and put together a utility function
>> that I think might be fairly useful. I haven't completely robustified it
>> against all kinds of pathological inputs, but if there is any interest from
>> the development team it would be nice to add an error-checked version of
>> this to R (or I guess I could keep it in a package).
>>
>>
>> Simple usage example:
>> ------
>> > source("arraytake.R")
>> > a <- array(1:24,c(2,3,4))
>>
>> > a[,1:3,c(4,2)] ##This invocation requires hard coding the number of dimensions of a
>> , , 1
>>
>>      [,1] [,2] [,3]
>> [1,]   19   21   23
>> [2,]   20   22   24
>>
>> , , 2
>>
>>      [,1] [,2] [,3]
>> [1,]    7    9   11
>> [2,]    8   10   12
>>
>>
>> > arraytake(a,list(NULL,1:3,c(4,2))) ##This invocation does not, and produces the same result
>> , , 1
>>
>>      [,1] [,2] [,3]
>> [1,]   19   21   23
>> [2,]   20   22   24
>>
>> , , 2
>>
>>      [,1] [,2] [,3]
>> [1,]    7    9   11
>> [2,]    8   10   12
>>
>>
>>
>> Code below:
>> --------
>> arraytake <- function(x, indlist) {
>>
>>   # Returns subarrays of arrays with an arbitrary number of dimensions.
>>   # 1) Let x be a multidimensional array with an arbitrary number of
>>   #    dimensions.
>>   # 2) Let indlist be a list of vectors. The length of indlist is the same
>>   #    as the number of dimensions in x. Each element of indlist is a
>>   #    vector which specifies which indexes to extract in the corresponding
>>   #    dimension. If the element of indlist is NULL, then we return all
>>   #    elements in that dimension.
>>
>>   # The main way this works is by programmatically building up a
>>   # comma-separated argument to "[" as a string and then simply evaluating
>>   # that expression. This way one does not need to specify the number of
>>   # commas.
>>
>>   if (length(dim(x)) != length(indlist)) {
>>     # fail loudly instead of silently returning NULL
>>     stop("length(indlist) must equal the number of dimensions of x")
>>   }
>>
>>   # First build up a string w/ indices for each dimension
>>   d <- length(indlist)  # number of dims
>>   indvecstr <- character(d)
>>   for (i in seq_len(d)) {
>>     if (is.null(indlist[[i]])) {
>>       indvecstr[i] <- ""
>>     } else {
>>       indvecstr[i] <-
>>         paste("c(", paste(indlist[[i]], collapse = ","), ")", sep = "")
>>     }
>>   }
>>
>>   # Then build up the argument string to "["
>>   argstr <- paste(indvecstr, collapse = ",")
>>   argstr <- paste("x[", argstr, "]", sep = "")
>>
>>   # Finally, return the subsetted array
>>   return(eval(parse(text = argstr)))
>> }
>>
>>
>> --
>> Dr. Balaji S. Srinivasan
>> Stanford University
>> Depts. of Statistics and Computer Science
>> 318 Campus Drive, Clark Center S251
>> (650) 380-0695
>> balajis at stanford.edu
>> http://jinome.stanford.edu
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743