[Rd] difficulties with setMethod("[" and ...

Tue May 18 19:02:31 CEST 2010

On Tue, 18 May 2010 10:22:03 +0200, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Tony Plate <tplate at acm.org>
>>>>>>     on Mon, 17 May 2010 20:51:12 -0600 writes:
> 
>     > Jim, yes, I have dealt with that particular challenge that list(...)
>     > throws an error for a call like f(x,,,) where the empty args match
>     > to a ... formal argument.   Here are some fragments of code that I
>     > used to cope with this:
> 
>     > # to find the empty anon args, must work with the unevaluated dot args
>     > dot.args.uneval <- match.call(expand.dots=FALSE)$...
>     > if (length(dot.args.uneval))
>     >     missing.dot.args <- sapply(dot.args.uneval, function(arg)
>     >         is.symbol(arg) && as.character(arg)=="")
>     > else
>     >     missing.dot.args <- logical(0)
>     > ...
>     > # Now we can work with evaluated dot args.
>     > # Can't do dot.args <- list(...) because that will
>     > # stop with an error for missing args.
>     > dot.args <- mapply(dot.args.uneval, missing.dot.args,
>     >     FUN=function(arg, m) if (!m) eval(arg) else NULL)
> 
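
The fragments above can be put together into a small runnable sketch. The helper name dots_or_null is hypothetical and not from the thread. It tests for the empty symbol by index with identical(), which is one of the alternative detection methods hinted at further down, and it evaluates the remaining arguments in the caller's frame, which is an assumption about the appropriate environment.

```r
# Hypothetical helper: evaluate the arguments matched to '...', returning
# NULL in place of any that were left empty, as in f(1, , 3).
dots_or_null <- function(...) {
  # Unevaluated '...' arguments; NULL when none were supplied.
  uneval <- match.call(expand.dots = FALSE)$...
  # An empty argument is stored as the empty symbol, quote(expr = ).
  is_empty <- vapply(seq_along(uneval),
                     function(k) identical(uneval[[k]], quote(expr = )),
                     logical(1))
  # Assumption: the caller's frame is the right place to evaluate in.
  env <- parent.frame()
  lapply(seq_along(uneval), function(k) {
    if (is_empty[[k]]) NULL else eval(uneval[[k]], env)
  })
}

res <- dots_or_null(1 + 1, , "a")
str(res)   # a list of 3 whose second element is NULL
```

Inside a "[" method the same match.call() step would go in the method body itself, not in a separate helper.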
> I don't have much time at the moment, to delve into Jim's code,
> nor to analyze what exactly Tony's does.
> 
> Some notes however which I deem important:
> 
> 1) My experience in writing many S4 methods for "["  -- with the
>    Matrix package, but also for 'Rmpfr' -- is that you 
>    really need to work  with  nargs()
>    rather than with things like  length(list(...))
> 
> 2) If you really want to be compatible with the very rich
>    semantics of S and R subsetting, you need to spend more time
>    than you anticipate.
> 
>    - negative subscripts, names, logicals
>    
>    - A[i]  for an array A   where i can be a vector and then the
>      array is treated as if it had no dim() attribute
>    - A[i]  for an array A   where i is a *matrix* with k columns
>      	   where  k <- length(dim(A))  --- (k = 2 for matrices)
>    - A[]
>    ....
> 

I might not get all the way there, but random access and logical subsetting
seem very doable.
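
For reference, the index forms listed above behave as follows on an ordinary base R array. The helper normalize_index is a hypothetical name: it is one simple way, under the assumption that the subscripts are numeric or logical (not character), to turn negative and logical subscripts into positive positions before touching a back end.

```r
A <- array(1:24, dim = c(2, 3, 4))

# A vector index treats the array as a plain vector (column-major order).
A[c(1, 24)]            # 1 24

# Negative subscripts drop elements; logical subscripts keep the TRUE ones.
length(A[-1])          # 23
A[A > 22]              # 23 24

# A matrix index with k = length(dim(A)) columns picks one element per row.
idx <- rbind(c(1, 1, 1), c(2, 3, 4))
A[idx]                 # 1 24

# An empty subscript returns the whole array, dim() intact.
identical(A[], A)      # TRUE

# Hypothetical helper: resolve negative or logical subscripts to positions.
normalize_index <- function(i, n) seq_len(n)[i]
normalize_index(-2, 5)                  # 1 3 4 5
normalize_index(c(TRUE, FALSE), 5)      # 1 3 5 (logical index is recycled)
```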

>   Are you sure you would not try to use
>     setClass('myExample', contains = "array", representation = ...)
>   rather than your
>     setClass('myExample', representation(x = "array", ...))
>   ?
>   You would get all the "[" (and other array methods) for free,
>   and would only need to specify those methods where 'myExample'
>   really differed from array-subsetting.

My example is completely contrived. My back end consists of HDF5 files, so the
'[' method calls C functions, which at the moment grab contiguous blocks.
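
The grab-a-slab-then-subset idea can be sketched in one dimension as follows. Both read_slab and subset_via_slab are hypothetical names; read_slab merely stands in for a back end that can only return contiguous ranges and is not an HDF5 call.

```r
# Hypothetical stand-in for a back end that can only read contiguous ranges.
read_slab <- function(data, from, to) data[seq.int(from, to)]

# Fetch the smallest contiguous block covering 'i', then pick out the
# requested elements in memory.
subset_via_slab <- function(data, i) {
  i <- seq_along(data)[i]            # resolve logical and negative subscripts
  if (length(i) == 0L) return(data[0L])
  lo <- min(i)
  slab <- read_slab(data, lo, max(i))
  slab[i - lo + 1L]                  # shift indices to be relative to the slab
}

x <- (1:20) * 10
subset_via_slab(x, c(10, 1))         # 100 10
subset_via_slab(x, x > 150)          # 160 170 180 190 200
```

For more than one dimension the same shift would be applied per dimension, which is where the handling of the '...' arguments comes in.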

> 
> 3) Lots of well-tested    setMethod("[", ....)  examples
>    are in the sources of the Matrix package.
> 
>    There, BTW, I found it useful to use
> 
>   ## for 'i' in x[i] or A[i,] : (numeric = {double, integer})
>   setClassUnion("index", members =  c("numeric", "logical", "character"))
> 
>  and then, e.g.,  a simple example method ..
> 
>   setMethod("[", signature(x = "denseMatrix", i = "index", j = "missing",
> 			   drop = "logical"),
> 	    function (x, i, j, ..., drop) {
> 		if((na <- nargs()) == 3)
> 		    r <- as(x, "matrix")[i, drop=drop]
> 		else if(na == 4)
> 		    r <- as(x, "matrix")[i, , drop=drop]
> 		else stop("invalid nargs()= ",na)
> 		if(is.null(dim(r))) r else as(r, geClass(x))
> 	    })
>    
>   The examples in the "Rmpfr" package are far fewer and simpler.
> 
>   To find the methods, for both, use  
>       fgrep 'setMethod("["' R/*R
>   if you are on a decent OS and inside the package source directory.
>  

Thanks, I had briefly scanned the Matrix package, and the pointer to
nargs is extremely helpful.
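
A minimal sketch of the nargs() pattern, modelled on the Matrix method quoted above but with drop left missing, so the counts are 2 and 3 instead of 3 and 4. The class "Wrapped" and its slot 'm' are hypothetical and exist only for illustration.

```r
library(methods)

# Hypothetical wrapper class holding a matrix in slot 'm'.
setClass("Wrapped", representation(m = "matrix"))

setMethod("[", signature(x = "Wrapped", i = "numeric", j = "missing",
                         drop = "missing"),
          function(x, i, j, ..., drop) {
            # nargs() counts the supplied arguments, empty ones included:
            # x[i] gives 2, x[i, ] gives 3.
            na <- nargs()
            if (na == 2) x@m[i]                        # vector-style x[i]
            else if (na == 3) x@m[i, , drop = TRUE]    # row selection x[i, ]
            else stop("invalid nargs() = ", na)
          })

w <- new("Wrapped", m = matrix(1:6, nrow = 2))
w[2]      # 2
w[2, ]    # 2 4 6
```

missing(j) is TRUE in both calls, so only nargs() tells x[i] apart from x[i, ].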

thanks again, jim

> --
> Martin Maechler, ETH Zurich
> 
>     > Let me know if you need any further explanation.
> 
>     > Several warnings:
>     > * I was using this code with S3 generics and methods.
>     > * There are quite possibly better ways of detecting empty unevaluated
>     > arguments than 'is.symbol(arg) && as.character(arg)==""'.
>     > * You'll probably want to be careful that the eval() in the last line
>     > is using the appropriate environment for your application.
> 
>     > I didn't read your code in detail, so apologies if the above is
>     > off-the-point, but your verbal description of the problem and the
>     > coding style and comments in the "[" method for "myExample" triggered
>     > my memory.
> 
>     > -- Tony Plate
> 
>     > On 05/17/2010 07:48 PM, James Bullard wrote:
>     >> Apologies if I am not understanding something about how things are
>     >> being handled when using S4 methods, but I have been unable to find
>     >> an answer to my problem for some time now.
>     >> 
>     >> Briefly, I am associating the generic '[' with a class which I wrote
>     >> (here: myExample). The underlying back-end allows me to read
>     >> contiguous slabs, e.g., 1:10, but not c(1, 10). I want to shield the
>     >> user from this infelicity, so I grab the slab and then subset in
>     >> memory. The main problem is with datasets with dim(.) > 2. In this
>     >> case, the '...' argument doesn't seem to be in a reasonable state.
>     >> When it is indeed missing then it properly reports that fact;
>     >> however, when it is not missing it reports that it is not missing,
>     >> but then the call to list(...) throws an "argument is missing"
>     >> error.
>     >> 
>     >> I cannot imagine that this has not occurred before, so I am
>     >> expecting someone might be able to point me to some example code. I
>     >> have attached some code demonstrating my general problem ((A) and
>     >> (B) below) as well as the outline of the sub-selection code. I have
>     >> to say that coding this has proven non-trivial and any thoughts on
>     >> cleaning up the mess are welcome.
>     >> 
>     >> As always, thanks for the help.
>     >> 
>     >> Jim
>     >> 
>     >> require(methods)
>     >> 
>     >> setClass('myExample', representation = representation(x = "array"))
>     >> 
>     >> myExample <- function(dims = c(1,2)) {
>     >>   a <- array(rnorm(prod(dims)))
>     >>   dim(a) <- dims
>     >>   obj <- new("myExample")
>     >>   obj@x <- a
>     >>   return(obj)
>     >> }
>     >> 
>     >> setMethod("dim", "myExample", function(x) return(dim(x@x)))
>     >> 
>     >> functionThatCanOnlyGrabContiguous <- function(x, m, kall) {
>     >>   kall$x <- x@x
>     >>   for (i in 1:nrow(m)) {
>     >>     kall[[i+2]] <- seq.int(m[i,1], m[i,2])
>     >>   }
>     >>   print(as.list(kall))
>     >>   return(eval(kall))
>     >> }
>     >> 
>     >> setMethod("[", "myExample", function(x, i, j, ..., drop = TRUE) {
>     >>   if (missing(...)) {
>     >>     print("Missing!")
>     >>   }
>     >>   e <- list(...)
>     >>   m <- matrix(nrow = length(dim(x)), ncol = 2)
>     >> 
>     >>   if (missing(i))
>     >>     m[1,] <- c(1, dim(x)[1])
>     >>   else
>     >>     m[1,] <- range(i)
>     >> 
>     >>   if (length(dim(x)) > 1) {
>     >>     if (missing(j))
>     >>       m[2,] <- c(1, dim(x)[2])
>     >>     else
>     >>       m[2,] <- range(j)
>     >> 
>     >>     k <- 3
>     >>     while (k <= nrow(m)) {
>     >>       if (k-2 <= length(e))
>     >>         m[k,] <- range(e[[k-2]])
>     >>       else
>     >>         m[k,] <- c(1, dim(x)[k])
>     >>       k <- k + 1
>     >>     }
>     >>   }
>     >>   kall <- match.call()
>     >>   d <- functionThatCanOnlyGrabContiguous(x, m, kall)
>     >> 
>     >>   kall$x <- d
>     >>   if (! missing(i)) {
>     >>     kall[[3]] <- i - min(i) + 1
>     >>   }
>     >>   if (! missing(j)) {
>     >>     kall[[4]] <- j - min(j) + 1
>     >>   } else {
>     >>     if (length(dim(x)) > 1)
>     >>       kall[[4]] <- seq.int(1, dim(x)[2])
>     >>   }
>     >>   ## XXX: Have to handle remaining dimensions, but since I can't
>     >>   ## really get a clean '...' it is on hold.
>     >> 
>     >>   eval(kall)
>     >> })
>     >> 
>     >> ## ############### 1-D
>     >> m <- myExample(10)
>     >> m@x[c(1,5)] == m[c(1, 5)]
>     >> 
>     >> ## ############### 2-D
>     >> m <- myExample(c(10, 10))
>     >> m@x[c(1,5), c(1,5)] == m[c(1,5), c(1,5)]
>     >> m@x[c(5, 2),] == m[c(5,2),]
>     >> 
>     >> ## ############### 3-D
>     >> m <- myExample(c(1,3,4))
>     >> 
>     >> ## (A) doesn't work
>     >> m@x[1,1:2,] == m[1,1:2,]
>     >> 
>     >> ## (B) nor does this for different reasons.
>     >> m[1,,1]
>     >> m@x[1,,1]
>     >> 
>     >> 
>     >>> sessionInfo()
>     >>> 
>     >> R version 2.11.0 (2010-04-22)
>     >> x86_64-pc-linux-gnu
>     >> 
>     >> locale:
>     >> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>     >> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>     >> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>     >> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>     >> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>     >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>     >> 
>     >> attached base packages:
>     >> [1] stats     graphics  grDevices utils     datasets  methods   base
>     >> 
>     >> loaded via a namespace (and not attached):
>     >> [1] tools_2.11.0
>     >> 
>     >> ______________________________________________
>     >> R-devel at r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>     >> 
>     >> 
>     >> 
> 