[Bioc-devel] Overloading subset operator for an S4 object with more than two dimensions

Christian Arnold christian.arnold at embl.de
Mon May 18 15:06:33 CEST 2015


Thanks for your input, highly appreciated!

I can see that the semantics of "[" are violated, so I agree that
overloading the "subset" method is probably a better way to go.
Essentially, the object stores several individual-specific count
matrices from RNA-Seq experiments in a potentially allele-specific
(read-group-specific) manner. So the dimensions to subset on are the
read groups, the rows and columns of the matrices, and the individuals
themselves.

So I guess overloading the subset method with four arguments, each
corresponding to one of the dimensions along which a subset makes sense
for this kind of object, is the way to go.
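
A minimal sketch of that idea, assuming the toy class "A" from my
original message (quoted below) as a stand-in for the real object. The
argument names i, j, k, and l are placeholders, not a settled interface,
and the four numeric slots only imitate the read groups, rows, columns,
and individuals. It uses missing() with plain if statements rather than
ifelse(), because ifelse() returns a result the length of its test and
so would truncate a vector index to its first element.

```r
library(methods)

# Toy class from the original message: four numeric slots standing in
# for the four "dimensions" of the real object (an assumption).
setClass("A", representation(a = "numeric", b = "numeric",
                             c = "numeric", d = "numeric"))

# subset() is an S3 generic in base R with formals (x, ...);
# setGeneric() turns it into an S4 generic so a method can be attached.
setGeneric("subset")

# Hypothetical method: one optional index argument per dimension.
# A missing index leaves the corresponding slot untouched.
setMethod("subset", "A", function(x, i, j, k, l, ...) {
  if (!missing(i)) x@a <- x@a[i]
  if (!missing(j)) x@b <- x@b[j]
  if (!missing(k)) x@c <- x@c[k]
  if (!missing(l)) x@d <- x@d[l]
  validObject(x)  # make sure the subsetted object is still valid
  x
})

obj <- new("A", a = 1:5, b = 1:5, c = 1:5, d = 1:5)

# Vector indices work, and dimensions can be skipped by name.
res <- subset(obj, i = 1:2, k = 3:4, l = 5)
```

Named arguments also make the call easier to read than four positional
indices, which is part of why "subset" seems a better fit than "[" here.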

Thanks,
Christian


On 14.05.2015 15:57, Michael Lawrence wrote:
> I agree with Wolfgang that the semantics of [ are being violated here. 
> It would though help if you could be a little less vague about your 
> intent. What is this data structure going to store, how should it behave?
>
> On Thu, May 14, 2015 at 3:35 AM, Christian Arnold 
> <christian.arnold at embl.de> wrote:
>
>     Hi there,
>
>     I am about to develop a Bioconductor package that implements a
>     custom S4 object, and I am currently thinking about a few issues,
>     including the following:
>
>     Say we have an S4 object that stores a lot of information in
>     different slots. Assume that it does make sense to extract
>     information out of this object in four different "dimensions"
>     (conceptually similar to a four-dimensional object), so one would
>     like to use the subset "[" operator for this, but extending beyond
>     the "typical" one or two dimensions to 4:
>
>     setClass("A",
>     representation=representation(a="numeric",b="numeric",c="numeric",d="numeric"))
>     a = new("A", a=1:5,b=1:5,c=1:5,d=1:5)
>
>     Now it would be nice to do stuff like a[1,2,3:4,5], which should
>     simply return the selected elements in slots a, b, c, and d,
>     respectively. So a[1,2,3:4,5] would return:
>
>     An object of class "A"
>     Slot "a":
>     [1] 1
>
>     Slot "b":
>     [1] 2
>
>     Slot "c":
>     [1] 3 4
>
>     Slot "d":
>     [1] 5
>
>     This is how far I've come:
>
>     setMethod("[", c("A", "ANY", "ANY","ANY"),
>               function(x, i, j, ..., drop=TRUE)
>               {
>                 dots <- list(...)
>                 if (length(dots) > 2) {
>                   stop("Too many arguments, must be four dimensional")
>                 }
>
>                 # Parse the extra two dimensions that we need from the ... argument
>                 k = ifelse(length(dots) > 0 , dots[[1]], c(1:5))
>                 l = ifelse(length(dots) == 2, dots[[2]], c(1:5))
>
>                 initialize(x, a=x@a[i],b=x@b[j],c=x@c[k],d=x@d[l])
>               })
>
>     This works for stuff like a[1,2,3, 4], but fails with a general
>     error if one of the indices is a vector such as a[1:2,2,3, 4] or
>     a[1,2,3,4:5].
>
>
>     So, in summary, my questions are:
>     1. Is there a reasonable way of achieving the 4-dimensional
>     subsetting that works as a user would expect it to work?
>     2. Does it make more sense to write a custom function instead to
>     achieve this, such as subsetObject() without overloading "["
>     explicitly? What are the Bioconductor recommendations here?
>
>     I'd appreciate any help, suggestions, etc!
>
>     Thanks,
>     Christian
>
>     _______________________________________________
>     Bioc-devel at r-project.org mailing list
>     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>

-- 
—————————————————————————
Christian Arnold, PhD
Staff Bioinformatician

SCB Unit - Computational Biology
Joint appointment Genome Biology
Joint appointment European Bioinformatics Institute (EMBL-EBI)

European Molecular Biology Laboratory (EMBL)
Meyerhofstrasse 1; 69117, Heidelberg, Germany

Email: christian.arnold at embl.de
Phone: +49(0)6221-387-8472
Web: http://www.embl.de/research/units/scb/zaugg/

