[Bioc-devel] Possible to export coerce2() from S4Vectors?

Michael Lawrence lawrence.michael at gene.com
Wed Nov 14 18:38:11 CET 2018


The use of c() in the implementation of [[<- is problematic: [[<- has
insertion semantics and preserves the overall structure of x, while c()
combines two or more peer data structures, so it is difficult to define
the correct logic through dispatch.
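
As a rough illustration of that distinction, here is a sketch using toy
classes only (Frame, IndexedFrame, insert_col() and combine_cols() are
hypothetical names, not part of S4Vectors). Insertion touches a single slot,
so the subclass and its extra slot survive; combining peers builds a new
object and has to decide what class and metadata the result gets:

```r
library(methods)

# Toy stand-ins for DataFrame and a subclass carrying per-row ids (hypothetical).
setClass("Frame", representation(cols = "list"))
setClass("IndexedFrame", contains = "Frame", representation(ids = "numeric"))

# Insertion: modify one slot in place; the class and 'ids' of x are untouched.
insert_col <- function(x, name, value) {
  x@cols[[name]] <- value
  x
}

# Combination of peers: builds a new object, so the result class and any
# extra slots must be chosen explicitly. Here they are simply dropped.
combine_cols <- function(x, y) {
  new("Frame", cols = c(x@cols, y@cols))
}

idf <- new("IndexedFrame", cols = list(a = 1:3), ids = c(0.1, 0.2, 0.3))
inserted <- insert_col(idf, "b", 4:6)
combined <- combine_cols(idf, new("Frame", cols = list(b = 4:6)))

is(inserted, "IndexedFrame")  # the subclass and its ids are kept
is(combined, "IndexedFrame")  # the combination fell back to plain Frame
```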

The dispatch on ... is not well documented. I will try to improve that, as
soon as I understand it myself. But no matter what, your cbind() method
will need to uplift ordinary DataFrames to IndexedDataFrame.
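
One possible shape for such an uplift, sketched with toy classes rather than
the real S4Vectors ones (Frame, IndexedFrame, uplift() and bind_cols() are
hypothetical names, and copying the ids from the first indexed argument is an
assumption about what "uplift" should mean here):

```r
library(methods)

# Toy stand-ins (hypothetical), not the S4Vectors classes.
setClass("Frame", representation(cols = "list"))
setClass("IndexedFrame", contains = "Frame", representation(ids = "numeric"))

# Promote a plain Frame to IndexedFrame, borrowing ids from a template.
uplift <- function(x, template) {
  if (is(x, "IndexedFrame")) x else new("IndexedFrame", x, ids = template@ids)
}

bind_cols <- function(...) {
  args <- list(...)
  # The first argument that already carries ids serves as the template.
  template <- Find(function(a) is(a, "IndexedFrame"), args)
  if (is.null(template)) stop("need at least one IndexedFrame")
  # Uplift plain Frames before checking ids, so every argument has the slot.
  args <- lapply(args, uplift, template = template)
  ok <- vapply(args, function(a) identical(a@ids, template@ids), logical(1))
  if (!all(ok)) stop("ids must match")
  new("IndexedFrame",
      cols = do.call(c, lapply(args, function(a) a@cols)),
      ids = template@ids)
}

idf <- new("IndexedFrame", cols = list(a = 1:3), ids = c(0.1, 0.2, 0.3))
plain <- new("Frame", cols = list(b = 4:6))
res <- bind_cols(idf, plain)
```

With the uplift done first, the ids check no longer trips over arguments
that lack the slot, while two indexed arguments with different ids are
still rejected.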

Michael

On Wed, Nov 14, 2018 at 7:52 AM Bemis, Kylie <k.bemis at northeastern.edu>
wrote:

> Hi Michael,
>
> Here is a simple example of what I’m trying to do:
>
> library(S4Vectors)
>
> setClass("IndexedDataFrame",
>     contains="DataFrame",
>     slots=c(ids="numeric"))
>
> # track additional ID metadata w/ special rules
> IndexedDataFrame <- function(ids, ...) {
>     x <- DataFrame(...)
>     new("IndexedDataFrame",
>         ids=ids,
>         rownames=rownames(x),
>         nrows=nrow(x),
>         listData=x@listData,
>         elementMetadata=mcols(x))
> }
>
> # check for matching IDs before cbind-ing
> setMethod("cbind", "IndexedDataFrame",
>     function(...) {
>         args <- list(...)
>         ids <- args[[1L]]@ids
>         ok <- vapply(args, function(a) {
>             # check for compatible IDs
>             identical(a@ids, ids)
>         }, logical(1))
>         if ( !all(ok) )
>             stop("ids must match")
>         x <- callNextMethod(...)
>         new(class(args[[1L]]),
>             ids=ids,
>             rownames=rownames(x),
>             nrows=nrow(x),
>             listData=x@listData,
>             elementMetadata=mcols(x))
>     })
>
> set.seed(1)
> idf <- IndexedDataFrame(ids=runif(10), a=1:10, b=11:20)
> idf$c <- 21:30
>
> Error in identical(a@ids, ids) :
>   no slot of name "ids" for this object of class "DataFrame"
> In addition: Warning message:
> In methods:::.selectDotsMethod(classes, .MTable, .AllMTable) :
>   multiple direct matches: "IndexedDataFrame", "DataFrame"; using the first of these
>
> Specific examples where I use this pattern are new MassDataFrame and
> PositionDataFrame classes in Cardinal, which require associated m/z-values
> and pixel coordinates as additional metadata. Current source code is here:
>
>
> https://github.com/kuwisdelu/Cardinal/blob/master/R/methods2-MassDataFrame.R
>
> https://github.com/kuwisdelu/Cardinal/blob/master/R/methods2-PositionDataFrame.R
>
> In older versions of Cardinal, similar versions of these classes extended
> AnnotatedDataFrame and used regular columns for this metadata, while
> requiring those columns to follow a specific naming scheme. This proved
> fragile, difficult to maintain, and easily broken, so I am now using slots
> to contain this metadata so they can be validated independently of whatever
> user-supplied columns exist.
>
> Kylie
>
> ~~~
> Kylie Ariel Bemis
> College of Computer and Information Science
> Northeastern University
> kuwisdelu.github.io
>
>
>
>
>
> On Nov 14, 2018, at 10:12 AM, Michael Lawrence <lawrence.michael at gene.com>
> wrote:
>
> I don't want to derail this thread, but why is coerce2() necessary? Would
> it be possible to fold its logic into as() without breaking too much?
>
> Kylie,
>
> It would help to see your code, with some pointers to where things break.
>
> Michael
>
> On Wed, Nov 14, 2018 at 5:36 AM Bemis, Kylie <k.bemis at northeastern.edu>
> wrote:
>
>> Hi Herve,
>>
>> Thanks for the detailed reply. Using as() makes sense. Unfortunately my
>> use case makes it a little more complicated.
>>
>> The issue comes from a combination of factors:
>>
>> - My DataFrame subclasses track additional metadata for each row,
>> separate from the typical user-defined columns
>> - This metadata is checked to decide how to do cbind(...) or if
>> cbind(...) makes sense for those objects
>> - cbind(...) ends up being called internally by some inherited assignment
>> methods like [[<-
>> - Coercing to my subclass with as() results in incompatible metadata,
>> causing cbind(...) to fail
>>
>> I see a few solutions:
>>
>> 1. Using coerce2() works where as() doesn’t, because it takes an example
>> of the “to” object rather than just the class, so compatible metadata can
>> be copied directly from the “to” object, allowing cbind(…) to work as
>> intended.
>>
>> 2. Create an exception to my class logic that allows the metadata to be
>> missing, and change my cbind(…) implementation to ignore the metadata in
>> the case that it is missing.
>>
>> 3. Supply my own version of methods like [[<-. I don’t like this one,
>> since it should be unnecessary.
>>
>> I can do (2), but I would need to rethink some of my other methods that
>> expect that metadata to exist, so I wanted to check on the plans for
>> coerce2() before making those changes.
>>
>> What are your thoughts?
>>
>> Thanks!
>> Kylie
>>
>> ~~~
>> Kylie Ariel Bemis
>> College of Computer and Information Science
>> Northeastern University
>> kuwisdelu.github.io
>>
>>
>>
>>
>>
>> On Nov 13, 2018, at 8:55 PM, Pages, Herve <hpages at fredhutch.org> wrote:
>>
>>
>> Hi Kylie,
>>
>> I've modified coerce2() in S4Vectors 0.21.5 so that `coerce2(from, to)`
>> should now do the right thing when 'to' is a DataFrame derivative:
>>
>>
>> https://github.com/Bioconductor/S4Vectors/commit/48e11dd2c8d474c63e09a69ee7d2d2ec35d7307a
>>
>> With the following gotcha: this will work only if coercion (with as())
>> from DataFrame to the DataFrame derivative does the right thing. So I'm
>> assuming that this coercion makes sense and can be supported. There are 2
>> possible situations:
>>
>> 1) The automatic coercion method from DataFrame to your DataFrame
>> derivative (i.e. the coercion method automatically defined by the methods
>> package) does the right thing. In this case coerce2() (and therefore [[<-)
>> will also do the right thing on your DataFrame derivatives. For example:
>>
>>   library(S4Vectors)
>>   setClass("MyDataFrameExtension", contains="DataFrame")
>>
>>   ## WARNING: Don't trust selectMethod() here!
>>   selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
>>   # Error in selectMethod("coerce", c("DataFrame", "MyDataFrameExtension")) :
>>   #  no method found for signature DataFrame, MyDataFrameExtension
>>
>>   as(DataFrame(), "MyDataFrameExtension")
>>   # MyDataFrameExtension with 0 rows and 0 columns
>>
>>   ## The automatic coercion method is only created the 1st time it's used!
>>   ## So now selectMethod() shows it:
>>   selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
>>   # Method Definition:
>>   #
>>   # function (from, to = "MyDataFrameExtension", strict = TRUE)
>>   # {
>>   #     obj <- new("MyDataFrameExtension")
>>   #     as(obj, "DataFrame") <- from
>>   #     obj
>>   # }
>>   # <environment: namespace:S4Vectors>
>>   #
>>   # Signatures:
>>   #         from        to
>>   # target  "DataFrame" "MyDataFrameExtension"
>>   # defined "DataFrame" "MyDataFrameExtension"
>>
>>
>>   MDF <- new("MyDataFrameExtension")
>>   S4Vectors:::coerce2(list(aa=1:3, bb=21:23), MDF)
>>   # MyDataFrameExtension with 3 rows and 2 columns
>>   #          aa        bb
>>   #   <integer> <integer>
>>   # 1         1        21
>>   # 2         2        22
>>   # 3         3        23
>>
>>
>> 2) The automatic coercion method from DataFrame to your DataFrame
>> derivative doesn't do the right thing (e.g. it returns an invalid object).
>> In this case you need to define this coercion (with a setAs() statement).
>> This will allow coerce2() (and therefore [[<-) to do the right thing on
>> your DataFrame derivatives.
>>
>> There is no plan at the moment to export coerce2() because this should
>> not be needed. The idea is that developers should not need to define
>> "coerce2" methods but instead make it work via the addition of the
>> appropriate coercion methods. The only purpose of coerce2() is to support
>> things like [[<- and endoapply(). Once coerce2() works properly, these
>> things work out-of-the-box.
>>
>> So to summarize: just make sure that a DataFrame can be coerced to your
>> DataFrame derivative and [[<- and endoapply() will work out-of-the-box. It
>> could be however that this coercion doesn't make sense and cannot be
>> supported, in which case, we'll need to do something different. Let me know
>> if that's the case.
>>
>> H.
>>
>>
>> On 11/13/18 12:45, Bemis, Kylie wrote:
>>
>> Dear all,
>>
>> Are there any plans to export coerce2() from the S4Vectors namespace,
>> like other exported internal utilities such as showAsCell() and
>> setListElement()?
>>
>> I have a couple classes that inherit from DataFrame, and some inherited
>> methods (like [[<-) break in certain situations due to calls to coerce2()
>> that coerce arguments to a regular DataFrame (instead of my subclass). This
>> could be fixed if I were able to implement a coerce2() method for my
>> subclass.
>>
>> Any suggestions on how to approach problems like this when inheriting
>> from DataFrame and other Vector derivatives?
>>
>> Many thanks,
>> Kylie
>>
>> ~~~
>> Kylie Ariel Bemis
>> College of Computer and Information Science
>> Northeastern University
>> kuwisdelu.github.io
>>
>>
>>
>>
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fredhutch.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>>
>>
>>
>
>
>
