[Bioc-devel] Possible to export coerce2() from S4Vectors?

Bemis, Kylie k.bemis at northeastern.edu
Wed Nov 14 14:36:19 CET 2018


Hi Herve,

Thanks for the detailed reply. Using as() makes sense. Unfortunately my use case makes it a little more complicated.

The issue comes from a combination of factors:

- My DataFrame subclasses track additional metadata for each row, separate from the typical user-defined columns
- This metadata is checked to decide how to do cbind(...) or if cbind(...) makes sense for those objects
- cbind(...) ends up being called internally by some inherited assignment methods like [[<-
- Coercing to my subclass with as() results in incompatible metadata, causing cbind(...) to fail
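
A minimal sketch of the failure mode described in the list above, using toy classes built with the methods package only. "PlainTable", "RunTable", the 'run' slot and compatible() are all hypothetical stand-ins (they are not the real DataFrame subclasses or their cbind() logic); the point is only that the automatic coercion fills subclass slots from the prototype:

```r
library(methods)

## Toy stand-ins (hypothetical): "PlainTable" plays the role of DataFrame,
## "RunTable" a subclass that tracks one piece of per-row metadata.
setClass("PlainTable", representation(cols = "list"))
setClass("RunTable", contains = "PlainTable",
         representation(run = "character"))

## Hypothetical compatibility check of the kind a cbind() method might do.
compatible <- function(x, y) identical(x@run, y@run)

x <- new("RunTable", cols = list(a = 1:3), run = c("r1", "r1", "r2"))

## The automatic coercion takes the subclass slots from the prototype,
## so the per-row metadata comes back empty.
y <- as(new("PlainTable", cols = list(b = 4:6)), "RunTable")

y@run             # character(0)
compatible(x, y)  # FALSE
```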

I see a few solutions:

1. Using coerce2() works where as() doesn’t, because it takes an example of the “to” object rather than just the class, so compatible metadata can be copied directly from the “to” object, allowing cbind(...) to work as intended.

2. Create an exception to my class logic that allows the metadata to be missing, and change my cbind(...) implementation to ignore the metadata in the case that it is missing.

3. Supply my own version of methods like [[<-. I don’t like this one, since it should be unnecessary.
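
To illustrate what option (1) buys, here is a rough sketch with toy classes. coerce_like() is a hypothetical helper written for this example, not S4Vectors' coerce2(), and "BaseTable", "MetaTable" and the 'run' slot are invented names. It assumes the metadata of the example object is the right metadata for the result:

```r
library(methods)

## Toy stand-ins (hypothetical names), using only the 'methods' package.
setClass("BaseTable", representation(cols = "list"))
setClass("MetaTable", contains = "BaseTable",
         representation(run = "character"),
         validity = function(object) {
             n <- if (length(object@cols)) length(object@cols[[1L]]) else 0L
             if (length(object@run) != n)
                 "'run' must have one entry per row"
             else
                 TRUE
         })

## Hypothetical helper, NOT S4Vectors' coerce2(): coerce 'from' to the
## class of the example object 'to', then copy the metadata from 'to'.
coerce_like <- function(from, to) {
    ans <- as(from, class(to))  # automatic coercion, metadata still empty
    ans@run <- to@run           # take compatible metadata from the example
    validObject(ans)            # raises an error if the result is invalid
    ans
}

to   <- new("MetaTable", cols = list(a = 1:3), run = c("r1", "r1", "r2"))
from <- new("BaseTable", cols = list(b = 4:6))
ans  <- coerce_like(from, to)
```

With only the class name available, as in as(from, "MetaTable"), there is nothing to copy the 'run' values from.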

I can do (2), but I would need to rethink some of my other methods that expect that metadata to exist, so I wanted to check on the plans for coerce2() before making those changes.

What are your thoughts?

Thanks!
Kylie

~~~
Kylie Ariel Bemis
College of Computer and Information Science
Northeastern University
https://kuwisdelu.github.io





On Nov 13, 2018, at 8:55 PM, Pages, Herve <hpages at fredhutch.org> wrote:


Hi Kylie,

I've modified coerce2() in S4Vectors 0.21.5 so that `coerce2(from, to)` should now do the right thing when 'to' is a DataFrame derivative:

  https://github.com/Bioconductor/S4Vectors/commit/48e11dd2c8d474c63e09a69ee7d2d2ec35d7307a

With the following gotcha: this will work only if coercion (with as()) from DataFrame to the DataFrame derivative does the right thing. So I'm assuming that this coercion makes sense and can be supported. There are 2 possible situations:

1) The automatic coercion method from DataFrame to your DataFrame derivative (i.e. the coercion method automatically defined by the methods package) does the right thing. In this case coerce2() (and therefore [[<-) will also do the right thing on your DataFrame derivatives. For example:

  library(S4Vectors)
  setClass("MyDataFrameExtension", contains="DataFrame")

  ## WARNING: Don't trust selectMethod() here!
  selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
  # Error in selectMethod("coerce", c("DataFrame", "MyDataFrameExtension")) :
  #  no method found for signature DataFrame, MyDataFrameExtension

  as(DataFrame(), "MyDataFrameExtension")
  # MyDataFrameExtension with 0 rows and 0 columns

  ## The automatic coercion method is only created the 1st time it's used!
  ## So now selectMethod() shows it:
  selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
  # Method Definition:
  #
  # function (from, to = "MyDataFrameExtension", strict = TRUE)
  # {
  #     obj <- new("MyDataFrameExtension")
  #     as(obj, "DataFrame") <- from
  #     obj
  # }
  # <environment: namespace:S4Vectors>
  #
  # Signatures:
  #         from        to
  # target  "DataFrame" "MyDataFrameExtension"
  # defined "DataFrame" "MyDataFrameExtension"


  MDF <- new("MyDataFrameExtension")
  S4Vectors:::coerce2(list(aa=1:3, bb=21:23), MDF)
  # MyDataFrameExtension with 3 rows and 2 columns
  #          aa        bb
  #   <integer> <integer>
  # 1         1        21
  # 2         2        22
  # 3         3        23


2) The automatic coercion method from DataFrame to your DataFrame derivative doesn't do the right thing (e.g. it returns an invalid object). In this case you need to define this coercion (with a setAs() statement). This will allow coerce2() (and therefore [[<-) to do the right thing on your DataFrame derivatives.
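
As a rough sketch of situation 2, here is what such a setAs() statement could look like on toy classes from the methods package. "CoreTable", "TaggedTable" and the 'run' slot are hypothetical stand-ins for DataFrame and a derivative, and filling the metadata with NA is only an assumption about what a sensible default might be:

```r
library(methods)

## Toy stand-ins (hypothetical names), using only the 'methods' package.
setClass("CoreTable", representation(cols = "list"))
setClass("TaggedTable", contains = "CoreTable",
         representation(run = "character"),
         validity = function(object) {
             n <- if (length(object@cols)) length(object@cols[[1L]]) else 0L
             if (length(object@run) != n)
                 "'run' must have one entry per row"
             else
                 TRUE
         })

## Explicit coercion method: it replaces the automatic one and builds a
## valid object by supplying default metadata of the right length.
setAs("CoreTable", "TaggedTable", function(from) {
    n <- if (length(from@cols)) length(from@cols[[1L]]) else 0L
    new("TaggedTable", from, run = rep(NA_character_, n))
})

z <- as(new("CoreTable", cols = list(b = 4:6)), "TaggedTable")
validObject(z)  # would raise an error if 'z' were invalid
z@run           # three NA entries, one per row
```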

There is no plan at the moment to export coerce2() because this should not be needed. The idea is that developers should not need to define "coerce2" methods but instead make it work via the addition of the appropriate coercion methods. The only purpose of coerce2() is to support things like [[<- and endoapply(). Once coerce2() works properly, these things work out-of-the-box.

So to summarize: just make sure that a DataFrame can be coerced to your DataFrame derivative and [[<- and endoapply() will work out-of-the-box. It could be however that this coercion doesn't make sense and cannot be supported, in which case, we'll need to do something different. Let me know if that's the case.

H.


On 11/13/18 12:45, Bemis, Kylie wrote:

Dear all,

Are there any plans to export coerce2() from the S4Vectors namespace, like other exported internal utilities such as showAsCell() and setListElement()?

I have a couple of classes that inherit from DataFrame, and some inherited methods (like [[<-) break in certain situations due to calls to coerce2() that coerce arguments to a regular DataFrame (instead of my subclass). This could be fixed if I were able to implement a coerce2() method for my subclass.

Any suggestions on how to approach problems like this when inheriting from DataFrame and other Vector derivatives?

Many thanks,
Kylie

~~~
Kylie Ariel Bemis
College of Computer and Information Science
Northeastern University
https://kuwisdelu.github.io







_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319





