[Bioc-devel] Virtual class for `matrix` and `DelayedArray`? (or better strategy for dealing with them both)

Hervé Pagès hpages at fredhutch.org
Mon Apr 30 19:58:45 CEST 2018


Just to mention that the issue with rowSums() on big DelayedMatrix
objects that you are referring to, Tim, is duly noted and will be one
of my first priorities once we're done with the release process.

Cheers,
H.

On 04/30/2018 09:57 AM, Tim Triche, Jr. wrote:
> Much obliged -- and the packages are terrific; I am not surprised that a
> big step is accompanied by some growing pains.
> Thanks to you and Herve and Keegan for enthusiastically chasing down, and
> spending your time fixing, this and other bugs.
> Having fiddled with bigMemory and bigMatrix backends for
> SummarizedExperiment over the years, I know how many kicks and stings exist
> under the covers, and greatly appreciate your (plural) efforts.
> 
> --t
> 
> On Mon, Apr 30, 2018 at 12:30 PM, Peter Hickey <peter.hickey at gmail.com>
> wrote:
> 
>> Tim: As the developer of DelayedMatrixStats (and enthusiastic 'canary down
>> the coal mine' user-dev of DelayedArray) I'm obviously invested in reducing
>> the confusion around these packages.
>>
>> I'm going to write some blog posts-cum-vignettes-cum-F1000 around these
>> issues over the coming weeks, with the ultimate goal of improving the
>> packages themselves.
>>
>> Pete
>>
>>
>>
>> On Mon., 30 Apr. 2018, 12:11 pm Tim Triche, Jr., <tim.triche at gmail.com>
>> wrote:
>>
>>> But if you merge methods like that, the method that fails can be that much
>>> more difficult to identify. It took a couple of weeks to chase that bug down
>>> properly, and it ended up coming down to rowMeans2 vs rowMeans.
>>>
>>> I suppose the merged/abstracted method allows you to centralize any such
>>> dispatch in one place and swap out ill-behaved methods once identified,
>>> so as long as DelayedArray/DelayedMatrixStats quirks are
>>> documented/understood, maybe it is better to create this union class?
>>>
>>> The Matrix/matrixStats/DelayedMatrix/DelayedMatrixStats situation has been
>>> "interesting" in practical terms, as seemingly simple abstractions appear
>>> to require more thought. That was my only point.
>>>
>>>
>>> --t
>>>
>>> On Mon, Apr 30, 2018 at 11:28 AM, Martin Morgan <
>>> martin.morgan at roswellpark.org> wrote:
>>>
>>>> But that issue will be fixed, so Tim's advice is inappropriate.
>>>>
>>>>
>>>> On 04/30/2018 10:42 AM, Tim Triche, Jr. wrote:
>>>>
>>>>> Don't do that.  Seriously, just don't.
>>>>>
>>>>> https://github.com/Bioconductor/DelayedArray/issues/16
>>>>>
>>>>> --t
>>>>>
>>>>> On Mon, Apr 30, 2018 at 10:02 AM, Elizabeth Purdom <
>>>>> epurdom at stat.berkeley.edu> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to extend my package to handle the `HDF5Matrix` class (or, more
>>>>>> generally, `DelayedArray`). I currently have S4 methods for the `matrix`
>>>>>> class. Usually I have a method for `SummarizedExperiment`, which calls
>>>>>> the method on `assay(x)`, and I want that method to work when
>>>>>> `assay(x)` is a `DelayedArray`.
>>>>>>
>>>>>> Most of my functions, however, do not require separate code depending on
>>>>>> whether `x` is a `matrix` or `DelayedArray`. They make use of
>>>>>> existing functions that will make that choice for me, e.g. rowMeans or
>>>>>> subsetting. My goal right now is compatibility, not cleverness, and I'm not
>>>>>> creating HDF5 methods to handle other cases. (If something doesn't
>>>>>> currently exist, then I just wrap `x` in `data.matrix` or `as.matrix`
>>>>>> and load the matrix into memory. For cleanliness, and for ease of updating
>>>>>> with appropriate methods in the future, I could make separate S4 functions
>>>>>> for these specific tasks to dispatch on, but that's outside the scope of my
>>>>>> question.) So for simplicity assume I don't really need to dispatch *my
>>>>>> code* -- that the methods I'm going to use do that.
>>>>>>
>>>>>> The natural solution for me seems to be `setClassUnion`, and I was
>>>>>> wondering if such a virtual class already exists? Or is there a better way
>>>>>> to handle this?
>>>>>>
>>>>>> Here's a simple example, using `rowMeans`:
>>>>>>
>>>>>> ```
>>>>>> setGeneric("myNewRowMeans", function(x, ...) {
>>>>>>   standardGeneric("myNewRowMeans")
>>>>>> })
>>>>>> setClassUnion("matrixOrDelayed", members = c("matrix", "DelayedArray"))
>>>>>>
>>>>>> #' @importFrom DelayedArray rowMeans
>>>>>> setMethod("myNewRowMeans",
>>>>>>   signature = "matrixOrDelayed",
>>>>>>   definition = function(x, ...) {
>>>>>>     # a lot of code independent of x
>>>>>>     print("This is a lot of code shared regardless of class of x\n")
>>>>>>     # a lot of code that depends on x, but is dispatched by the
>>>>>>     # functions called
>>>>>>     out <- rowMeans(x)
>>>>>>     # a lot of code based on the value of out
>>>>>>     out <- out + 1
>>>>>>     return(out)
>>>>>>   }
>>>>>> )
>>>>>> ```
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
> 
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


