[Bioc-devel] Import BSgenome class without attaching BiocGenerics (and others)?

Bhagwat, Aditya Aditya.Bhagwat using mpi-bn.mpg.de
Thu Sep 12 10:47:09 CEST 2019


Thanks Michael and Herve, 

Will do that then. 

I gather from this discussion that exporting a function from a core BioC package is reserved for functions
(1) whose name unambiguously communicates what they do
(2) that have the potential to be broadly used

And that as(BSgenome, 'GRanges') is felt not to comply with these.

Thanks for all the feedback; it has been very helpful.

Aditya

________________________________________
From: Pages, Herve [hpages using fredhutch.org]
Sent: Wednesday, September 11, 2019 5:29 PM
To: Michael Lawrence; Bhagwat, Aditya
Cc: bioc-devel using r-project.org
Subject: Re: [Bioc-devel] Import BSgenome class without attaching BiocGenerics (and others)?

Or more accurately:

   as(seqinfo(bsgenome)[seqlevelsInUse(grl)], "GRanges")

since not all seqlevels are necessarily "in use" (i.e. not necessarily
represented in seqnames(grl)).
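
To make the difference between seqlevels() and seqlevelsInUse() concrete,
here is a minimal sketch on toy data. The Seqinfo `si` and the GRangesList
`grl` are made-up stand-ins (assumptions for illustration, not a real
BSgenome); only seqlevels(), seqlevelsInUse() and the Seqinfo -> GRanges
coercion are taken from this thread.

```r
library(GenomeInfoDb)
library(GenomicRanges)

# Hypothetical toy genome standing in for seqinfo(bsgenome)
si <- Seqinfo(seqnames   = c("chr1", "chr2", "chr1_random"),
              seqlengths = c(1000L, 800L, 50L),
              genome     = "toy")

# Toy ranges: all three seqlevels are declared, but only chr1 and chr2 occur
gr_a <- GRanges("chr1", IRanges(10, 20), seqinfo = si)
gr_b <- GRanges("chr2", IRanges(5, 15),  seqinfo = si)
grl  <- GRangesList(a = gr_a, b = gr_b)

all_levels  <- seqlevels(grl)       # every declared sequence name
used_levels <- seqlevelsInUse(grl)  # only names that appear in seqnames(grl)

# Whole-chromosome ranges restricted to the sequences actually in use
genomeranges <- as(si[used_levels], "GRanges")
genomeranges
```

With these toy inputs, subsetting by seqlevels(grl) would also return a
range for chr1_random, whereas subsetting by seqlevelsInUse(grl) leaves it
out.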

H.

On 9/11/19 08:26, Hervé Pagès wrote:
> The unique seqnames is what we call the seqlevels. So just:
>
>    as(seqinfo(bsgenome)[seqlevels(grl)], "GRanges")
>
> H.
>
> On 9/11/19 07:42, Michael Lawrence wrote:
>> So why not just do:
>>
>> as(seqinfo(bsgenome)[unique(unlist(seqnames(grl)))], "GRanges")
>>
>> Michael
>>
>> On Wed, Sep 11, 2019 at 5:55 AM Bhagwat, Aditya
>> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
>>>
>>> Thanks Michael,
>>>
>>> The important detail is that I want to plot the relevant chromosomes only:
>>>
>>>      relevant_chromosomes <- GenomeInfoDb::seqnames(grangeslist)  %>%
>>>                              S4Vectors::runValue() %>%
>>>                              Reduce(union, .) %>%
>>>                              unique()
>>>
>>>      genomeranges <- GenomeInfoDb::seqinfo(grangeslist) %>%
>>>                      as('GRanges') %>%
>>>                     (function(gr){
>>>                         gr [ as.character(GenomeInfoDb::seqnames(gr)) %in%
>>>                              relevant_chromosomes ]
>>>                     })
>>>
>>>      kp <- karyoploteR::plotKaryotype(genomeranges)
>>>      karyoploteR::kpPlotRegions(kp, grangeslist) # grangeslist contains crispr target sites
>>>
>>>
>>> And, this process required as("GRanges")
>>>
>>>      #' Convert BSgenome into GRanges
>>>      #' @param from BSgenome, e.g. BSgenome.Mmusculus.UCSC.mm10::Mmusculus
>>>      #' @examples
>>>      #' require(magrittr)
>>>      #' BSgenome.Mmusculus.UCSC.mm10::BSgenome.Mmusculus.UCSC.mm10 %>%
>>>      #' as('GRanges')
>>>      #' @importClassesFrom BSgenome BSgenome
>>>      #' @export
>>>      methods::setAs( "BSgenome",
>>>                      "GRanges",
>>>                      function(from)  from %>%
>>>                                      GenomeInfoDb::seqinfo() %>%
>>>                                      as('GRanges'))
>>>
>>> Thank you for the feedback,
>>>
>>> Aditya
>>>
>>> ________________________________________
>>> From: Michael Lawrence [lawrence.michael using gene.com]
>>> Sent: Wednesday, September 11, 2019 2:31 PM
>>> To: Bhagwat, Aditya
>>> Cc: Pages, Herve; bioc-devel using r-project.org
>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
>>> BiocGenerics (and others)?
>>>
>>> I'm pretty surprised that the karyoploteR package does not accept a
>>> Seqinfo since it is plotting chromosomes. But again, please consider
>>> just doing as(seqinfo(bsgenome), "GRanges").
>>>
>>> On Wed, Sep 11, 2019 at 3:59 AM Bhagwat, Aditya
>>> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
>>>>
>>>> Hi Herve,
>>>>
>>>> Thank you for your responses.
>>>>  From your response, it is clear that the vcountPDict use case does
>>>> not need a BSgenome -> GRanges coercer.
>>>>
>>>> The karyoploteR use case still requires it, though, to allow
>>>> plotting of only the chromosomal BSgenome portions:
>>>>
>>>>      chromranges <- as(bsgenome, "GRanges")
>>>>      kp <- karyoploteR::plotKaryotype(chromranges)
>>>>      karyoploteR::kpPlotRegions(kp, crispr_target_sites)
>>>>
>>>> Or do you see any alternative for this purpose too?
>>>>
>>>> Aditya
>>>>
>>>> ________________________________________
>>>> From: Pages, Herve [hpages using fredhutch.org]
>>>> Sent: Wednesday, September 11, 2019 12:24 PM
>>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
>>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
>>>> BiocGenerics (and others)?
>>>>
>>>> Hi Aditya,
>>>>
>>>> On 9/11/19 01:31, Bhagwat, Aditya wrote:
>>>>> Hi Herve,
>>>>>
>>>>>
>>>>>   > It feels that a coercion method from BSgenome to GRanges should
>>>>>   > rather be defined in the BSgenome package itself.
>>>>>
>>>>> :-)
>>>>>
>>>>>
>>>>>   > Patch/PR welcome on GitHub.
>>>>>
>>>>> Okay. What pull/fork/check/branch protocol should be followed?
>>>>>
>>>>>
>>>>>   > Is this what you have in mind for this coercion?
>>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
>>>>>
>>>>> Yes.
>>>>>
>>>>> Perhaps it is also useful to share the wider context, to allow your and
>>>>> others' feedback for improved software design.
>>>>> I wanted to subset a BSgenome (without the _random or _unassigned),
>>>>> but Lori explained this is not possible
>>>>> <https://support.bioconductor.org/p/124367>.
>>>>>
>>>>> Instead Lori suggested coercing a BSgenome into a GRanges
>>>>> <https://support.bioconductor.org/p/123489>,
>>>>> which is a useful solution, but for which currently no exported S4
>>>>> method exists
>>>>> <https://support.bioconductor.org/p/124416>.
>>>>>
>>>>> So I defined an S4 coercer in my multicrispr package, making sure to
>>>>> properly import the BSgenome class
>>>>> <https://support.bioconductor.org/p/124442>.
>>>>>
>>>>> Then, after coercing a BSgenome into a GRanges, I can extract the
>>>>> chromosomes, after properly importing IRanges::`%in%`
>>>>> <https://support.bioconductor.org/p/124367>
>>>>>
>>>>
>>>> Looks like you don't need to coerce the BSgenome object to GRanges. See
>>>> https://support.bioconductor.org/p/123489/#124581
>>>>
>>>>
>>>> H.
>>>>
>>>>> Which I can then pass on to karyoploteR
>>>>> <https://support.bioconductor.org/p/124328>,
>>>>> for genome-wide plots of crispr target sites.
>>>>>
>>>>> A good moment also to say thank you to all of you who helped me out; it
>>>>> helps me to make multicrispr fit nicely into the BioC ecosystem.
>>>>>
>>>>> Speaking of BioC design philosophy, can any of you suggest concise and
>>>>> to-the-point reading material to deepen my understanding of the core
>>>>> BioC software design philosophy?
>>>>> I am trying to understand that better (which was the context for asking
>>>>> recently why there are three Vector -> data.frame coercers in S4Vectors
>>>>> <https://support.bioconductor.org/p/124491>)
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Aditya
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: Pages, Herve [hpages using fredhutch.org]
>>>>> Sent: Tuesday, September 10, 2019 6:45 PM
>>>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
>>>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
>>>>> BiocGenerics (and others)?
>>>>>
>>>>> Hi Aditya,
>>>>>
>>>>>
>>>>> More generally speaking, coercion methods should be defined in a place
>>>>> that is "as close as possible" to the "from" or "to" classes rather
>>>>> than
>>>>> in a package that doesn't own any of the 2 classes involved.
>>>>> Is this what you have in mind for this coercion?
>>>>>
>>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
>>>>> GRanges object with 7 ranges and 0 metadata columns:
>>>>>          seqnames     ranges strand
>>>>>             <Rle>  <IRanges>  <Rle>
>>>>>     chrI     chrI 1-15072423      *
>>>>>    chrII    chrII 1-15279345      *
>>>>>   chrIII   chrIII 1-13783700      *
>>>>>    chrIV    chrIV 1-17493793      *
>>>>>     chrV     chrV 1-20924149      *
>>>>>     chrX     chrX 1-17718866      *
>>>>>     chrM     chrM    1-13794      *
>>>>>   -------
>>>>>   seqinfo: 7 sequences (1 circular) from ce10 genome
>>>>>
>>>>> Thanks,
>>>>> H.
>>>>>
>>>>>
>>>>> On 9/6/19 03:39, Bhagwat, Aditya wrote:
>>>>>   > Dear Bioc devel,
>>>>>   >
>>>>>   > Is it possible to import the BSgenome class without attaching
>>>>>   > BiocGenerics (to keep a clean namespace during the development of
>>>>>   > multicrispr <https://gitlab.gwdg.de/loosolab/software/multicrispr>)?
>>>>>   >
>>>>>   > BSgenome <- methods::getClassDef('BSgenome', package = 'BSgenome')
>>>>>   >
>>>>>   > (Posted earlier on BioC support
>>>>>   > <https://support.bioconductor.org/p/124442/>
>>>>>   > and redirected here following Martin's suggestion)
>>>>>   >
>>>>>   > Thank you :-)
>>>>>   >
>>>>>   > Aditya
>>>>>   >
>>>>>   > _______________________________________________
>>>>>   > Bioc-devel using r-project.org mailing list
>>>>>   > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>
>>>>> --
>>>>> Hervé Pagès
>>>>>
>>>>> Program in Computational Biology
>>>>> Division of Public Health Sciences
>>>>> Fred Hutchinson Cancer Research Center
>>>>> 1100 Fairview Ave. N, M1-B514
>>>>> P.O. Box 19024
>>>>> Seattle, WA 98109-1024
>>>>>
>>>>> E-mail: hpages using fredhutch.org
>>>>> Phone: (206) 667-5791
>>>>> Fax: (206) 667-1319
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Michael Lawrence
>>> Scientist, Bioinformatics and Computational Biology
>>> Genentech, A Member of the Roche Group
>>> Office +1 (650) 225-7760
>>> michafla using gene.com
>>
>>
>>
>



