[Bioc-devel] Import BSgenome class without attaching BiocGenerics (and others)?

Pages, Herve hpages using fredhutch.org
Wed Sep 11 17:29:47 CEST 2019


Or more accurately:

   as(seqinfo(bsgenome)[seqlevelsInUse(grl)], "GRanges")

since not all seqlevels are necessarily "in use" (i.e. not necessarily 
represented in seqnames(grl)).
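
To illustrate the difference, here is a minimal sketch on toy data. The
Seqinfo below is a made-up stand-in for seqinfo(bsgenome), and the names
si, gr1 and gr2 are hypothetical, chosen only for this example:

```r
library(GenomicRanges)  # also attaches GenomeInfoDb and IRanges

# Toy stand-in for seqinfo(bsgenome): three sequences with invented lengths.
si <- Seqinfo(seqnames   = c("chr1", "chr2", "chrM"),
              seqlengths = c(1000, 800, 100),
              isCircular = c(FALSE, FALSE, TRUE),
              genome     = "toy")

# A GRangesList whose ranges all sit on chr1, although every element
# carries the full set of seqlevels through its seqinfo.
gr1 <- GRanges("chr1", IRanges(10, 20), seqinfo = si)
gr2 <- GRanges("chr1", IRanges(c(30, 50), width = 5), seqinfo = si)
grl <- GRangesList(a = gr1, b = gr2)

seqlevels(grl)       # every level known to the object
seqlevelsInUse(grl)  # only the levels that actually occur in seqnames(grl)

# Whole-chromosome ranges restricted to the levels in use.
genomeranges <- as(si[seqlevelsInUse(grl)], "GRanges")
```

With a real BSgenome object, si would simply be replaced by
seqinfo(bsgenome).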

H.

On 9/11/19 08:26, Hervé Pagès wrote:
> The unique seqnames are what we call the seqlevels. So just:
> 
>    as(seqinfo(bsgenome)[seqlevels(grl)], "GRanges")
> 
> H.
> 
> On 9/11/19 07:42, Michael Lawrence wrote:
>> So why not just do:
>>
>> as(seqinfo(bsgenome)[unique(unlist(seqnames(grl)))], "GRanges")
>>
>> Michael
>>
>> On Wed, Sep 11, 2019 at 5:55 AM Bhagwat, Aditya
>> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
>>>
>>> Thanks Michael,
>>>
>>> The important detail is that I want to plot the relevant chromosomes only:
>>>
>>>      relevant_chromosomes <- GenomeInfoDb::seqnames(grangeslist)  %>%
>>>                              S4Vectors::runValue() %>%
>>>                              Reduce(union, .) %>%
>>>                              unique()
>>>
>>>      genomeranges <- GenomeInfoDb::seqinfo(grangeslist) %>%
>>>                      as('GRanges') %>%
>>>                     (function(gr){
>>>                         gr[as.character(GenomeInfoDb::seqnames(gr)) %in%
>>>                            relevant_chromosomes]
>>>                     })
>>>
>>>      kp <- karyoploteR::plotKaryotype(genomeranges)
>>>      # grangeslist contains crispr target sites
>>>      karyoploteR::kpPlotRegions(kp, grangeslist)
>>>
>>>
>>> And, this process required as("GRanges")
>>>
>>>      #' Convert BSgenome into GRanges
>>>      #' @param from BSgenome, e.g. BSgenome.Mmusculus.UCSC.mm10::Mmusculus
>>>      #' @examples
>>>      #' require(magrittr)
>>>      #' BSgenome.Mmusculus.UCSC.mm10::BSgenome.Mmusculus.UCSC.mm10 %>%
>>>      #' as('GRanges')
>>>      #' @importClassesFrom BSgenome BSgenome
>>>      #' @export
>>>      methods::setAs( "BSgenome",
>>>                      "GRanges",
>>>                      function(from)  from %>%
>>>                                      GenomeInfoDb::seqinfo() %>%
>>>                                      as('GRanges'))
>>>
>>> Thank you for the feedback,
>>>
>>> Aditya
>>>
>>> ________________________________________
>>> From: Michael Lawrence [lawrence.michael using gene.com]
>>> Sent: Wednesday, September 11, 2019 2:31 PM
>>> To: Bhagwat, Aditya
>>> Cc: Pages, Herve; bioc-devel using r-project.org
>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching 
>>> BiocGenerics (and others)?
>>>
>>> I'm pretty surprised that the karyoploteR package does not accept a
>>> Seqinfo since it is plotting chromosomes. But again, please consider
>>> just doing as(seqinfo(bsgenome), "GRanges").
>>>
>>> On Wed, Sep 11, 2019 at 3:59 AM Bhagwat, Aditya
>>> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
>>>>
>>>> Hi Herve,
>>>>
>>>> Thank you for your responses.
>>>> From your response, it is clear that the vcountPDict use case does
>>>> not need a BSgenome -> GRanges coercer.
>>>>
>>>> The karyoploteR use case still requires it, though, to allow 
>>>> plotting of only the chromosomal BSgenome portions:
>>>>
>>>>      chromranges <- as(bsgenome, "GRanges")
>>>>      kp <- karyoploteR::plotKaryotype(chromranges)
>>>>      karyoploteR::kpPlotRegions(kp, crispr_target_sites)
>>>>
>>>> Or do you see any alternative for this purpose too?
>>>>
>>>> Aditya
>>>>
>>>> ________________________________________
>>>> From: Pages, Herve [hpages using fredhutch.org]
>>>> Sent: Wednesday, September 11, 2019 12:24 PM
>>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
>>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching 
>>>> BiocGenerics (and others)?
>>>>
>>>> Hi Aditya,
>>>>
>>>> On 9/11/19 01:31, Bhagwat, Aditya wrote:
>>>>> Hi Herve,
>>>>>
>>>>>
>>>>>   > It feels that a coercion method from BSgenome to GRanges should
>>>>>   > rather be defined in the BSgenome package itself.
>>>>>
>>>>> :-)
>>>>>
>>>>>
>>>>>   > Patch/PR welcome on GitHub.
>>>>>
>>>>> Owkies. What pull/fork/check/branch protocol should be followed?
>>>>>
>>>>>
>>>>>   > Is this what you have in mind for this coercion?
>>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
>>>>>
>>>>> Yes.
>>>>>
>>>>> Perhaps also useful to share the wider context, allowing your and
>>>>> others' feedback for improved software design.
>>>>> I wanted to subset a BSgenome
>>>>> <https://support.bioconductor.org/p/124367>
>>>>> (without the _random or _unassigned), but Lori explained this is not
>>>>> possible
>>>>> <https://support.bioconductor.org/p/124367>.
>>>>>
>>>>> Instead Lori suggested to coerce a BSgenome into a GRanges
>>>>> <https://support.bioconductor.org/p/123489>,
>>>>> which is a useful solution, but for which currently no exported S4
>>>>> method exists
>>>>> <https://support.bioconductor.org/p/124416>.
>>>>>
>>>>> So I defined an S4 coercer in my multicrispr package, making sure to
>>>>> properly import the BSgenome class
>>>>> <https://support.bioconductor.org/p/124442>.
>>>>>
>>>>> Then, after coercing a BSgenome into a GRanges, I can extract the
>>>>> chromosomes, after properly importing IRanges::`%in%`
>>>>> <https://support.bioconductor.org/p/124367>
>>>>>
>>>>
>>>> Looks like you don't need to coerce the BSgenome object to GRanges. See
>>>> https://support.bioconductor.org/p/123489/#124581
>>>>
>>>>
>>>> H.
>>>>
>>>>> Which I can then send on to karyoploteR
>>>>> <https://support.bioconductor.org/p/124328>,
>>>>> for genome-wide plots of crispr target sites.
>>>>>
>>>>> A good moment also to say thank you to all of you who helped me out;
>>>>> it helps me make multicrispr fit nicely into the BioC ecosystem.
>>>>>
>>>>> Speaking of BioC design philosophy, can any of you suggest concise and
>>>>> to-the-point reading material to deepen my understanding of the core
>>>>> BioC software design philosophy?
>>>>> I am trying to understand that better (which was the context for
>>>>> asking recently why there are three Vector -> data.frame coercers in
>>>>> S4Vectors <https://support.bioconductor.org/p/124491>).
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Aditya
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: Pages, Herve [hpages using fredhutch.org]
>>>>> Sent: Tuesday, September 10, 2019 6:45 PM
>>>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
>>>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
>>>>> BiocGenerics (and others)?
>>>>>
>>>>> Hi Aditya,
>>>>>
>>>>>
>>>>> More generally speaking, coercion methods should be defined in a place
>>>>> that is "as close as possible" to the "from" or "to" classes rather
>>>>> than in a package that doesn't own either of the two classes involved.
>>>>> Is this what you have in mind for this coercion?
>>>>>
>>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
>>>>> GRanges object with 7 ranges and 0 metadata columns:
>>>>>          seqnames     ranges strand
>>>>>             <Rle>  <IRanges>  <Rle>
>>>>>     chrI     chrI 1-15072423      *
>>>>>    chrII    chrII 1-15279345      *
>>>>>   chrIII   chrIII 1-13783700      *
>>>>>    chrIV    chrIV 1-17493793      *
>>>>>     chrV     chrV 1-20924149      *
>>>>>     chrX     chrX 1-17718866      *
>>>>>     chrM     chrM    1-13794      *
>>>>>   -------
>>>>>   seqinfo: 7 sequences (1 circular) from ce10 genome
>>>>>
>>>>> Thanks,
>>>>> H.
>>>>>
>>>>>
>>>>> On 9/6/19 03:39, Bhagwat, Aditya wrote:
>>>>>   > Dear Bioc devel,
>>>>>   >
>>>>>   > Is it possible to import the BSgenome class without attaching
>>>>>   > BiocGenerics (to keep a clean namespace during the development of
>>>>>   > multicrispr <https://gitlab.gwdg.de/loosolab/software/multicrispr>)?
>>>>>   >
>>>>>   > BSgenome <- methods::getClassDef('BSgenome', package = 'BSgenome')
>>>>>   >
>>>>>   > (Posted earlier on BioC support
>>>>>   > <https://support.bioconductor.org/p/124442/>
>>>>>   > and redirected here following Martin's suggestion)
>>>>>   >
>>>>>   > Thank you :-)
>>>>>   >
>>>>>   > Aditya
>>>>>   >
>>>>>   > _______________________________________________
>>>>>   > Bioc-devel using r-project.org mailing list
>>>>>   > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>   >
>>>>>
>>>>> -- 
>>>>> Hervé Pagès
>>>>>
>>>>> Program in Computational Biology
>>>>> Division of Public Health Sciences
>>>>> Fred Hutchinson Cancer Research Center
>>>>> 1100 Fairview Ave. N, M1-B514
>>>>> P.O. Box 19024
>>>>> Seattle, WA 98109-1024
>>>>>
>>>>> E-mail: hpages using fredhutch.org
>>>>> Phone: (206) 667-5791
>>>>> Fax: (206) 667-1319
>>>>
>>>>
>>>
>>>
>>>
>>> -- 
>>> Michael Lawrence
>>> Scientist, Bioinformatics and Computational Biology
>>> Genentech, A Member of the Roche Group
>>> Office +1 (650) 225-7760
>>> michafla using gene.com
>>>
>>
>>
>>
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

