[Bioc-devel] Import BSgenome class without attaching BiocGenerics (and others)?

Michael Lawrence lawrence.michael using gene.com
Wed Sep 11 22:27:51 CEST 2019


Good call. Didn't know about seqlevelsInUse().
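
For anyone following along, here is a minimal sketch of the difference between seqlevels() and seqlevelsInUse() discussed below. It uses toy objects rather than a real BSgenome: the sequence names, lengths, and the "toy" genome label are made up for illustration, and `si` merely stands in for seqinfo(bsgenome).

```r
library(GenomicRanges)  # also attaches GenomeInfoDb

# Hypothetical stand-in for seqinfo(bsgenome): three sequences, made-up lengths.
si <- Seqinfo(seqnames   = c("chr1", "chr2", "chrUn_random"),
              seqlengths = c(1000L, 800L, 50L),
              genome     = "toy")

# Toy target sites: all ranges sit on chr1, but the object carries every seqlevel.
gr  <- GRanges(seqnames = c("chr1", "chr1"),
               ranges   = IRanges(start = c(10, 200), width = 20),
               seqinfo  = si)
grl <- GRangesList(site1 = gr[1], site2 = gr[2])

seqlevels(grl)       # every declared level, whether or not it has ranges
seqlevelsInUse(grl)  # only the levels that actually appear in seqnames(grl)

# Chromosome-wide ranges for all levels versus only those in use.
all_chroms  <- as(si[seqlevels(grl)], "GRanges")
used_chroms <- as(si[seqlevelsInUse(grl)], "GRanges")
```

With these toy inputs, `used_chroms` should hold a single range spanning chr1, which is the kind of object one would hand to a plotting function that expects chromosome ranges.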

On Wed, Sep 11, 2019 at 8:29 AM Pages, Herve <hpages using fredhutch.org> wrote:
>
> Or more accurately:
>
>    as(seqinfo(bsgenome)[seqlevelsInUse(grl)], "GRanges")
>
> since not all seqlevels are necessarily "in use" (i.e. not necessarily
> represented in seqnames(grl)).
>
> H.
>
> On 9/11/19 08:26, Hervé Pagès wrote:
> > The unique seqnames is what we call the seqlevels. So just:
> >
> >    as(seqinfo(bsgenome)[seqlevels(grl)], "GRanges")
> >
> > H.
> >
> > On 9/11/19 07:42, Michael Lawrence wrote:
> >> So why not just do:
> >>
> >> as(seqinfo(bsgenome)[unique(unlist(seqnames(grl)))], "GRanges")
> >>
> >> Michael
> >>
> >> On Wed, Sep 11, 2019 at 5:55 AM Bhagwat, Aditya
> >> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
> >>>
> >>> Thanks Michael,
> >>>
> >>> The important detail is that I want to plot only the relevant
> >>> chromosomes:
> >>>
> >>>      relevant_chromosomes <- GenomeInfoDb::seqnames(grangeslist)  %>%
> >>>                              S4Vectors::runValue() %>%
> >>>                              Reduce(union, .) %>%
> >>>                              unique()
> >>>
> >>>      genomeranges <- GenomeInfoDb::seqinfo(grangeslist) %>%
> >>>                      as('GRanges') %>%
> >>>                     (function(gr){
> >>>                         gr[as.character(GenomeInfoDb::seqnames(gr)) %in%
> >>>                            relevant_chromosomes]
> >>>                     })
> >>>
> >>>      kp <- karyoploteR::plotKaryotype(genomeranges)
> >>>      # grangeslist contains crispr target sites
> >>>      karyoploteR::kpPlotRegions(kp, grangeslist)
> >>>
> >>>
> >>> And, this process required as("GRanges")
> >>>
> >>>      #' Convert BSgenome into GRanges
> >>>      #' @param from BSgenome, e.g.
> >>>      #'   BSgenome.Mmusculus.UCSC.mm10::Mmusculus
> >>>      #' @examples
> >>>      #' require(magrittr)
> >>>      #' BSgenome.Mmusculus.UCSC.mm10::BSgenome.Mmusculus.UCSC.mm10 %>%
> >>>      #' as('GRanges')
> >>>      #' @importClassesFrom BSgenome BSgenome
> >>>      #' @export
> >>>      methods::setAs( "BSgenome",
> >>>                      "GRanges",
> >>>                      function(from)  from %>%
> >>>                                      GenomeInfoDb::seqinfo() %>%
> >>>                                      as('GRanges'))
> >>>
> >>> Thank you for the feedback,
> >>>
> >>> Aditya
> >>>
> >>> ________________________________________
> >>> From: Michael Lawrence [lawrence.michael using gene.com]
> >>> Sent: Wednesday, September 11, 2019 2:31 PM
> >>> To: Bhagwat, Aditya
> >>> Cc: Pages, Herve; bioc-devel using r-project.org
> >>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
> >>> BiocGenerics (and others)?
> >>>
> >>> I'm pretty surprised that the karyoploteR package does not accept a
> >>> Seqinfo since it is plotting chromosomes. But again, please consider
> >>> just doing as(seqinfo(bsgenome), "GRanges").
> >>>
> >>> On Wed, Sep 11, 2019 at 3:59 AM Bhagwat, Aditya
> >>> <Aditya.Bhagwat using mpi-bn.mpg.de> wrote:
> >>>>
> >>>> Hi Herve,
> >>>>
> >>>> Thank you for your responses.
> >>>> From your response, it is clear that the vcountPDict use case does
> >>>> not need a BSgenome -> GRanges coercer.
> >>>>
> >>>> The karyoploteR use case still requires it, though, to allow
> >>>> plotting of only the chromosomal BSgenome portions:
> >>>>
> >>>>      chromranges <- as(bsgenome, "GRanges")
> >>>>      kp <- karyoploteR::plotKaryotype(chromranges)
> >>>>      karyoploteR::kpPlotRegions(kp, crispr_target_sites)
> >>>>
> >>>> Or do you see any alternative for this purpose too?
> >>>>
> >>>> Aditya
> >>>>
> >>>> ________________________________________
> >>>> From: Pages, Herve [hpages using fredhutch.org]
> >>>> Sent: Wednesday, September 11, 2019 12:24 PM
> >>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
> >>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
> >>>> BiocGenerics (and others)?
> >>>>
> >>>> Hi Aditya,
> >>>>
> >>>> On 9/11/19 01:31, Bhagwat, Aditya wrote:
> >>>>> Hi Herve,
> >>>>>
> >>>>>
> >>>>>   > It feels that a coercion method from BSgenome to GRanges should
> >>>>> rather be defined in the BSgenome package itself.
> >>>>>
> >>>>> :-)
> >>>>>
> >>>>>
> >>>>>   > Patch/PR welcome on GitHub.
> >>>>>
> >>>>> Okay. What pull/fork/check/branch protocol should be followed?
> >>>>>
> >>>>>
> >>>>>   > Is this what you have in mind for this coercion?
> >>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
> >>>>>
> >>>>> Yes.
> >>>>>
> >>>>> Perhaps it is also useful to share the wider context, so that you and
> >>>>> others can give feedback on the software design.
> >>>>> I wanted to subset a BSgenome (without the _random or _unassigned
> >>>>> sequences), but Lori explained that this is not possible
> >>>>> <https://support.bioconductor.org/p/124367>.
> >>>>>
> >>>>> Instead, Lori suggested coercing a BSgenome into a GRanges
> >>>>> <https://support.bioconductor.org/p/123489>,
> >>>>> which is a useful solution, but for which no exported S4 method
> >>>>> currently exists
> >>>>> <https://support.bioconductor.org/p/124416>.
> >>>>>
> >>>>> So I defined an S4 coercer in my multicrispr package, making sure to
> >>>>> properly import the BSgenome class
> >>>>> <https://support.bioconductor.org/p/124442>.
> >>>>>
> >>>>> Then, after coercing a BSgenome into a GRanges, I can extract the
> >>>>> chromosomes, after properly importing IRanges::`%in%`
> >>>>> <https://support.bioconductor.org/p/124367>
> >>>>>
> >>>>
> >>>> Looks like you don't need to coerce the BSgenome object to GRanges. See
> >>>> https://support.bioconductor.org/p/123489/#124581
> >>>>
> >>>>
> >>>> H.
> >>>>
> >>>>> Which I can then pass on to karyoploteR
> >>>>> <https://support.bioconductor.org/p/124328>
> >>>>> for genome-wide plots of crispr target sites.
> >>>>>
> >>>>> This is also a good moment to say thank you to all of you who helped
> >>>>> me out; it helps me make multicrispr fit nicely into the BioC
> >>>>> ecosystem.
> >>>>>
> >>>>> Speaking of BioC design philosophy, can any of you suggest concise and
> >>>>> to-the-point reading material to deepen my understanding of the core
> >>>>> BioC software design philosophy?
> >>>>> I am trying to understand that better (which was the context for
> >>>>> recently asking why there are three Vector -> data.frame coercers in
> >>>>> S4Vectors <https://support.bioconductor.org/p/124491>).
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Aditya
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> ________________________________________
> >>>>> From: Pages, Herve [hpages using fredhutch.org]
> >>>>> Sent: Tuesday, September 10, 2019 6:45 PM
> >>>>> To: Bhagwat, Aditya; bioc-devel using r-project.org
> >>>>> Subject: Re: [Bioc-devel] Import BSgenome class without attaching
> >>>>> BiocGenerics (and others)?
> >>>>>
> >>>>> Hi Aditya,
> >>>>>
> >>>>>
> >>>>> More generally speaking, coercion methods should be defined in a place
> >>>>> that is "as close as possible" to the "from" or "to" classes rather
> >>>>> than in a package that doesn't own either of the two classes involved.
> >>>>> Is this what you have in mind for this coercion?
> >>>>>
> >>>>>   > as(seqinfo(BSgenome.Celegans.UCSC.ce10), "GRanges")
> >>>>> GRanges object with 7 ranges and 0 metadata columns:
> >>>>>          seqnames     ranges strand
> >>>>>             <Rle>  <IRanges>  <Rle>
> >>>>>     chrI     chrI 1-15072423      *
> >>>>>    chrII    chrII 1-15279345      *
> >>>>>   chrIII   chrIII 1-13783700      *
> >>>>>    chrIV    chrIV 1-17493793      *
> >>>>>     chrV     chrV 1-20924149      *
> >>>>>     chrX     chrX 1-17718866      *
> >>>>>     chrM     chrM    1-13794      *
> >>>>>   -------
> >>>>>   seqinfo: 7 sequences (1 circular) from ce10 genome
> >>>>>
> >>>>> Thanks,
> >>>>> H.
> >>>>>
> >>>>>
> >>>>> On 9/6/19 03:39, Bhagwat, Aditya wrote:
> >>>>>   > Dear Bioc devel,
> >>>>>   >
> >>>>>   > Is it possible to import the BSgenome class without attaching
> >>>>>   > BiocGenerics (to keep a clean namespace during the development of
> >>>>>   > multicrispr
> >>>>>   > <https://gitlab.gwdg.de/loosolab/software/multicrispr>)?
> >>>>>   >
> >>>>>   > BSgenome <- methods::getClassDef('BSgenome', package = 'BSgenome')
> >>>>>   >
> >>>>>   > (Posted earlier on BioC support
> >>>>>   > <https://support.bioconductor.org/p/124442/>
> >>>>>   > and redirected here following Martin's suggestion)
> >>>>>   >
> >>>>>   > Thank you :-)
> >>>>>   >
> >>>>>   > Aditya
> >>>>>   >
> >>>>>   > [[alternative HTML version deleted]]
> >>>>>   >
> >>>>>   > _______________________________________________
> >>>>>   > Bioc-devel using r-project.org mailing list
> >>>>>   > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >>>>>   >
> >>>>>
> >>>>> --
> >>>>> Hervé Pagès
> >>>>>
> >>>>> Program in Computational Biology
> >>>>> Division of Public Health Sciences
> >>>>> Fred Hutchinson Cancer Research Center
> >>>>> 1100 Fairview Ave. N, M1-B514
> >>>>> P.O. Box 19024
> >>>>> Seattle, WA 98109-1024
> >>>>>
> >>>>> E-mail: hpages using fredhutch.org
> >>>>> Phone: (206) 667-5791
> >>>>> Fax: (206) 667-1319
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Michael Lawrence
> >>> Scientist, Bioinformatics and Computational Biology
> >>> Genentech, A Member of the Roche Group
> >>> Office +1 (650) 225-7760
> >>> michafla using gene.com
> >>>
> >>> Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube
> >>
> >>
> >>
> >
>





