[Bioc-devel] plyranges group_by

Michael Lawrence
Wed Oct 16 13:54:59 CEST 2019


Just a note that in this particular case, selfmatch(annotatedsrf) would be
a fast way to generate a grouping vector, like
plyranges::group_by(annotatedsrf, selfmatch(annotatedsrf)).
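
A minimal sketch of what such a grouping vector looks like, assuming a small made-up GRanges (the object `gr` and its gene ids are hypothetical stand-ins for annotatedsrf, not data from this thread):

```r
library(GenomicRanges)

# Hypothetical toy data: rows 1-2 share one range and rows 3-4 share another,
# as happens when a left overlap join matches one range to several genes.
gr <- GRanges(
    seqnames = c("chr1", "chr1", "chr2", "chr2", "chr2"),
    ranges   = IRanges(start = c(100, 100, 500, 500, 900), width = 50),
    strand   = c("+", "+", "-", "-", "*"),
    gene_id  = c("geneA", "geneB", "geneC", "geneD", NA)
)

# selfmatch() gives, for each range, the index of its first occurrence.
# Only seqnames, start, end and strand are compared, not the metadata columns.
grouping <- selfmatch(gr)

# Identical ranges share a group id, so gene ids can be collapsed per group.
collapsed <- tapply(
    gr$gene_id, grouping,
    function(ids) paste(ids[!is.na(ids)], collapse = ";")
)
collapsed
```

The resulting integer vector is computed in one vectorized pass, so it may be a cheaper grouping key than the four separate columns.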

Michael

On Wed, Oct 16, 2019 at 2:48 AM Bhagwat, Aditya <
Aditya.Bhagwat using mpi-bn.mpg.de> wrote:

> Hi Stuart, Michael,
>
> Your plyranges package is really cool - now I am using it for left joining
> GRanges (I am facing a minor issue there
> <https://support.bioconductor.org/p/125623/>, but that is not the topic
> of this email - I have been asked by Lori not to double-post :-)).
>
> This email is about the plyranges functionality for grouping GRanges.
> That is cool, but I found it to be not so performant for large numbers of
> ranges.
> My R session hangs when I do:
>
> bedfile <- paste0('https://gitlab.gwdg.de/loosolab/software/multicrispr/wikis',
>                   '/uploads/a51e98516c1e6b71441f5b5a5f741fa1/SRF.bed')
> srfranges <- rtracklayer::import.bed(bedfile, genome = 'mm10')
> txdb <- TxDb.Mmusculus.UCSC.mm10.ensGene::TxDb.Mmusculus.UCSC.mm10.ensGene
> generanges <- GenomicFeatures::genes(txdb)
> annotatedsrf <- plyranges::join_overlap_left(srfranges, generanges)
> plyranges::group_by(annotatedsrf, seqnames, start, end, strand)
>
> For my purposes, I worked around it by performing the group-by in data.table:
>
> data.table::as.data.table(annotatedsrf)[
>     !is.na(gene_id),
>     gene_id := paste0(gene_id, collapse = ';'),
>     by = c('seqnames', 'start', 'end', 'strand')]
>
> And I was wondering, in general, whether it would be useful to have a
> data.table-based backend for plyranges::group_by().
> And whether all of this is actually a non-issue due to my improper use of
> plyranges::group_by.
>
> Thank you for your feedback :-)
>
> Aditya
>
>
>

-- 
Michael Lawrence
Scientist, Bioinformatics and Computational Biology
Genentech, A Member of the Roche Group
Office +1 (650) 225-7760
michafla using gene.com
