[Bioc-devel] Problem with seqnames of TwoBitFile from AnnotationHub

Michael Lawrence lawrence.michael at gene.com
Fri Jan 8 14:40:19 CET 2016


This is perhaps something that could be handled when populating the
hub, but I'm not sure how rtracklayer could automatically derive the
chromosome names.
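
As a rough user-side workaround, here is a minimal sketch in base R. It
assumes the chromosome name is always the first whitespace-separated token
of the header string, which holds for the Ensembl headers quoted in this
thread but is not guaranteed in general. The helper name short_seqnames is
hypothetical, and the sketch only builds a lookup table; it does not modify
the TwoBitFile itself.

```r
# Two header strings copied from the seqnames output quoted in this thread.
headers <- c(
  "1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF",
  "10 dna:chromosome chromosome:GRCh38:10:1:133797422:1 REF"
)

# Hypothetical helper: drop everything from the first whitespace onwards,
# keeping only the leading token (assumed to be the chromosome name).
short_seqnames <- function(x) sub("\\s.*$", "", x)

chrom <- short_seqnames(headers)

# Lookup table: short chromosome name -> full seqname stored in the file.
full_name <- setNames(headers, chrom)
```

With such a table, full_name[["10"]] gives the long seqname to use when
querying the file, so the short names can be used everywhere else.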

On Fri, Jan 8, 2016 at 2:37 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote:
> dear all,
>
> I just ran into a problem with a TwoBitFile I fetched from AnnotationHub. I was fetching a TwoBitFile with the genomic DNA sequence, as provided by Ensembl:
>
>> library(AnnotationHub)
>> ah <- AnnotationHub()
>> tbf <- ah[["AH50068"]]
>
>> head(seqnames(seqinfo(tbf)))
> [1] "1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF"
> [2] "10 dna:chromosome chromosome:GRCh38:10:1:133797422:1 REF"
> [3] "11 dna:chromosome chromosome:GRCh38:11:1:135086622:1 REF"
> [4] "12 dna:chromosome chromosome:GRCh38:12:1:133275309:1 REF"
> [5] "13 dna:chromosome chromosome:GRCh38:13:1:114364328:1 REF"
> [6] "14 dna:chromosome chromosome:GRCh38:14:1:107043718:1 REF"
>
> It would be nice if the seqnames were just the chromosome names and not the whole string from the FASTA file header. Is there a way I could fix the file myself, or is this something that should be fixed in the rtracklayer or AnnotationHub package when the TwoBitFile is created?
>
> thanks, jo
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


