[Bioc-devel] should genome() be so complicated?/add genome report to GRanges show method
Michael Lawrence
lawrence.michael at gene.com
Mon Sep 8 20:41:09 CEST 2014
I might have requested the genome annotation, but I'm pretty sure it wasn't
me who decide on tracking it on a per-sequence basis. I could imagine use
cases for that though, e.g., when diagnosing sequencing contamination (like
human vs. mouse). But most other tools and file formats expect a single
genome per "track", so, for example, rtracklayer has an internal function
singleGenome() to take care of this.
On Mon, Sep 8, 2014 at 10:50 AM, Hervé Pagès <hpages at fhcrc.org> wrote:
> Hi Vince,
>
> Yes it would make sense to have the "show" method report the genome
> when genome(x) contains a unique non-NA value. I think the main
> use case for having the genome defined at the sequence level instead
> of the whole object level is metagenomics. Maybe Michael has some other
> good use cases to share since IIRC he requested the addition of the
> genome field a couple of years ago and made the case for having it
> defined at the sequence level.
>
> Cheers,
> H.
>
>
> On 09/08/2014 07:21 AM, Vincent Carey wrote:
>
>> For GRanges x, my naive expectation is that genome(x) returns a length-
>>
>> one tag identifying the genome to which chromosomal coordinates
>>
>> correspond. The genome() method seems to have sequence-specific
>>
>> semantics, which makes sense, but when we identify sequence
>>
>> with chromosome, it seems too complicated. Is there a use case for
>>
>> a GRanges with sequences from several different genomes?
>>
>>
>> One reason I am inquiring is that I feel it would be nice to have the
>> GRanges show() method report, prominently, the genome in use (or NA
>>
>> if unspecified). This could be accomplished by reporting
>> unique(genome(x)), and perhaps that would be satisfactory.
>>
>> after example(genome) :
>>
>> seqinfo(txdb)
>>>
>>
>> Seqinfo of length 15
>>
>> seqnames seqlengths isCircular genome
>>
>> CH2L 23011544 FALSE dm3
>>
>> CH2R 21146708 FALSE dm3
>>
>> CH3L 24543557 FALSE dm3
>>
>> CH3R 27905053 FALSE dm3
>>
>> CH4 1351857 FALSE dm3
>>
>> ... ... ... ...
>>
>> CH3LHet 2555491 FALSE dm3
>>
>> CH3RHet 2517507 FALSE dm3
>>
>> CHXHet 204112 FALSE dm3
>>
>> CHYHet 347038 FALSE dm3
>>
>> CHUextra 29004656 FALSE dm3
>>
>> genome(seqinfo(txdb))
>>>
>>
>> CH2L CH2R CH3L CH3R CH4 CHX CHU M
>>
>> "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3"
>>
>> CH2LHet CH2RHet CH3LHet CH3RHet CHXHet CHYHet CHUextra
>>
>> "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3"
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
>
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
[[alternative HTML version deleted]]
More information about the Bioc-devel
mailing list