[Bioc-devel] should genome() be so complicated?/add genome report to GRanges show method
Peter Hickey
hickey at wehi.EDU.AU
Mon Sep 8 23:28:48 CEST 2014
Just a vote for still allowing multiple genomes in a Seqinfo object (and hence in a GRanges object). My use case is bisulfite-sequencing experiments, where a lambda phage genome is often spiked in alongside the genome of interest (human or mouse). It's often useful to keep all data from a single library together in the same object but process it according to genome(x) for each seqlevel.
FWIW, I like Vincent's proposal of selectSome(unique(genome(x))) in the show method.
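To make the use case concrete, here is a minimal sketch rather than a definitive recipe. The seqnames, coordinates, and the "hg19"/"lambda" genome labels are made up for illustration. It builds a GRanges whose Seqinfo carries two genomes, shows what unique(genome(x)) would give a show method, and splits the ranges by genome:

```r
library(GenomicRanges)  # Seqinfo(), GRanges() and IRanges() come in via its dependencies

# Hypothetical library: two human chromosomes plus a lambda phage spike-in.
si <- Seqinfo(seqnames = c("chr1", "chr2", "lambda"),
              genome   = c("hg19", "hg19", "lambda"))

gr <- GRanges(seqnames = c("chr1", "chr2", "lambda", "lambda"),
              ranges   = IRanges(start = c(100, 200, 10, 50), width = 1),
              seqinfo  = si)

# What a show method could report: the distinct genomes present in the object.
unique(genome(gr))

# genome(gr) is named by seqlevel, so index it by each range's seqname
# to get one genome label per range, then split on that label.
genome_per_range <- unname(genome(gr)[as.character(seqnames(gr))])
by_genome <- split(gr, genome_per_range)
names(by_genome)
```

Each element of by_genome can then be processed with whatever is appropriate for that genome, while the original object keeps the whole library together.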
Cheers,
Pete
> I might have requested the genome annotation, but I'm pretty sure it wasn't
> me who decided on tracking it on a per-sequence basis. I could imagine use
> cases for that though, e.g., when diagnosing sequencing contamination (like
> human vs. mouse). But most other tools and file formats expect a single
> genome per "track", so, for example, rtracklayer has an internal function
> singleGenome() to take care of this.
>
> On Mon, Sep 8, 2014 at 10:50 AM, Hervé Pagès <hpages at fhcrc.org> wrote:
>
>> Hi Vince,
>>
>> Yes it would make sense to have the "show" method report the genome
>> when genome(x) contains a unique non-NA value. I think the main
>> use case for having the genome defined at the sequence level instead
>> of the whole object level is metagenomics. Maybe Michael has some other
>> good use cases to share since IIRC he requested the addition of the
>> genome field a couple of years ago and made the case for having it
>> defined at the sequence level.
>>
>> Cheers,
>> H.
>>
>>
>> On 09/08/2014 07:21 AM, Vincent Carey wrote:
>>
>>> For GRanges x, my naive expectation is that genome(x) returns a length-one
>>> tag identifying the genome to which chromosomal coordinates correspond.
>>> The genome() method seems to have sequence-specific semantics, which makes
>>> sense, but when we identify sequence with chromosome, it seems too
>>> complicated. Is there a use case for a GRanges with sequences from several
>>> different genomes?
>>>
>>> One reason I am inquiring is that I feel it would be nice to have the
>>> GRanges show() method report, prominently, the genome in use (or NA
>>> if unspecified). This could be accomplished by reporting
>>> unique(genome(x)), and perhaps that would be satisfactory.
>>>
>>> After example(genome):
>>>
>>> > seqinfo(txdb)
>>> Seqinfo of length 15
>>> seqnames seqlengths isCircular genome
>>> CH2L       23011544      FALSE    dm3
>>> CH2R       21146708      FALSE    dm3
>>> CH3L       24543557      FALSE    dm3
>>> CH3R       27905053      FALSE    dm3
>>> CH4         1351857      FALSE    dm3
>>> ...             ...        ...    ...
>>> CH3LHet     2555491      FALSE    dm3
>>> CH3RHet     2517507      FALSE    dm3
>>> CHXHet       204112      FALSE    dm3
>>> CHYHet       347038      FALSE    dm3
>>> CHUextra   29004656      FALSE    dm3
>>>
>>> > genome(seqinfo(txdb))
>>>     CH2L     CH2R     CH3L     CH3R      CH4      CHX      CHU        M
>>>    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"
>>>  CH2LHet  CH2RHet  CH3LHet  CH3RHet   CHXHet   CHYHet CHUextra
>>>    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fhcrc.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
>>
--------------------------------
Peter Hickey,
PhD Student/Research Assistant,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Ph: +613 9345 2324
hickey at wehi.edu.au
http://www.wehi.edu.au