[BioC] How to convert from IRanges(List) to Rle(List)
Nicolas Delhomme
delhomme at embl.de
Sun Apr 8 14:17:44 CEST 2012
Hi Michael,
That sounds really good!
When you talk about refactoring the transcriptLocsToRefLocs function, what do you mean exactly? I didn't find the interface hard to understand; it took me ~5 mins to figure it out. Some error messages could be more explicit though, e.g. I got the following when tlocs was a list of numeric vectors instead of a list of integer vectors:
Error in .Call2("tlocs2rlocs", tlocs, exonStarts, exonEnds, strand, decreasing.rank.on.minus.strand, :
'tlocs' has invalid elements
but that was all really.
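
For reference, the mapping discussed in this thread (a transcript-level position converted to a genomic position, given the exons of the transcript) can be sketched in a few lines. This is only an illustration in plain Python, not the GenomicFeatures implementation. The function name transcript_to_genomic, its arguments, and the convention that exons are passed in transcript order with 1-based inclusive coordinates are all assumptions made for the example. It also rejects non-integer positions with a message that names the offending value, which is the kind of more explicit error mentioned above.

```python
def transcript_to_genomic(tpos, exon_starts, exon_ends, strand="+"):
    """Map a 1-based transcript position to a 1-based genomic position.

    Hypothetical helper for illustration only. Exons must be given in
    transcript order (i.e. by rank), with inclusive start/end coordinates.
    """
    # Be explicit about the type problem instead of reporting "invalid elements".
    if isinstance(tpos, bool) or not isinstance(tpos, int):
        raise TypeError(
            f"'tpos' must be an integer, got {type(tpos).__name__}: {tpos!r}"
        )
    if strand not in ("+", "-"):
        raise ValueError(f"'strand' must be '+' or '-', got {strand!r}")
    if tpos < 1:
        raise ValueError(f"'tpos' must be >= 1, got {tpos}")

    remaining = tpos
    for start, end in zip(exon_starts, exon_ends):
        width = end - start + 1
        if remaining <= width:
            offset = remaining - 1
            # On the minus strand the transcript runs from exon end to exon start.
            return start + offset if strand == "+" else end - offset
        remaining -= width

    raise ValueError(f"'tpos' ({tpos}) lies beyond the end of the transcript")


# Two exons, 101-110 and 201-205, on the plus strand.
print(transcript_to_genomic(11, [101, 201], [110, 205]))        # first base of exon 2
# Same exons on the minus strand, listed in transcript order.
print(transcript_to_genomic(6, [201, 101], [205, 110], "-"))    # first base of exon 2
```

Walking the exons and subtracting their widths avoids expanding every range into a full integer vector, which is the step that was slow with as.integer(rng) on a long list.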
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On 8 Apr 2012, at 07:45, Michael Lawrence wrote:
> On Sat, Apr 7, 2012 at 7:31 PM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
>
>> On 04/07/12 16:30, Michael Lawrence wrote:
>>
>>> On Sat, Apr 7, 2012 at 11:12 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>>
>>>> On 04/07/2012 05:39 AM, Nicolas Delhomme wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm just wondering if there would be a direct way to convert an
>>>>> IRanges to an Rle, as in: as(rng,"Rle"). At the moment, I can convert
>>>>> my IRanges into an integer vector and cast that as an Rle
>>>>> (Rle(as.integer(rng))), but that is not very efficient on a long
>>>>> IRangesList (with > 700,000 IRanges in it). It takes ~10 mins with an
>>>>> sapply.
>>>>>
>>>>> Why I want that is for the following: I have an IRangesList of
>>>>> transcripts (describing exons at the genome level) and for every one,
>>>>> I have a bp position at the transcript level that I want to convert
>>>>> into a genomic bp position. Basically, I need to be able to convert a
>>>>> given transcript coordinate into the corresponding genomic
>>>>> coordinate. My IRanges contain the genomic coordinates of every
>>>>> transcript and by converting it into an integer vector, I can select
>>>>> the right genomic bp coordinate by using the transcript bp coordinate
>>>>> as an index (as.integer(rng)[transcript.pos]).
>>>>>
>>>>>
>>>>> I considered the IRanges approach because I keep the transcript name
>>>>> and I'm sure that I am looking up the right coord in the right
>>>>> transcript, but I'm open to other suggestions.
>>>>>
>>>> Hi Nico -- VariantAnnotation::refLocsToLocalLocs,
>>>> GenomicFeatures::transcriptLocs2refLocs
>>>> and IRanges::map might do this for you; no direct experience on my part,
>>>> though. Martin
>>>>
>>>>
>>> Right. Right now, IRanges::map will take things from global to local
>>> (either into transcripts or reads, depending on the argument). This takes
>>> the place of "refLocsToLocalLocs". What "map" needs to support is the
>>> reverse. I think we could do this with a new function. I am not sure
>>> if it should be called reverseMap though, because it's not clear which is
>>> forward and which is reverse. Maybe we need mapToGlobal and mapToLocal? Or
>>> maybe "absolute" and "relative" are better terms?
>>>
>>> Btw, we are working on an "easier to use" interface for the
>>> transcriptLocsToRefLocs function and that should be integrated with any
>>> refactoring/renaming.
>>>
>> I like the idea of the map generic and where it is going. I think the
>> mapToGlobal and mapToLocal terms are clearer. I assume that in mapToGlobal
>> the 'from' would be along the lines of cDNA-based, cds-based, or
>> protein-based coordinates, and that in mapToLocal the 'from' would always
>> be genomic-based coordinates. Yes?
>>
>>
> Yes, that would be the typical use case, although the generic is meant to
> be more general, i.e., it is in IRanges, not GenomicRanges.
>
>
>> Valerie
>>
>>
>>> Let's get a discussion going.
>>>
>>> Michael
>>>
>>>
>>>>> Thanks for any pointers,
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Nico
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>> --
>>>> Computational Biology
>>>> Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>>
>>>> Location: M1-B861
>>>> Telephone: 206 667-2793
>>>>
>>>>
>>>
>>
>>
>