[Bioc-sig-seq] The "ranges" slot in the sread slot of an AlignedRead class

Nicolas Delhomme delhomme at embl.de
Fri May 15 16:06:55 CEST 2009


Sorry, defining the indices is actually easier than that. I got
confused by moving them back and forth between the different
objects. This (really) does the job:

mrna.ranges <- RangedData(
    IRanges(start = position(mrna.aln), width = width(mrna.aln)),
    space = chromosome(mrna.aln),
    universe = "dm3",
    indices = seq(along = mrna.aln)
)
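
A small base-R sketch of why the plain sequence might be enough, assuming
that RangedData groups the values by space the same way split() does (an
assumption, not checked against the IRanges code). The chr vector is made
up and stands in for chromosome(mrna.aln):

```r
# Base-R sketch (no Bioconductor needed). 'chr' is a made-up stand-in
# for chromosome(mrna.aln); it is not taken from real data.
chr <- factor(c("2R", "X", "2R", "3L", "X"))

# One index per read, in the original order of the object.
indices <- seq_along(chr)

# Splitting by chromosome carries each index along with its read, so each
# group lists the original positions of the reads on that chromosome.
by.chr <- split(indices, chr)

# The per-chromosome which() lookup from the earlier version yields the
# same groups.
by.which <- lapply(levels(chr), function(lv) which(chr == lv))
names(by.which) <- levels(chr)

stopifnot(identical(by.chr, by.which))
print(by.chr)
```

If that assumption holds, pre-sorting the indices by chromosome is
unnecessary, since the grouping itself carries each index to its chromosome.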

Have a nice weekend,

---------------------------------------------------------------
Nicolas Delhomme

High Throughput Functional Genomics Center

European Molecular Biology Laboratory

Tel: +49 6221 387 8426
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------



On 15 May 2009, at 15:04, Martin Morgan wrote:

> Thanks Nicolas, that looks tidier! I'll talk to Michael a little to
> make sure I understand everything, and then implement this, probably as
> a constructor (instead of coerce). Will let you know; it might not be
> till next week. Martin
>
> Nicolas Delhomme <delhomme at embl.de> writes:
>
>> Hi Martin, Hi all,
>>
>> The following code does what I want quite nicely:
>>
>> mrna.ranges <- RangedData(
>>     IRanges(start = position(mrna.aln), width = width(mrna.aln)),
>>     space = chromosome(mrna.aln),
>>     universe = "dm3",
>>     indices = unlist(sapply(levels(chromosome(mrna.aln)),
>>                             function(chr) which(chromosome(mrna.aln) == chr)),
>>                      use.names = FALSE)
>> )
>>
>> From my AlignedRead (mrna.aln), I create a RangedData which contains
>> the ranges split by chromosome and sorted into a RangesList. The
>> additional parameter, indices (the name is arbitrary), contains the
>> position of the corresponding read in the original mrna.aln object and
>> is stored in a SplitXDataFrame.
>>
>> Best,
>>
>>
>>
>>
>> On 13 May 2009, at 16:53, Martin Morgan wrote:
>>
>>> Nicolas Delhomme <delhomme at embl.de> writes:
>>>
>>>> Hi Martin,
>>>>
>>>> That's what I thought; I was just curious to learn more. Thanks for
>>>> the details!
>>>>
>>>> I should have thought of it: since I put it after the session info,
>>>> my second question was most probably invisible :-)
>>>>
>>>> I paste it here again:
>>>>
>>>>>
>>>>> And is there an easy way to create a RangesList from an AlignedRead
>>>>> object? I figured out how to do it, but I just want to be sure that I
>>>>> didn't miss it. If it doesn't exist, I think it would be a valuable
>>>>> addition and I could contribute the few lines of code.
>>>
>>> Sorry for missing that; I don't think there is anything built in. We
>>> could exchange your code off-list and talk about introducing
>>> something, if you like.
>>>
>>> Martin
>>>
>>>> Best wishes,
>>>>
>>>>
>>>>
>>>>
>>>> On 13 May 2009, at 04:30, Martin Morgan wrote:
>>>>
>>>>> Hi Nicolas --
>>>>>
>>>>> Nicolas Delhomme <delhomme at embl.de> writes:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Well, the question is quite easy :-) What does this slot hold? It
>>>>>> looks very different from the actual positions, e.g.
>>>>>>
>>>>>> these are the first 10 ranges
>>>>>>
>>>>>>> sread(aln.clean[chromosome(aln.clean)=="2R"])@ranges[1:10]
>>>>>
>>>>> It's internal to the way reads themselves are stored. sread(aln.clean)
>>>>> returns a DNAStringSet object; the ranges slot of a DNAStringSet
>>>>> points to offsets into a larger DNAString. As you show later, you
>>>>> want to use position(aln.clean) for alignment information.
>>>>>
>>>>> This representation is meant to be entirely internal to the class. The
>>>>> intention is that the user manipulate objects with defined functions
>>>>> and methods (like position()). Of course the user can get at the
>>>>> contents of slots with @, but there are no guarantees about what will
>>>>> be there if the user does this!
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>>> IRanges object:
>>>>>>    start  end width
>>>>>> [1]   4141 4176    36
>>>>>> [2]   4177 4212    36
>>>>>> [3]   4357 4392    36
>>>>>> [4]   4465 4500    36
>>>>>> [5]   5113 5148    36
>>>>>> [6]   5365 5400    36
>>>>>> [7]   5401 5436    36
>>>>>> [8]   6049 6084    36
>>>>>> [9]   6301 6336    36
>>>>>> [10]  6373 6408    36
>>>>>>
>>>>>> and these are the 10 first positions
>>>>>>
>>>>>>> position(aln.clean[chromosome(aln.clean)=="2R"])[1:10]
>>>>>> [1]  6419544 18694365 10064416 17228214  5850736 11976428 15335440  3370962
>>>>>> [9] 15327509  3366816
>>>>>>
>>>>>>> sessionInfo()
>>>>>> R version 2.9.0 (2009-04-17)
>>>>>> x86_64-unknown-linux-gnu
>>>>>>
>>>>>> locale:
>>>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;
>>>>>> LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;
>>>>>> LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;
>>>>>> LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] ShortRead_1.2.0   lattice_0.17-22   BSgenome_1.12.0   Biostrings_2.12.1
>>>>>> [5] IRanges_1.2.1     rtracklayer_1.4.0 RCurl_0.94-1
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] Biobase_2.4.1 grid_2.9.0    hwriter_1.1   tools_2.9.0   XML_2.3-0
>>>>>>
>>>>>> And is there an easy way to create a RangesList from an AlignedRead
>>>>>> object? I figured out how to do it, but I just want to be sure that I
>>>>>> didn't miss it. If it doesn't exist, I think it would be a valuable
>>>>>> addition and I could contribute the few lines of code.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-sig-sequencing mailing list
>>>>>> Bioc-sig-sequencing at r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>>>
>>>>> -- 
>>>>> Martin Morgan
>>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>>> 1100 Fairview Ave. N.
>>>>> PO Box 19024 Seattle, WA 98109
>>>>>
>>>>> Location: Arnold Building M1 B861
>>>>> Phone: (206) 667-2793
>>>>
>>>
>>
>


