[BioC] countGenomicOverlaps output

Valerie Obenchain vobencha at fhcrc.org
Tue Sep 27 04:49:30 CEST 2011


Mete,

There is a new count function, summarizeOverlaps, in GenomicRanges that 
will replace countGenomicOverlaps in the upcoming release. I believe the 
usage is more straightforward and the options more useful than 
countGenomicOverlaps. Feedback welcome.

Valerie


On 07/23/11 10:27, Valerie Obenchain wrote:
> Hi Mete,
>
> In the context of countGenomicOverlaps, a GRangesLit is used to 
> represent reads with gaps in the CIGAR. The top level of a GRangesList 
> represents a single read and the list elements represent the multiple 
> segments of the read. Taking your qry object as an example, this list 
> would represent a single read from 10 to 110 that is broken into three 
> portions by gaps from 20-60 and 70-100. Only one of the three segments 
> overlaps with the subject, hence the score of 1/3.
>
> If I understand correctly, you have a GRangesList where each top level 
> is a read and the list elements are the multiple ranges where the read 
> aligns to the genome. In this example you have a read of length 10 
> that aligned to 3 different locations.
>
> If you want to identify when all list elements map to the subject you 
> could do something like
>
> query <- GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100), 
> c(20,70,110))),
>                                   read2=GRanges(seq='2', 
> IRanges(c(150,170), c(160,180))))
> sub <- GRangesList(feature1=GRanges(seq='1', IRanges(10,30)),
>                                 feature2=GRanges(seq='2', 
> IRanges(145,195)))
>
> olap <- countOverlaps(unlist(query), sub)
> elements <- elementLengths(query)
> lst <- split(olap, rep(seq_len(length(query)), elements))
> counts <- sapply(lst, sum)
> uniqueMap <- counts == elements
>
>
> Valerie
>
>
>
> On 07/22/11 11:55, Mete Civelek wrote:
>> Hi All,
>>
>> I want countGenomicOverlaps to output a count of uniquely mapping reads
>> within a genomic feature. Will setting the resolution parameter to 
>> 'none'
>> allow countGenomicOverlaps to ignore reads which map to multiple 
>> locations
>> in the genome? If so, countGenomicOverlaps doesn't behave the way I 
>> expect
>> it to. I am using the Bioconductor GenomicRanges package version 1.4.6.
>>
>> Example:
>>
>> library(GenomicRanges)
>> subj = GRangesList(feature1=GRanges(seq='1', IRanges(10,30), 
>> strand='+'))
>> qry = GRangesList(read1=GRanges(seq='1', 
>> IRanges(c(10,60,100),c(20,70,110)),
>> strand='+'))
>> countGenomicOverlaps(qry, subj, resolution='none')
>>
>> I would have expected the hit count to be 0 but instead it reports it as
>> 1/3. Am I using this function correctly?
>>
>> Thanks,
>>
>> Mete
>>
>>
>> IMPORTANT WARNING:  This email (and any attachments) is ...{{dropped:9}}
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list