[BioC] Coverage by base
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Oct 12 18:35:16 CEST 2011
Hi,
2011/10/12 Hervé Pagès <hpages at fhcrc.org>:
> Hi Rohan,
>
> On 11-10-11 11:49 AM, rohan bareja wrote:
>>
>> Hi,
>> Ya thats correct..I have summed up the
>> reads using viewSums but I want to get the gene information .Hi,
>> I found this archive from bioconductor-sig-sequencing mailing list where
>> they have discussed about cov_by_gene<- Views(cov, genes)
>>
>> viewSums(cov_by_gene)https://stat.ethz.ch/pipermail/bioc-sig-sequencing/attachments/20110723/b9f2c69c/attachment.pl
>> But I am not sure what is the object "genes" that has been used along with
>> the coverage.Thats exactly what I would like to do.
>
> I think 'genes' is a GRanges object object that was obtained with
> something like (pseudo-code):
>
> library(GenomicFeatures)
> txdb <- makeTranscriptDbFromUCSC(...)
> gene_ranges <- range(transcriptsBy(txdb, by="gene"))
> genes <- unlist(gene_ranges)
I'm not so sure?
I can't find any Views(*) methods defined for GRanges objects, but
maybe I'm missing something.
If that's the case, I'd just go the manual route.
Rohan: let's assume you have a SimpleRleList named `cvr` which
represents the coverage vectors for each chromosome in your bam file,
something like:
R> cvr
SimpleRleList of length 25
$chr1
'integer' Rle of length 249250621 with 249520 runs
Lengths: 14362 4 23 1 5 ... 25 974 26 12758
Values : 0 4 5 6 5 ... 1 0 1 0
$chr2
'integer' Rle of length 243199373 with 179635 runs
Lengths: 10103 31 3803 43 21102 ... 27 4543 23 19174
Values : 0 1 0 1 0 ... 1 0 1 0
And say that you've done what Herve has suggested where you have a
GRanges object names `genes` that has all of your gene info:
R> genes
seqnames ranges strand |
<Rle> <IRanges> <Rle> |
1.1 chr19 [ 58858172, 58864865] - |
10.2 chr8 [ 18248755, 18258723] + |
100.3 chr20 [ 43248163, 43280376] - |
1000.4 chr18 [ 25530930, 25757445] - |
10000.5 chr1 [243651535, 244006886] - |
100008586.6 chrX [ 49217771, 49342266] + |
...
I'd then loop over the `names(cvr)` and do your counting as you like, maybe:
R> info <- lapply(names(cvr), function(chr) {
g <- genes[seqnames(genes) == chr]
v <- Views(cvr[[chr]], ranges(g))
viewSums(v)
})
I'd imagine you will want to do more book-keeping than what I'm
showing, but here's a start.
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
More information about the Bioconductor
mailing list