[BioC] reducing hits from countGenomicOverlaps()
Robert Castelo
robert.castelo at upf.edu
Wed Oct 26 02:35:12 CEST 2011
dear list,
the following lines allow one to count overlaps of aligned
short-reads with annotations:
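library(GenomicRanges)    # readGappedAlignments(), countGenomicOverlaps()
library(GenomicFeatures)  # makeTranscriptDbFromUCSC(), exonsBy()
aln <- readGappedAlignments("somebamfile.bam")
txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="ensGene")
ensGenes <- exonsBy(txdb, by="gene")
ov <- countGenomicOverlaps(aln, ensGenes)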
then i want to get read counts per gene, and the first thing that comes
to mind is:
counts <- sapply(ov, function(x) sum(values(x)[["hits"]]))
which goes through every gene and adds up the "hits" of its exons.
however, this latter step of "just adding" takes longer than the actual
calculation of the hits with countGenomicOverlaps() and i guess that
there are more efficient ways to approach this, probably something
around "reducing the hits value column". i've been looking at rdapply()
and reduce() and googled too, but couldn't find anything, so i look
forward to your suggestions.
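to make the idea of "reducing the hits column" concrete, here is a minimal base R sketch of one possible direction: flatten the per-exon hits once and sum them by gene in a single vectorized call, instead of calling a function once per gene. the toy list hits_by_gene is a hypothetical stand-in for the real object; for an actual GRangesList the flat vector would presumably come from something like values(unlist(ov, use.names=FALSE))[["hits"]], which is an untested assumption here.

```r
# toy stand-in (hypothetical data) for the per-exon "hits" of each gene
hits_by_gene <- list(
  geneA = c(3, 0, 5),
  geneB = c(2),
  geneC = c(1, 1, 4, 0)
)

# pattern from the post: one function call per gene
counts_loop <- sapply(hits_by_gene, sum)

# alternative: flatten once, then sum by group in one vectorized call
flat_hits <- unlist(hits_by_gene, use.names = FALSE)
gene_id   <- rep(names(hits_by_gene), lengths(hits_by_gene))

# rowsum() returns a one-column matrix; [, 1] keeps the gene names
counts_vec <- rowsum(flat_hits, gene_id, reorder = FALSE)[, 1]

# both approaches should agree
stopifnot(isTRUE(all.equal(counts_vec, counts_loop)))
```

note that rowsum() only reports groups that occur in gene_id, so a gene with no exons at all would be missing from the result; that should not happen with the output of exonsBy(), but it is worth checking.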
thanks!!
robert.