[Bioc-sig-seq] Mean coverage of capture regions

Fri Oct 9 22:55:53 CEST 2009

Hello All, 

I am attempting to calculate the mean number of reads for every exon that was represented on our custom Nimblegen capture chip. The capture regions don't line up exactly with the exon coordinates so I identified all exons that have an overlap, about 9500. I have previously done some crude sampling of the pileup output and my results were a lot better than what I seem to be calculating now. Currently I am running coverage() on my alignedRead object and then looping through the capture regions something like this:

curCov <- coverage(lane1)
for(q in c(1:22,"X","Y")){
	capturedExonsOnChrom <- capturedExons[which(capturedExons$chrom==q),]
	for(i in 1:length(capturedExonsOnChrom)) {
		exonCoordinates <- capturedExonsOnChrom[i]$exonStart:capturedExonsOnChrom[i]$exonEnd
		capturedExonMeanCoverage[] <- mean(as.vector(curCov[[q]])[exonCoordinates])
	}
}

First off, I am assuming that there is a more efficient method to do this. Is that the case? Also, I have also been using GenomeGraphs to create plots of the coverage and those plots also seem to show plenty of reads where my calculations above indicate none. I am hoping there is an error in my current method because the previous coverage numbers were quite encouraging. Thanks in advance for your help.

-------------------------------------
Karl Dykema
Computational Biologist
Van Andel Research Institute
This email message, including any attachments, is for th...{{dropped:6}}