[BioC] TEQC see all the reads with a given coverage
nathalie
nac at sanger.ac.uk
Wed May 16 16:08:47 CEST 2012
HI all,
I am using TEQC to have a look at coverage on my custom pull down
sequencing
#bam file
reads<-get.reads("./13.bam",filetype="bam")
#target file from which the baits has been designed:
targets<-get.targets("target.txt", chrcol=1,startcol=2,endcol=3,
zerobased=F, sep="\t",skip=0)
readpairs<-reads2pairs(reads)
#drop the pairs which are not matched:
reads<-reads[!(reads$ID %in% readpairs$singleReads$ID), , drop=TRUE]
#calculate the coverage
Coverage <- coverage.target(reads, targets, perTarget=T, perBase=T)
#from this coverage calculation I use the histogram function to plot the
coverage
coverage.hist(Coverage$coverageTarget, covthreshold=1000)
I have quite a high coverage as I am pulling down a small region and use
high seq lane to sequence . I have attached the coverage histograms, how
do you pull out reads with a specific coverage from that coverage object?
Another question, what does the fraction of target bases correspond to
(left Y axis on histogram plots)
Example: I want to see the position of all the reads with a coverage
below 500?
thanks
Nat
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] TEQC_2.4.0 hwriter_1.3 Rsamtools_1.8.4
[4] Biostrings_2.24.1 GenomicRanges_1.8.3 IRanges_1.14.2
[7] BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 stats4_2.15.0 zlibbioc_1.2.0
--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coverageHistograms_TEQC.pdf
Type: application/pdf
Size: 665466 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20120516/be4f76e2/attachment.pdf>
More information about the Bioconductor
mailing list