[Bioc-devel] applyPileup for introns with junction read
Martin Morgan
mtmorgan at fhcrc.org
Mon Sep 12 18:53:50 CEST 2011
Hi Tengfei --
On 09/11/2011 01:29 PM, Tengfei Yin wrote:
> Dear all,
>
> I apologize If this question goes to the wrong mailing list.
The Bioconductor mailing is probably more appropriate; your question is
about using a package, rather than developing a package. In terms of
your bug report...
> I am trying to finishing a function in my package that compute the mismatch
> summary, when it's exon regions, it looks like it matches well, but I don't
> know how to get the right count in the intron region, if there are some
> junction read over it, I guess I should get something for N and 0 for ACTG.
> But I get some weird result for that region
>
> Browse[1]> seq
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
> A 4 4 4 4 4 4 4 4 4 4 4
> C 0 0 0 0 0 0 0 0 0 0 0
> G 0 0 0 0 0 0 0 0 0 0 0
> T 2 2 2 2 2 2 2 2 2 2 2
> N 0 0 0 0 0 0 0 0 0 0 0
>
> Part of the code I use looks like
> pileupAsGRanges<- function(bams, regions,
> DNABases = c("A", "C", "G", "T", "N"),
> ...) {
> pileupFiles<- PileupFiles(bams)
> pileupParams<- PileupParam(which = regions, ...)
> ..........
> pileupFun<- function(x) {
> seq<- x$seq[DNABases,1,]
> ...
> }
> ........
> grlst<- applyPileups(pileupFiles, pileupFun, param = pileupParams)
> ...
> }
>
> I cannot give a reproducible example since it's based on particular Bam file
By way of reproducible example, I made a trivial SAM file and converted
it to bam with Rsamtools::asBam
Rsamtools::applyPileups was handling indels and reference skips
incorrectly; this likely influenced your other regions, too, and users
should re-do their analysis.
The bug is fixed in Rsamtools 1.5.59
Thanks for the report!
Martin
> on my laptop. But I test with IGV, it doesn't show any coverage over that
> region, then I see if I use scanBam to get seqs for that region I have 6
> reads, which has CIGAR like "35M10939N40M", but that shouldn't give all
> A or T like the result above, by the way, there is "simpleCigar" in
> ScanBamParam which can filter these reads, but there is no such argument in
> PileupParam.
>
> I don't know if this is caused by my mistake or inappropriate use of
> arguments, or what's the potential reason for that result.
>
> Thanks in advance.
>
> Tengfei
>
> sessionInfo()
> R Under development (unstable) (2011-08-08 r56671)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Rsamtools_1.5.57 Biostrings_2.21.9 GenomicRanges_1.5.36
> [4] IRanges_1.11.26 rtracklayer_1.13.13 RCurl_1.6-10
> [7] bitops_1.0-4.1
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.21.3 tools_2.14.0 XML_3.4-2 zlibbioc_0.1.7
>
>
>
>
>
>
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
More information about the Bioc-devel
mailing list