[BioC] readPileup() Error
Martin Morgan
mtmorgan at fhcrc.org
Fri Nov 11 16:27:55 CET 2011
On 11/10/2011 02:41 AM, Vincent Carey wrote:
> I think I can confirm this using the ex1.pileup in the examples folder of
> the samtools svn distro, revision 998,
> which says it is 0.1.18 -- BUT READ ON
>
>> Z = readPileup("ex1.pileup", variant="SNP")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
> :
> scan() expected 'an integer', got '^~.'
>
> Enter a frame number, or 0 to exit
>
> 1: readPileup("ex1.pileup", variant = "SNP")
> 2: readPileup("ex1.pileup", variant = "SNP")
> 3: callGeneric(conn, ...)
> 4: eval(call, sys.frame(sys.parent()))
> 5: eval(expr, envir, enclos)
> 6: readPileup(conn, variant = "SNP")
> 7: readPileup(conn, variant = "SNP")
> 8: .local(file, ...)
> 9: .readPileup_SNP(file = file, ..., variant = variant)
> 10: .readPileup_table(file, colClasses, ...)
> 11: read.table(conn, colClasses = colClasses, col.names =
> names(colClasses), se
> 12: scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
> nmax =
>
>
> I believe the readPileup function was a very early contribution as
> Rsamtools emerged. Note that its man page references pileup -cv as the
> generator for input, but that samtools command is removed in favor of
> mpileup. So at best readPileup is retained for legacy applications.
>
> Now, for pileup analysis with Rsamtools, you should use the PileupFile()
> interface direct to BAM and indices, and applyPileups() to extract the base
> frequency or other functionals of pileups. Do not create the pileups
> separately with Rsamtools, applyPileups() will compute them internally.
>
> On Thu, Nov 10, 2011 at 3:10 AM, Ron Ophir<ron at volcani.agri.gov.il> wrote:
>
>> Dear BMMs,
>>
>> The session information is listed below.
>> When I read a pileup file (readPileup) I got as an output of samtools
>> 0.1.18, I get the following error
>>
>> Snp = readPileup(pileupFile,variant="SNP")
>>
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
>> na.strings, :
>> scan() expected 'an integer', got '^<,'
>>
>> ------------------------------------------------------------------------
>> ------
>>
>> The pileup format in the example of readPileup() function is
>>
>> 6 33701 A G 21 21 12 35
>> ,,,,$,,,,.,,,,.,..,..g.,,gGggggGgggg
>> ??@AA3>=3((>A?<-9<-8?9A>87;%6148+=2
>>
>> The pileup format at samtools web pages is
>>
>> seq1 272 T 24 ,.$.....,,.,.,...,,,.,..^+.<<<+;<<<<<<<<<<<=<;<;7<&
>>
>> and in my pileup is
>>
>> Cs1,6RhaT 332 A 9 ,,,..^;,^E,^Z.^Z.
>> DCCFFCDCC
>>
>> It seems to me that the error is obvious, however I don't know whether
>> it is a bug or a feature. If it is a feature how to run readPileup
>> correctly to be able reading latest pileup format?
Hi Ron, Vince --
Not exactly sure where this stands. From Ron's post and as Vince
indicates I think the problem is that Ron's pileup file does not conform
to what readPileups requires (produced with samtools pileup -cv ...),
and furthermore that this file format is not produced by current samtools.
Vince mentions applyPileup; another route from mpileup is the -g option
to generate BCF files, and then indexBcf / scanBcf. For instance in the
samtools/examples folder
> library(Rsamtools)
> indexBcf("ex1.bcf")
> res = scanBcf("ex1.bcf")
The output format is 'scan'-like so not 100% user friendly; see
example(scanBcf).
Vince mentions ex1.pileup from samtools as a reproducible example. This
is generated by running make in samtools/examples, and with that svn
revision produces a zero-length file (since pileup is no longer
supported). Instructions to create the file are still in
samtools/examples/Makefile, but use the options -cf so would not be
expected to produce a readPileup-compatible file.
There is example(readPileup) which points to 'pileup.txt' and which
parses without error.
library(Rsamtools)
example(readPileup)
readPileup(fl, variant="SNP")
Martin
>>
>> Thanks,
>> Ron
>>
>> Abbreviation
>>
>> BMM bioconductor mailing-list member
>>
>>
>> sessionInfo()
>> R version 2.14.0 (2011-10-31)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
>> [5] LC_MONETARY=en_US.utf8 LC_MESSAGES=en_US.utf8
>> [7] LC_PAPER=C LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>>
>> other attached packages:
>> [1] Rsamtools_1.6.1 Biostrings_2.22.0 GenomicRanges_1.6.2
>> [4] IRanges_1.12.1
>>
>> loaded via a namespace (and not attached):
>> [1] bitops_1.0-4.1 BSgenome_1.22.0 RCurl_1.7-0
>> rtracklayer_1.14.1
>> [5] tools_2.14.0 XML_3.4-3 zlibbioc_1.0.0
>>
>>
>> This mail was sent via Mail-SeCure System.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
More information about the Bioconductor
mailing list