[Bioc-sig-seq] ShortRead internal: too many 'snap' entries

Martin Morgan mtmorgan at fhcrc.org
Sun Apr 4 22:31:29 CEST 2010


On 04/04/2010 11:55 AM, Yanwei Tan wrote:
> Hi Ramzi Temanni,
> 
> I ran into the same problem as you when running ShortRead. As Martin
> mentioned, a newline is missing after the last record in the file. How
> did you fix this problem? I do not know how to add a newline after the
> last line. My data is a fastq file; I just filtered out the reads that
> contain N by using the nFilter function in the ShortRead package.

In off-list email you said

> I used ShortRead package to filter the data and then saved as fastq
> file. But when I run the qa function again there is error in
> .local(dirPath, pattern, ...):
>   ShortRead internal: too many 'snap' entries.

It is hard to follow what you are trying to accomplish. Please paste
short code to illustrate. Use data files from ShortRead, so that your
code is reproducible by others. Include the output of sessionInfo() so
that it is clear which version of software you are using. Perhaps after

  example(readFastq)

you do

> rfq
class: ShortReadQ
length: 256 reads; width: 36 cycles
> file = tempfile() # a file to save output
> noNrfq = rfq[nFilter()(rfq)]
> writeFastq(noNrfq, file)
> qaresult = qa(dirname(file), basename(file), type="fastq")

? But what is the problem? Note also that it is not necessary to write
the fastq file to disk; qa() can be called on a named list of objects:

> qa(list(noNrfq=noNrfq))
class: ShortReadQQA(9)
QA elements (access with qa[["elt"]]):
  readCounts: data.frame(1 3)
  baseCalls: data.frame(1 5)
  readQualityScore: data.frame(512 4)
  baseQuality: data.frame(94 3)
  alignQuality: data.frame(1 3)
  frequentSequences: data.frame(50 4)
  sequenceDistribution: data.frame(3 4)
  perCycle: list(2)
    baseCall: data.frame(141 4)
    quality: data.frame(341 5)
  perTile: list(2)
    readCounts: data.frame(0 4)
    medianReadQualityScore: data.frame(0 4)

This is my sessionInfo()

> sessionInfo()
R version 2.10.1 Patched (2010-03-27 r51570)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ShortRead_1.4.0    lattice_0.18-3     BSgenome_1.14.2    Biostrings_2.14.12
[5] IRanges_1.4.16

loaded via a namespace (and not attached):
[1] Biobase_2.6.1 grid_2.10.1   hwriter_1.2   tools_2.10.1
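
On the question of how to add a newline after the last line: if the
missing final newline mentioned earlier in this thread does turn out to
be the cause, a minimal base-R sketch along the following lines might
help. ensureTrailingNewline is a hypothetical helper name, not a
ShortRead function, and the sketch assumes a plain (uncompressed) text
file on a platform where seek() on file connections is reliable.

```r
## Hypothetical helper (not part of ShortRead): append a final newline
## to a text file when its last byte is not "\n".
ensureTrailingNewline <- function(path) {
    size <- file.info(path)$size
    if (is.na(size) || size == 0)
        return(invisible(FALSE))        # missing or empty file: nothing to do
    con <- file(path, open = "rb")      # binary mode, so bytes are read as-is
    seek(con, where = size - 1, origin = "start")
    lastByte <- readBin(con, what = "raw", n = 1L)
    close(con)
    if (identical(lastByte, as.raw(10L)))
        return(invisible(FALSE))        # already ends in "\n" (byte 0x0a)
    cat("\n", file = path, append = TRUE)
    invisible(TRUE)                     # TRUE means a newline was added
}

## Toy usage: one fastq-like record, deliberately without a final newline
f <- tempfile()
cat("@read1\nACGT\n+\nIIII", file = f)
ensureTrailingNewline(f)
```

Only the last byte is inspected, so the file is not read into memory;
calling the helper a second time on the same file should leave it
unchanged.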

> 
> Many thanks in advance!
> Wei


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


