[Bioc-sig-seq] ShortRead internal: too many 'snap' entries

Yanwei Tan Tan at nbio.uni-heidelberg.de
Sun Apr 4 23:07:29 CEST 2010


Dear Martin,

I use nFilter to filter out the reads that contain any "N".
Here is my code:

 > # read the fastq file
 > fq <- readFastq("/Users/wei/Desktop/Originaldata", pattern="Bic.txt")
 > # keep only the reads that contain no N
 > filt <- nFilter()
 > fq <- fq[filt(fq)]
 > # write the filtered reads out
 > writeFastq(fq, file="/Users/wei/Desktop/Originaldata/bicfiltered.txt")


When I then try to read the filtered fastq file back in, I get an error:

 > readFastq("/Users/wei/Desktop/Originaldata", "bicfiltered.txt")
Error in  .local(dirPath, pattern,...) :
     ShortRead internal: too many 'snap' entries
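
Since a missing newline after the last record was suggested as a possible cause, the layout of the written file could be checked along these lines. This is only a rough sketch in base R: checkFastqLayout is a hypothetical helper, not a ShortRead function, and it assumes the plain four-lines-per-record fastq layout (no wrapped sequence or quality lines) and a file small enough to read into memory.

```r
# Hypothetical helper (not part of ShortRead): basic structural checks on a
# fastq file, assuming exactly four lines per record.
checkFastqLayout <- function(path) {
    size <- file.info(path)$size
    # Raw bytes, so the very last byte of the file can be inspected
    bytes <- readBin(path, what = "raw", n = size)
    # Text lines; warn = FALSE silences the incomplete-final-line warning
    lines <- readLines(path, warn = FALSE)
    n <- length(lines)
    idx <- seq_len(n)
    list(
        nLines = n,
        # TRUE when the line count is a non-zero multiple of four
        completeRecords = n > 0 && n %% 4 == 0,
        # TRUE when the final byte is a line feed
        endsWithNewline = size > 0 && bytes[size] == as.raw(10),
        # Lines 1, 5, 9, ... should start with "@"
        headersOk = all(substr(lines[idx %% 4 == 1], 1, 1) == "@"),
        # Lines 3, 7, 11, ... should start with "+"
        separatorsOk = all(substr(lines[idx %% 4 == 3], 1, 1) == "+")
    )
}

# Usage on the file written above:
# str(checkFastqLayout("/Users/wei/Desktop/Originaldata/bicfiltered.txt"))
```

If any of these come back FALSE, the file itself is probably malformed, rather than readFastq misbehaving on a valid file.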

My sessionInfo():
R version 2.10.1 (2009-12-14)
x86_64-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] ShortRead_1.4.0    lattice_0.17-26    BSgenome_1.14.2    Biostrings_2.14.12
[5] IRanges_1.4.11
loaded via a namespace (and not attached):
[1] Biobase_2.6.1 grid_2.10.1   hwriter_1.1   tools_2.10.1

Many thanks!
Wei


On 4/4/10 10:31 PM, Martin Morgan wrote:
> On 04/04/2010 11:55 AM, Yanwei Tan wrote:
>    
>> Hi Ramzi Temanni,
>>
>> I ran into the same problem as you when running ShortRead. As Martin
>> mentioned, a newline is missing after the last record in the file. How
>> did you fix this problem? I do not know how to add a newline after the
>> last line. My data is a fastq file; I just filtered out the reads that
>> contain N by using the nFilter function in the ShortRead package.
>>      
> In off-list email you said
>
>    
>> I used the ShortRead package to filter the data and then saved it as a
>> fastq file. But when I run the qa function again there is an error in
>> .local(dirPath, pattern, ...):
>>    ShortRead internal: too many 'snap' entries.
>>      
> It is hard to follow what you are trying to accomplish. Please paste
> short code to illustrate. Use data files from ShortRead, so that your
> code is reproducible by others. Include the output of sessionInfo() so
> that it is clear which version of software you are using. Perhaps after
>
>    example(readFastq)
>
> you do
>
>    
>    > rfq
>    class: ShortReadQ
>    length: 256 reads; width: 36 cycles
>    > file = tempfile() # a file to save output
>    > noNrfq = rfq[nFilter()(rfq)]
>    > writeFastq(noNrfq, file)
>    > qaresult = qa(dirname(file), basename(file), type="fastq")
>>      
> ? But what is the problem? Note also that it is not necessary to write
> the fastq file to disk,
>
>    
>    > qa(list(noNrfq=noNrfq))
>>      
> class: ShortReadQQA(9)
> QA elements (access with qa[["elt"]]):
>    readCounts: data.frame(1 3)
>    baseCalls: data.frame(1 5)
>    readQualityScore: data.frame(512 4)
>    baseQuality: data.frame(94 3)
>    alignQuality: data.frame(1 3)
>    frequentSequences: data.frame(50 4)
>    sequenceDistribution: data.frame(3 4)
>    perCycle: list(2)
>      baseCall: data.frame(141 4)
>      quality: data.frame(341 5)
>    perTile: list(2)
>      readCounts: data.frame(0 4)
>      medianReadQualityScore: data.frame(0 4)
>
> This is my sessionInfo()
>
>    
>    > sessionInfo()
>>      
> R version 2.10.1 Patched (2010-03-27 r51570)
> x86_64-unknown-linux-gnu
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] ShortRead_1.4.0    lattice_0.18-3     BSgenome_1.14.2    Biostrings_2.14.12
> [5] IRanges_1.4.16
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.1 grid_2.10.1   hwriter_1.2   tools_2.10.1
>
>    
>> Many thanks in advance!
>> Wei
>>


-- 
Yanwei Tan
Institute of Neurobiology
1.OG, AG Bading
Im Neuenheimer Feld 364
University of Heidelberg
69120 Heidelberg
Germany

Tel:+49-6221-548319
Fax:+49-6221-546700
