[Bioc-devel] FastqStreamer error in function context

Thomas Girke thomas.girke at ucr.edu
Fri May 11 05:05:03 CEST 2012


Martin,

There is indeed no problem with those functions; I just had a typo in
my code. I guess I shouldn't send out bug reports when it is well past
my bedtime. Sorry for the false alarm.

I love the streaming functionality. It really brings NGS analysis back
to low-memory systems, such as laptops or outdated cluster nodes, without
the inconvenience of constantly splitting large files.

Best,

Thomas

On Thu, May 10, 2012 at 05:32:58AM +0000, Martin Morgan wrote:
> On 05/09/2012 09:53 PM, Thomas Girke wrote:
> > When FastqStreamer or FastqSampler are called within another function in
> > combination with a writeFastq step then this usually returns an error.
> > However, the same code runs just fine outside of a function.  Below is
> > an example to reproduce this error.
> 
> Hi Thomas --
> 
> The example below fails because the file has 256 records and the
> streamer yields 5 at a time, so for me the 52nd yield() returns
> length(fq) == 1 and the subset index '2' is out of bounds. But maybe
> there is another example?
> 
> > A small feature request for FastqStreamer would be an option to return
> > the total number of reads stored in a fastq file as well as an option
> > for accessing specific records by passing on an index vector.
> 
> For the first part, after the fact we have
> 
>  > f
> class: FastqStreamer
> file: s_1_sequence.txt
> status: n=5 current=1 added=256 total=256
> 
> with 'total=256' indicating that the streamer iterated over (i.e., the 
> file had) 256 records. This is actually accessible in the reference 
> class using the not-really-public (see the last lines of 
> example(FastqStreamer)) accessor
> 
>  > f$status()
>        n current   added   total
>        5       1     256     256
> 
> which is a named integer vector. Is this what you were looking for?
> 
> I'll give the idea about selecting specific records some thought; I see 
> how it could be useful.
> 
> Martin
> 
> >
> > Best,
> >
> > Thomas
> >
> >
> > Here is an example:
> >
> > library(ShortRead)
> > sp <- SolexaPath(system.file('extdata', package='ShortRead'))
> > fl <- file.path(analysisPath(sp), "s_1_sequence.txt")
> >
> > ## Some function using FastqStreamer
> > test <- function(x=fl) {
> >     f <- FastqStreamer(x, 5)
> >     while (length(fq <- yield(f))) {
> >         fqsub <- fq[1:2]
> >         writeFastq(fqsub, "test.fastq", mode="a")
> >     }
> >     close(f)
> > }
> > test(x=fl)
> >
> > Error in .IRanges.checkAndTranslateSingleBracketSubscript(x, i) :
> >    subscript contains NAs or out of bounds indices
> >
> >
> > sessionInfo()
> > R version 2.15.0 alpha (2012-03-05 r58604)
> > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] ShortRead_1.14.3    latticeExtra_0.6-19 RColorBrewer_1.0-5
> > [4] Rsamtools_1.8.4     lattice_0.20-6      Biostrings_2.24.1
> > [7] GenomicRanges_1.8.4 IRanges_1.14.2      BiocGenerics_0.2.0
> >
> > loaded via a namespace (and not attached):
> > [1] Biobase_2.16.0 bitops_1.0-4.1 grid_2.15.0    hwriter_1.3    stats4_2.15.0
> > [6] tools_2.15.0   zlibbioc_1.2.0
> >
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
> 
> -- 
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> 
> Location: M1-B861
> Telephone: 206 667-2793
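
A note on the example quoted above: as Martin points out, the last
chunk holds a single record, so fq[1:2] runs out of bounds. The sketch
below shows one possible way to guard the subset. It is not the fix
Thomas applied (his message does not say what the typo was). The
function name write_chunk_heads, its arguments out, n and k, and the
temporary output file are illustrative assumptions. The f$status() call
is the not-really-public accessor from Martin's reply, so it may differ
between ShortRead versions.

```r
library(ShortRead)

sp <- SolexaPath(system.file("extdata", package = "ShortRead"))
fl <- file.path(analysisPath(sp), "s_1_sequence.txt")

## Hypothetical helper: append at most k reads from each chunk of n records
write_chunk_heads <- function(x, out, n = 5, k = 2) {
    f <- FastqStreamer(x, n)
    on.exit(close(f))
    while (length(fq <- yield(f))) {
        ## The final chunk may hold fewer than k records
        ## (256 records in chunks of 5 leave 1), so cap the index
        fqsub <- fq[seq_len(min(k, length(fq)))]
        writeFastq(fqsub, out, mode = "a")
    }
    invisible(out)
}

out <- tempfile(fileext = ".fastq")
write_chunk_heads(fl, out)

## Record count after a full pass, via the accessor Martin mentions
f <- FastqStreamer(fl, 5)
while (length(yield(f))) {}
total <- f$status()[["total"]]
close(f)
total
```

seq_len(min(k, length(fq))) never indexes past the end of a chunk, so
the loop no longer depends on the record count being a multiple of the
chunk size.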


