[BioC] FastqStreamer

Martin Morgan mtmorgan at fhcrc.org
Fri May 25 15:11:26 CEST 2012


On 05/24/2012 05:19 PM, Marcus Davy wrote:
> Hi,
>
> I have had a look at FastqStreamer to stream in successive subsets of a
> Fastq file.
>
>
> My question is whether you can change the number of records to stream on
> the fly rather than having to stream 'n' records each time.
>
>
> For example, I might want to pull in records corresponding to each Illumina
> tile from the indices fetched within the Fastq header information,

Hi Marcus -- this isn't possible at the moment, but I'm giving it (and 
the ability to pull out specific ids) some thought, along the lines of 
an IRanges() argument whose start and end delimit the parts of the fastq 
file to retrieve, with 'yield' returning the next range's worth of data.
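
To make the idea concrete, here is a rough sketch in base R of what
range-based yielding could look like. It is not ShortRead code: the
function name rangeStreamer and its start/end arguments are hypothetical,
it assumes plain 4-line FASTQ records and sorted, non-overlapping ranges,
and it returns raw lines rather than a ShortReadQ object.

```r
## Hypothetical helper (not part of ShortRead): each call to yield()
## returns the lines of the next record range start[i]:end[i].
## Assumes every FASTQ record occupies exactly 4 lines.
rangeStreamer <- function(path, start, end) {
    stopifnot(length(start) == length(end),
              all(start >= 1), all(end >= start),
              all(start[-1] > end[-length(end)]))  # sorted, non-overlapping
    con <- file(path, "r")
    current <- 0   # number of records consumed so far
    i <- 0         # index of the last range yielded
    list(
        yield = function() {
            if (i >= length(start)) return(character(0))
            i <<- i + 1
            ## discard the records between the previous range and this one
            skip <- start[i] - current - 1
            if (skip > 0) readLines(con, n = 4 * skip)
            recs <- readLines(con, n = 4 * (end[i] - start[i] + 1))
            current <<- end[i]
            recs
        },
        close = function() close(con)
    )
}

## Small made-up file with six records, to exercise the sketch
fl <- tempfile(fileext = ".fastq")
ids <- paste0("@read", 1:6)
writeLines(as.vector(rbind(ids, "ACGT", "+", "IIII")), fl)

s <- rangeStreamer(fl, start = c(2, 5), end = c(3, 6))
first <- s$yield()    # lines of records 2 and 3
second <- s$yield()   # lines of records 5 and 6
done <- s$yield()     # character(0): no ranges left
s$close()
```

The file is still read sequentially, so skipping to a late range costs
time proportional to its position; the gain is only that unwanted records
are never parsed or kept in memory.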

Martin

>
> or just fetch a certain tile with a record index range m:n which does not
> necessarily start at m=1 within the Fastq file.
>
>
> sp <- SolexaPath(system.file('extdata', package='ShortRead'))
>
> fl <- file.path(analysisPath(sp), "s_1_sequence.txt")
>
> length(readFastq(fl))
>
> [1] 256
>
>
> ## This fails as n is expected to be a constant amount of streamed records
>
> f <- FastqStreamer(fl, c(100, 50, 100, 6))
>
> Error in FastqStreamer(fl, c(100, 50, 100, 6)) :
>
>    'n' must be finite and >= 0
>
>
>
> To fetch a certain tile, can you alter the 'added' field position, similar
> to 'seek' in Perl, so you can grab only that index range of the Fastq file
> without having to go through a while loop?
>
>
> f <- FastqStreamer(fl, 50)
>
> print(f)
>
> class: FastqStreamer
>
> file: s_1_sequence.txt
>
> status: n=50 current=0 added=0 total=0  ## <- I want to change the
> 'current'/'added' fields
>
>
>
> cheers,
>
>
> Marcus
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


