[BioC] FastqStreamer
Martin Morgan
mtmorgan at fhcrc.org
Tue May 29 02:06:04 CEST 2012
On 05/25/2012 02:41 PM, Marcus Davy wrote:
> Hi Martin,
> thanks for looking into this. I think it would enhance FastqStreamer's
> flexibility to be able to fetch any specified range of a Fastq file.
>
> The IRanges approach is similar to my thoughts, with width by default
> (either constant 'n' or variable length using vector recycling), or
> start and end indexes selected.
I updated ShortRead 1.15.7 in devel to allow FastqStreamer to accept an
IRanges object and yield() corresponding records in the fastq file; see
?FastqStreamer.
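
A minimal sketch of the range-based usage described above, assuming
ShortRead 1.15.7 or later and the example file shipped in the package's
extdata directory. The particular ranges are illustrative only (they reuse
the widths 100, 50, 100, 6 from the original question) and are assumed to
be given in increasing, non-overlapping order; ?FastqStreamer remains the
authoritative reference.

```r
library(ShortRead)
library(IRanges)

## Example fastq file from the ShortRead package (256 records per the thread)
sp <- SolexaPath(system.file("extdata", package="ShortRead"))
fl <- file.path(analysisPath(sp), "s_1_sequence.txt")

## Illustrative record ranges: 1-100, 101-150, 151-250, 251-256
rng <- IRanges(start=c(1, 101, 151, 251), width=c(100, 50, 100, 6))

## Each call to yield() returns the records in the next range
f <- FastqStreamer(fl, rng)
sizes <- integer()
while (length(fq <- yield(f))) {
    sizes <- c(sizes, length(fq))   # number of records in this chunk
}
close(f)
print(sizes)
```

To pull out a single block of records m:n that does not start at record 1,
an IRanges with one range, e.g. IRanges(start=m, end=n), should work the
same way.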
Martin
>
> cheers,
>
> Marcus
>
>
> On Sat, May 26, 2012 at 1:11 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
> On 05/24/2012 05:19 PM, Marcus Davy wrote:
>
> Hi,
>
> I have had a look at FastqStreamer to stream in successive subsets of a
> Fastq file.
>
> My question is whether you can change the number of records to stream on
> the fly rather than having to stream 'n' records each time.
>
> For example, I might want to pull in records corresponding to each
> Illumina tile from the indices fetched within the Fastq header information,
>
>
> Hi Marcus -- this isn't possible at the moment, but I'm giving this
> (and the ability to pull out specific ids) some thought, along the
> lines of an IRanges() argument with start and end being the parts of
> the fastq file to retrieve, and with 'yield' returning the next
> range's worth of data.
>
> Martin
>
>
> or just fetch a certain tile with a record index range m:n which does not
> necessarily start at m=1 within the Fastq file.
>
>
> library(ShortRead)
> sp <- SolexaPath(system.file('extdata', package='ShortRead'))
> fl <- file.path(analysisPath(sp), "s_1_sequence.txt")
> length(readFastq(fl))
> [1] 256
>
> ## This fails as n is expected to be a constant number of streamed records
> f <- FastqStreamer(fl, c(100, 50, 100, 6))
> Error in FastqStreamer(fl, c(100, 50, 100, 6)) :
>   'n' must be finite and >= 0
>
>
>
> To fetch a certain tile, can you alter the 'added' field position, similar
> to 'seek' in Perl, so you can grab only that index range of the Fastq file
> without having to go through a while loop?
>
>
> f <- FastqStreamer(fl, 50)
> print(f)
> class: FastqStreamer
> file: s_1_sequence.txt
> status: n=50 current=0 added=0 total=0 ## <- I want to change the 'current/added' fields
>
>
>
> cheers,
>
>
> Marcus
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>
>