[Bioc-devel] Append/combine option for filterFastq and similar?
Martin Morgan
mtmorgan at fredhutch.org
Wed Apr 22 19:40:30 CEST 2015
On 04/22/2015 10:28 AM, Jim Hester wrote:
> I typically use pipe() in these circumstances which avoids using any
> additional storage
>
> readLines(pipe("cat file1 file2"))
>
> It should work with filterFastq assuming it can read from connections
> rather than just files, but I have not tested it to be sure.
These solutions don't work on Windows or with compressed files (though zcat
*.fastq.gz | gzip > out.fastq.gz would, I guess) and don't filter reads (I guess
that's what Ryan means by 'duplicating storage', i.e., concatenating and then
filtering in two separate steps).
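For the portability point, a minimal base-R sketch of a cat-like step that also runs on Windows and reads gzipped input. The helper name catFastq and its nlines argument are made up for illustration, and it only concatenates; it does not filter.

```r
## Hypothetical helper (not part of any package): concatenate several
## FASTQ files, gzipped or not, into a single gzipped output file.
catFastq <- function(inputs, output, nlines = 400000L) {
    out <- gzfile(output, "w")
    on.exit(close(out))
    for (f in inputs) {
        con <- gzfile(f, "r")   # gzfile() also reads uncompressed files
        ## copy in chunks so that memory use stays bounded
        while (length(lines <- readLines(con, n = nlines)) > 0L)
            writeLines(lines, out)
        close(con)
    }
    invisible(output)
}
```

Usage would be something like catFastq(c("file1.fastq", "file2.fastq.gz"), "all.fastq.gz"), where the file names are placeholders.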
filterFastq expects character vectors of file names rather than connections
(at least for input), but accepting connections is, I think, straightforward
(the underlying FastqStreamer works on connections), so I'll update that...
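In the meantime, filtering and combining can be done in one pass by streaming each input and appending to a single output. This is a rough sketch, not how filterFastq itself is implemented; the function name combineFiltered and the choice of nFilter() as the example filter are assumptions for illustration.

```r
library(ShortRead)

## Hypothetical helper: stream each input file, filter each chunk, and
## append the surviving reads to one output file.
combineFiltered <- function(inputs, output, filter = nFilter(), n = 1e6) {
    for (f in inputs) {
        strm <- FastqStreamer(f, n = n)        # yields up to n records per chunk
        while (length(fq <- yield(strm))) {
            fq <- fq[filter(fq)]               # default filter drops reads containing N
            writeFastq(fq, output, mode = "a") # append rather than overwrite
        }
        close(strm)
    }
    invisible(output)
}
```

Because mode = "a" appends, the output file should not already exist from an earlier run, or the new reads are added to the old ones.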
I think filterFastq should be relatively efficient in both space and time,
though obviously cat and friends are highly optimized and use minimal memory.
Martin
>
> On Wed, Apr 22, 2015 at 1:16 PM, Ryan C. Thompson <rct at thompsonclan.org>
> wrote:
>
>> That's not ideal because it's duplicating storage unnecessarily
>>
>>
>> On 04/22/2015 04:07 AM, Aedin wrote:
>>
>>> This is one instance where a system or simple unix command is very easy
>>>
>>> system('cat *.fastq > all.fastq')
>>>
>>>
>>> ---
>>>
>>> On Apr 22, 2015, at 6:00, bioc-devel-request at r-project.org wrote:
>>>>
>>>> Re: Append/combine option for filterFastq and similar?
>>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793