[Bioc-sig-seq] ShortRead, feature request (if not a bug report)

Ivan Gregoretti ivangreg at gmail.com
Tue May 17 23:35:48 CEST 2011


Hello ShortRead connoisseurs,

ShortRead::readAligned is very smart because it allows you to load the
content of a large gzipped file without decompressing it on disk first.
For example:

aln <- readAligned("s_1_export.txt.gz", type="SolexaExport")

However, its analogue, ShortRead::readFasta, complains on my system
that it cannot handle gzipped targets:

fas <- readFasta("s_1.fa.gz")
Error in .normargInputFilepath(filepath) :
  file "s_1.fa.gz" has unsupported type: gzfile


Currently the only workaround seems to be:

system("gunzip -f s_1.fa.gz")
fas <- readFasta("s_1.fa")
system("gzip -9f s_1.fa")

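but this is highly inefficient, especially with large files: the whole
file is decompressed on disk and then recompressed at maximum level
(gzip -9) after every read.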
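
A possible stopgap, sketched with base R connections only, is to
decompress into a temporary file so that the original .gz is never
touched and the recompression step goes away. The helper name
gunzip_to_temp and its chunk_lines argument are hypothetical (they are
not part of ShortRead), and the sketch assumes that readFasta accepts a
plain uncompressed path, as in the workaround above. It still writes an
uncompressed copy to disk, so it is not a substitute for native gzip
support.

```r
# Hypothetical helper (not part of ShortRead): decompress a .gz file into
# a temporary file using base R connections, leaving the archive intact.
gunzip_to_temp <- function(gz_path, chunk_lines = 100000L) {
    out_path <- tempfile(fileext = ".fa")

    con_in <- gzfile(gz_path, open = "rt")
    on.exit(close(con_in), add = TRUE)
    con_out <- file(out_path, open = "wt")
    on.exit(close(con_out), add = TRUE)

    # Copy in chunks so the whole file is never held in memory at once.
    repeat {
        lines <- readLines(con_in, n = chunk_lines)
        if (length(lines) == 0L) break
        writeLines(lines, con_out)
    }
    out_path
}

# Intended use (requires ShortRead, so it is left commented out here):
# fa_path <- gunzip_to_temp("s_1.fa.gz")
# fas <- readFasta(fa_path)
# unlink(fa_path)   # remove the temporary uncompressed copy
```

Compared with the gunzip/gzip round trip, this reads the compressed
file once and never rewrites it.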

Please consider adding the missing functionality just like in readAligned.

In case it is a bug in my ShortRead version, see my session below.

Thank you,

Ivan

> sessionInfo()
R version 2.14.0 Under development (unstable) (2011-04-14 r55450)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] annotate_1.31.0      AnnotationDbi_1.15.1 Biobase_2.13.1
[4] ShortRead_1.11.1     Rsamtools_1.5.9      lattice_0.19-26
[7] Biostrings_2.21.1    GenomicRanges_1.5.0  IRanges_1.11.1

loaded via a namespace (and not attached):
[1] DBI_0.2-5     grid_2.14.0   hwriter_1.3   RSQLite_0.9-4 tools_2.14.0
[6] xtable_1.5-6


