[Rd] readLines() behaves differently for gzfile connection

Michael Lawrence lawrence.michael at gene.com
Mon May 14 17:45:26 CEST 2018


I haven't been able to reproduce the empty lines issue on my Mac or
Linux laptop, but I have yet to try that container.

The warning is because of a SEEK_SET to -1, which apparently is
unsupported by zlib. Maybe the zlib version in that container is
getting confused. I'm not sure why readLines() wants to seek to -1
instead of 0, but it only does that on non-blocking connections. The
compressed file connections are effectively blocking but are marked as
non-blocking. Marking them as blocking removes the warning. I will get
that into devel and release soon. Hopefully that fixes the empty lines
issue also.
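
For anyone who wants to check their own build without the original file,
here is a rough, self-contained sketch. The file contents and the variable
names (expected, gz_path, seen_warnings, lines_ok, seek_warned) are made up
for illustration, and whether the warning or the empty lines show up will
depend on the R version and the zlib it was built against.

```r
# Build a small gzip-compressed file so the check does not depend on
# 1k_annotation.gz (the contents here are invented for illustration).
expected <- c("#chr\tpos\tref\talt", "1\t100\tA\tG", "1\t200\tC\tT")
gz_path <- tempfile(fileext = ".gz")
out_con <- gzfile(gz_path, "w")
writeLines(expected, out_con)
close(out_con)

# Read it back the same way as in the report, collecting any warnings
# instead of letting them print.
seen_warnings <- character(0)
in_con <- gzfile(gz_path, "r")
lines <- withCallingHandlers(
  readLines(in_con, n = 5),
  warning = function(w) {
    seen_warnings <<- c(seen_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
close(in_con)

# TRUE means the text came back intact; FALSE would match the 3.5.0 symptom.
lines_ok <- identical(lines, expected)
# TRUE means the "seek on a gzfile connection" warning was raised.
seek_warned <- any(grepl("seek on a gzfile connection", seen_warnings))
print(c(lines_ok = lines_ok, seek_warned = seek_warned))
```

If lines_ok comes back FALSE, or seek_warned comes back TRUE, the build is
probably hitting the same seek problem described above.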

Michael

On Thu, May 10, 2018 at 4:21 PM, Ben Heavner <bheavner at gmail.com> wrote:
> You bet - it's available on github at
> https://github.com/UW-GAC/wgsaparsr/blob/master/tests/testthat/1k_annotation.gz
>
> -Ben
>
> On Thu, May 10, 2018 at 4:17 PM, Michael Lawrence
> <lawrence.michael at gene.com> wrote:
>>
>> Would it be possible to get that file or a representative subset of it
>> somewhere so that I can reproduce this?
>>
>> Thanks,
>> Michael
>>
>> On Thu, May 10, 2018 at 3:31 PM, Ben Heavner <bheavner at gmail.com> wrote:
>> > When I read a .gz file with readLines() in 3.4.3, it returns text (and a
>> > warning). In 3.5.0, it gives a warning, but no text. Is this expected
>> > behavior or a bug?
>> >
>> > 3.4.3:
>> >> source_file = "1k_annotation.gz"
>> >> readfile_con <- gzfile(source_file, "r")
>> >> readLines(readfile_con, n = 5)
>> > [1] "#chr\tpos\tref\talt\t
>> >
>> > <truncated output here>
>> >
>> > Warning message:
>> > In readLines(readfile_con, n = 5) :
>> >   seek on a gzfile connection returned an internal error
>> >
>> >> close(readfile_con)
>> >
>> >> sessionInfo()
>> > R version 3.4.3 (2017-11-30)
>> > Platform: x86_64-apple-darwin15.6.0 (64-bit)
>> > Running under: macOS Sierra 10.12.6
>> >
>> > Matrix products: default
>> > BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
>> > LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>> >
>> > locale:
>> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >
>> > loaded via a namespace (and not attached):
>> > [1] compiler_3.4.3
>> >
>> > ---------------------------------------------
>> >
>> > 3.5.0:
>> >> source_file = "1k_annotation.gz"
>> >> readfile_con <- gzfile(source_file, "r")
>> >> readLines(readfile_con, n = 5)
>> > [1] "" "" "" "" ""
>> > Warning message:
>> > In readLines(readfile_con, n = 5) :
>> >   seek on a gzfile connection returned an internal error
>> >> close(readfile_con)
>> >> sessionInfo()
>> > R version 3.5.0 (2018-04-23)
>> > Platform: x86_64-pc-linux-gnu (64-bit)
>> > Running under: Debian GNU/Linux 9 (stretch)
>> >
>> > Matrix products: default
>> > BLAS: /usr/lib/openblas-base/libblas.so.3
>> > LAPACK: /usr/lib/libopenblasp-r0.2.19.so
>> >
>> > locale:
>> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>> >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C
>> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >
>> > loaded via a namespace (and not attached):
>> > [1] compiler_3.5.0
>> >
>> > ----------------------------------------
>> > (note: I'm running 3.5.0 via the docker rocker/tidyverse:3.5 container,
>> > and 3.4.3 on my mac desktop machine)
>> >
>> > Thanks!
>> > Ben Heavner
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>
>