[Rd] Issue with seek() on gzipped connections in R-devel

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Sep 23 19:13:02 CEST 2011


On Fri, 23 Sep 2011, Jon Clayden wrote:

> Thanks for the replies. I take the point, although it does seem like a
> substantial regression (on non-Windows platforms).
>
> I like to keep the external dependencies of my packages minimal, but I
> will look into the mmap package - thanks, Jeff, for the tip.
>
> Aside from that, though, what is the alternative to using seek? If I
> want to read something at (original, uncompressed) byte offset 352, as
> here, do I have to read and discard everything that comes before it
> first? That seems inelegant at best...

Or uncompress the file.
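
Both options (read and discard, or uncompress first) can be sketched in
a few lines of base R. This is only a rough illustration, not a tested
recipe: the helper names read_gz_at() and gunzip_to_temp() and the
64 KiB chunk size are made up for this example, and only gzfile(),
file(), readBin(), writeBin() and tempfile() from base R are used.

```r
# Hypothetical helper: return `n` bytes starting at uncompressed byte
# offset `offset`, by reading and discarding everything before it.
# No seek() is called on the gzfile connection.
read_gz_at <- function(path, offset, n) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  remaining <- offset
  while (remaining > 0) {
    # Discard the leading bytes in chunks (chunk size is arbitrary).
    skipped <- readBin(con, "raw", n = min(remaining, 65536))
    if (length(skipped) == 0L) stop("offset is beyond the end of the data")
    remaining <- remaining - length(skipped)
  }
  readBin(con, "raw", n = n)
}

# Hypothetical helper: uncompress to a temporary file and return its
# path, so that an ordinary file() connection can be used afterwards.
gunzip_to_temp <- function(path) {
  src <- gzfile(path, "rb")
  on.exit(close(src))
  tmp <- tempfile()
  dst <- file(tmp, "wb")
  on.exit(close(dst), add = TRUE)
  repeat {
    chunk <- readBin(src, "raw", n = 65536)
    if (length(chunk) == 0L) break   # end of compressed stream
    writeBin(chunk, dst)
  }
  tmp
}

# Usage, assuming the file from the original report is in ~/Downloads:
# read_gz_at("~/Downloads/maskedb0_lia.nii.gz", offset = 352, n = 4)
```

The first helper costs one pass over the leading bytes per call, so the
second is probably preferable when many offsets are needed from the same
file.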

>
> Regards,
> Jon
>
>
> On 23 September 2011 16:54, Jeffrey Ryan <jeffrey.ryan at lemnica.com> wrote:
>> seek() in general is a bad idea IMO if you are writing cross-platform code.
>>
>> ?seek
>>
>> Warning:
>>
>>     Use of ‘seek’ on Windows is discouraged.  We have found so many
>>     errors in the Windows implementation of file positioning that
>>     users are advised to use it only at their own risk, and asked not
>>     to waste the R developers' time with bug reports on Windows'
>>     deficiencies.
>>
>> Aside from making me laugh, the above highlights the core reason not to use it, IMO.
>>
>> For files that are not compressed, you can try the mmap package.  ?mmap
>> and ?types are good starting points.  It allows for accessing binary
>> data on disk with very simple R-like semantics, and is very fast.  Not
>> as fast as a sequential read... but fast.  At present this is 'little
>> endian' only, but that describes most of the world today.
>>
>> Best,
>> Jeff
>>
>> On Fri, Sep 23, 2011 at 8:58 AM, Jon Clayden <jon.clayden at gmail.com> wrote:
>>> Dear all,
>>>
>>> In R-devel (2011-09-23 r57050), I'm running into a serious problem
>>> with seek()ing on connections opened with gzfile(). A warning is
>>> generated and the file position does not seek to the requested
>>> location. It doesn't seem to occur all the time - I tried to create a
>>> small example file to illustrate it, but the problem didn't occur.
>>> However, it can be seen with a file I use for testing my packages,
>>> which is available through the URL
>>> <https://github.com/jonclayden/tractor/blob/master/tests/data/nifti/maskedb0_lia.nii.gz?raw=true>:
>>>
>>>> con <- gzfile("~/Downloads/maskedb0_lia.nii.gz","rb")
>>>> seek(con, 352)
>>> [1] 0
>>> Warning message:
>>> In seek.connection(con, 352) :
>>>  seek on a gzfile connection returned an internal error
>>>> seek(con, NA)
>>> [1] 190
>>>
>>> The same commands with the same file work as expected in R 2.13.1, and
>>> have worked over many previous versions of R.
>>>
>>>> con <- gzfile("~/Downloads/maskedb0_lia.nii.gz","rb")
>>>> seek(con, 352)
>>> [1] 0
>>>> seek(con, NA)
>>> [1] 352
>>>
>>> My sessionInfo() output is:
>>>
>>> R Under development (unstable) (2011-09-23 r57050)
>>> Platform: x86_64-apple-darwin11.1.0 (64-bit)
>>>
>>> locale:
>>> [1] en_GB.UTF-8/en_US.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>>
>>> attached base packages:
>>> [1] splines   stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] tractor.nt_2.0.1      tractor.session_2.0.3 tractor.utils_2.0.0
>>> [4] tractor.base_2.0.3    reportr_0.2.0
>>>
>>> This seems to occur whether or not R is compiled with
>>> "--with-system-zlib". I see some zlib-related changes mentioned in the
>>> NEWS, but I don't see any indication that this is expected. Could
>>> anyone shed any light on it, please?
>>>
>>> Thanks and all the best,
>>> Jon
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>>
>> --
>> Jeffrey Ryan
>> jeffrey.ryan at lemnica.com
>>
>> www.lemnica.com
>> www.esotericR.com
>>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

