[Rd] Decompressing raw vectors in memory

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 2 18:16:29 CEST 2012


On 02/05/2012 16:43, Hadley Wickham wrote:
>>> I'm struggling to decompress a gzip'd raw vector in memory:
>>>
>>> content <- readBin("http://httpbin.org/gzip", "raw", 1000)
>>>
>>> memDecompress(content, type = "gzip")
>>> # Error in memDecompress(content, type = "gzip") :
>>> #  internal error -3 in memDecompress(2)
>>>
>>> I'm reasonably certain that the file is correctly compressed, because
>>> if I save it out to a file, I can read the uncompressed data:
>>>
>>> tmp <- tempfile()
>>> writeBin(content, tmp)
>>> readLines(tmp)
>>>
>>> So that suggests I'm using memDecompress incorrectly.  Any hints?
>>
>> Headers.
>
> Looking at http://tools.ietf.org/html/rfc1952:
>
> * the first two bytes are id1 and id2, which are 1f 8b as expected
>
> * the third byte is the compression: deflate (as.integer(content[3]))
>
> * the fourth byte is the flag
>
>    rawToBits(content[4])
>    [1] 00 00 00 00 00 00 00 00
>
>    which indicates no extra header fields are present
>
> So the header looks ok to me (with my limited knowledge of gzip)
>
> Stripping off the header doesn't seem to help either:
>
> memDecompress(content[-(1:10)], type = "gzip")
> # Error in memDecompress(content[-(1:10)], type = "gzip") :
> #  internal error -3 in memDecompress(2)
>
> I've read the help for memDecompress but I don't see anything there to help me.
>
> Any more hints?

Well, it seems what you get there depends on the client, but I did

tystie% curl -o foo "http://httpbin.org/gzip"
tystie% file foo
foo: gzip compressed data, last modified: Wed May  2 17:06:24 2012, max compression

and the final part ("max compression") worried me: I do not know whether
memDecompress() knows about that format.  The help page does not claim it
can do anything other than decompress the results of memCompress()
(although past experience has shown that it can in some cases).  gzfile()
supports a much wider range of formats.
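
For data that is already in memory, one possible route that avoids the
temporary file is to wrap the raw vector in a rawConnection() and read it
through gzcon(), which assumes standard gzip headers.  The following is only
a sketch under assumptions: gunzip_raw_lines() is a hypothetical helper name
(not part of R), and the payload is built locally with gzfile() as a stand-in
for the bytes fetched from httpbin.org.

```r
# Build a small gzip-format payload locally so the sketch needs no network.
# (Stand-in for the 'content' vector read from the URL in the thread.)
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "w")
writeLines(c("hello", "world"), out)
close(out)
content <- readBin(tmp, "raw", n = file.info(tmp)$size)
unlink(tmp)

# Hypothetical helper: decompress a gzip-format raw vector in memory and
# return its text lines.  gzcon() wraps the raw connection and inflates
# the data as it is read.
gunzip_raw_lines <- function(x) {
  con <- gzcon(rawConnection(x))
  on.exit(close(con))   # closing the gzcon also closes the raw connection
  readLines(con)
}

lines <- gunzip_raw_lines(content)
print(lines)
```

Whether this also copes with whatever a particular HTTP client returns is
untested here; it only illustrates the connection-based approach.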


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


