[Rd] Decompressing raw vectors in memory
Hadley Wickham
hadley at rice.edu
Wed May 2 17:43:22 CEST 2012
>> I'm struggling to decompress a gzip'd raw vector in memory:
>>
>> content <- readBin("http://httpbin.org/gzip", "raw", 1000)
>>
>> memDecompress(content, type = "gzip")
>> # Error in memDecompress(content, type = "gzip") :
>> # internal error -3 in memDecompress(2)
>>
>> I'm reasonably certain that the file is correctly compressed, because
>> if I save it out to a file, I can read the uncompressed data:
>>
>> tmp <- tempfile()
>> writeBin(content, tmp)
>> readLines(tmp)
>>
>> So that suggests I'm using memDecompress incorrectly. Any hints?
>
> Headers.

Looking at http://tools.ietf.org/html/rfc1952:

* the first two bytes are ID1 and ID2, which are 1f 8b as expected
* the third byte is the compression method, which is deflate
  (as.integer(content[3]))
* the fourth byte is the flags byte:

  rawToBits(content[4])
  [1] 00 00 00 00 00 00 00 00

  All bits are zero, which indicates that no optional header fields are
  present.

So the header looks OK to me (with my limited knowledge of gzip).
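
For anyone wanting to repeat these checks, here is a minimal sketch of the same header inspection in R. It assumes only the fixed 10-byte header layout from RFC 1952. gzip_header() is a hypothetical helper (not part of base R), and the sample bytes are generated locally with gzfile() rather than fetched from httpbin.

```r
# Build a small gzip file locally so the example needs no network access.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "wb")
writeLines("hello gzip", con)
close(con)
content <- readBin(tmp, "raw", file.size(tmp))

# Hypothetical helper: pick apart the fixed part of the gzip header
# (RFC 1952, section 2.3). Not a base R function.
gzip_header <- function(x) {
  stopifnot(is.raw(x), length(x) >= 10)
  flg <- as.integer(x[4])
  list(
    magic_ok = identical(x[1:2], as.raw(c(0x1f, 0x8b))),  # ID1, ID2
    method   = as.integer(x[3]),                          # 8 means deflate
    ftext    = bitwAnd(flg, 1L)  != 0,                    # FLG bit 0
    fhcrc    = bitwAnd(flg, 2L)  != 0,                    # FLG bit 1
    fextra   = bitwAnd(flg, 4L)  != 0,                    # FLG bit 2
    fname    = bitwAnd(flg, 8L)  != 0,                    # FLG bit 3
    fcomment = bitwAnd(flg, 16L) != 0                     # FLG bit 4
  )
}

hdr <- gzip_header(content)
str(hdr)
```

Applied to the downloaded content, all five flags should come back FALSE if the flags byte is zero, as it is above.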

Stripping off the header doesn't seem to help either:

  memDecompress(content[-(1:10)], type = "gzip")
  # Error in memDecompress(content[-(1:10)], type = "gzip") :
  # internal error -3 in memDecompress(2)
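
A possible alternative that sidesteps memDecompress, offered as a sketch only and not as an explanation of the error above: wrap the raw vector in rawConnection() and let gzcon() deal with the gzip header and trailer. Both are base R functions. The sample bytes below are generated locally with gzfile() as a stand-in for content.

```r
# Build gzip-format bytes locally (a stand-in for the downloaded `content`).
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "wb")
writeLines(c("line one", "line two"), con)
close(con)
content <- readBin(tmp, "raw", file.size(tmp))

# Wrap the raw vector in a connection and let gzcon() handle the gzip framing.
zcon <- gzcon(rawConnection(content))
lines <- readLines(zcon)
close(zcon)  # closing the wrapper also closes the underlying raw connection

print(lines)
```

Note also that, per RFC 1952, a gzip member ends with an 8-byte trailer (CRC32 and ISIZE), so content[-(1:10)] is still not a bare deflate stream.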

I've read the help for memDecompress but I don't see anything there to help me.
Any more hints?

Thanks!

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/