[Rd] Decompressing raw vectors in memory
Hadley Wickham
hadley at rice.edu
Wed May 2 18:27:04 CEST 2012
> Well, it seems what you get there depends on the client, but I did
>
> tystie% curl -o foo "http://httpbin.org/gzip"
> tystie% file foo
> foo: gzip compressed data, last modified: Wed May 2 17:06:24 2012, max
> compression
>
> and the final part worried me: I do not know if memDecompress() knows about
> that format. The help page does not claim it can do anything other than
> de-compress the results of memCompress() (although past experience has shown
> that it can in some cases). gzfile() supports a much wider range of
> formats.
Ah, ok. Thanks. Then in that case it's probably just as easy to save
it to a temp file and read that.
con <- gzfile(tmp) # gzfile() detects the compression format in any read mode
open(con, "rb")
on.exit(close(con), add = TRUE)
readBin(con, raw(), file.info(tmp)$size * 10)
The only challenge is figuring out what n to give readBin. Is there a
good general strategy for this? Guess based on the file size and then
iterate until readBin returns fewer than n bytes?
n <- file.info(tmp)$size * 2
content <- readBin(con, raw(), n)
n_read <- length(content)
while (n_read == n) {
  more <- readBin(con, raw(), n)
  content <- c(content, more)
  n_read <- length(more)
}
Which is not great style, but there shouldn't be many reads.
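The same idea can be sketched as a small helper. This is only a sketch under stated assumptions: read_all_raw and chunk_size are hypothetical names that do not appear in the thread, it reads fixed-size chunks rather than guessing from the file size, and it opens the file with gzfile() so that the bytes are decompressed even in binary mode.

```r
# Hypothetical helper (not from the thread): read every decompressed byte
# from a possibly compressed file without knowing the final size up front.
read_all_raw <- function(path, chunk_size = 64 * 1024) {
  con <- gzfile(path, open = "rb")  # gzfile() also reads uncompressed files
  on.exit(close(con), add = TRUE)

  chunks <- list()
  repeat {
    chunk <- readBin(con, what = raw(), n = chunk_size)
    if (length(chunk) == 0L) break  # nothing left to read
    chunks[[length(chunks) + 1L]] <- chunk
  }

  # Concatenate once at the end instead of growing a vector on every read
  if (length(chunks) == 0L) raw(0) else do.call(c, chunks)
}

# Usage: write a small gzip file, then read it back decompressed
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "wb")
writeBin(charToRaw(strrep("hello world\n", 1000)), out)
close(out)

content <- read_all_raw(tmp)
length(content)
```

Collecting the chunks in a list and concatenating once avoids repeatedly copying content, and stopping on a zero-length read means the initial guess for n no longer matters.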
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/