[Rd] possible internal (un)tar bug
Martin Maechler
maechler at stat.math.ethz.ch
Tue May 1 18:45:00 CEST 2018
TL;DR: Use gzfile(), not file(), and you will have no problems.
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Tue, 1 May 2018 16:39:57 +0200 writes:
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Tue, 1 May 2018 16:14:43 +0200 writes:
>>>>> Gábor Csárdi <csardi.gabor at gmail.com>
>>>>> on Tue, 1 May 2018 12:05:32 +0000 writes:
>>> This is a not too old R-devel on Linux, it already fails
>>> in R 3.4.4, and on macOS as well.
>> and fails in considerably older R versions, too.
>> Basically untar() seems to fail on a connection, but works
>> fine on a plain file name.
> Well, there's an easy workaround: if you want to pass a
> connection (instead of a simple filename) to untar() and the
> tar file is compressed (as in the example), you
> can currently do that easily by ensuring that the connection is
> a "gzcon" one:
> ##=========> Workaround for now:
> ## Create :
> setwd(tempdir()) ; dir.create("pkg")
> cat("this: that\n", file = file.path("pkg", "DESCRIPTION"))
> tf <- "pkg_1.0.tar.gz"
> tar(tf, "pkg", compression = "gzip", tar = "internal")
> unlink("pkg", recursive = TRUE)
> ## As it is a compressed tar file, use it via a gzcon() connection,
> ## and both cases work fine:
> con <- gzcon(file(tf, open = "rb")) ; (f <- untar(con, list = TRUE))
> ## ~~~~~
> con <- gzcon(file(tf, open = "rb")) ; untar(con, files = f)
> stopifnot(identical(f, "pkg/DESCRIPTION"),
> file.exists(f))
> unlink(c(tf,"pkg"), recursive = TRUE) # clean after me
Actually, gzfile(....) is much better than gzcon(file(....)):
it works for all compression types that are supported by
tar(), not just for gzip compression.
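
To illustrate the point about gzfile(), here is a minimal sketch (not
part of any proposed patch) that builds the same toy archive with each
compression setting of tar() and lists it through a gzfile() connection.
The file names and the variables `ext` and `listed` are made up for
illustration, and the sketch assumes that untar() leaves a connection
supplied by the caller open, so it closes the connection itself.

```r
## Work in a scratch directory and create a one-file "package"
old <- setwd(tempdir())
dir.create("pkg")
cat("this: that\n", file = file.path("pkg", "DESCRIPTION"))

## compression type -> file extension (names chosen for this sketch only)
ext <- c(none = "tar", gzip = "tar.gz", bzip2 = "tar.bz2", xz = "tar.xz")
listed <- list()
for (cmp in names(ext)) {
    tf <- paste0("pkg_1.0.", ext[[cmp]])
    tar(tf, "pkg", compression = cmp, tar = "internal")
    con <- gzfile(tf, open = "rb")       # same call for every compression type
    listed[[cmp]] <- untar(con, list = TRUE)
    close(con)                           # assumption: untar() did not close it
    unlink(tf)
}

## clean up and restore the working directory
unlink("pkg", recursive = TRUE)
setwd(old)
```

Each element of `listed` should then contain "pkg/DESCRIPTION",
whichever compression was used to create the archive.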
In the end, I'd conclude for now that the bug is mostly in the
documentation and the unhelpful error message.
We could try to "fix" your use case by wrapping the connection
in gzcon(.), which is also okay for uncompressed tar
files. However, that fails for the newer compression schemes
(bzip2, xz), which are all supported via gzfile().
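
As a rough check of that trade-off, the following sketch wraps a plain
file() connection in gzcon() for an uncompressed, a gzip-compressed and
an xz-compressed copy of the same toy archive. The helper `try_list()`
and the variables `ext`, `res` and `ok` are hypothetical names introduced
only for this illustration; the helper returns the error condition
instead of stopping, so the three cases can be compared side by side.

```r
## Work in a scratch directory and create a one-file "package"
old <- setwd(tempdir())
dir.create("pkg")
cat("this: that\n", file = file.path("pkg", "DESCRIPTION"))

## Hypothetical helper: list a tarball via a connection, returning the
## condition object (rather than stopping) if untar() signals an error.
try_list <- function(con) {
    on.exit(close(con))
    tryCatch(untar(con, list = TRUE), error = function(e) e)
}

## compression type -> file extension (names chosen for this sketch only)
ext <- c(none = "tar", gzip = "tar.gz", xz = "tar.xz")
res <- list()
for (cmp in names(ext)) {
    tf <- paste0("pkg_1.0.", ext[[cmp]])
    tar(tf, "pkg", compression = cmp, tar = "internal")
    res[[cmp]] <- try_list(gzcon(file(tf, open = "rb")))
    unlink(tf)
}

## clean up and restore the working directory
unlink("pkg", recursive = TRUE)
setwd(old)

## TRUE where listing succeeded, FALSE where an error was caught
ok <- vapply(res, is.character, logical(1))
```

For this small archive one would expect `ok` to be TRUE for "none" and
"gzip" and FALSE for "xz", since gzcon() only decompresses gzip streams
and passes other data through unchanged.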
I propose to commit the following change:
1) change the documentation of untar() to say that a connection
to a compressed tar file should be created by gzfile().
2) in the case of a connection which gave the "block error",
the error message would become more helpful, mentioning gzfile().
Currently:
> con <- file(tf, open = "rb"); try( untar(con, list = TRUE) ) ## -> Error
Error in untar2(tarfile, files, list, exdir, restore_times) :
incomplete block: rather use gzfile(.) created connection?
>
Feedback (from anyone)?
Martin