[Rd] tar R command
Henrik Bengtsson
hb at biostat.ucsf.edu
Mon Nov 29 05:35:47 CET 2010
First, if you look carefully, then you see that argument 'files'
should specify *filepaths*, i.e. directories and not specific files.
Thus, if you for instance place your files in directory "foo/" and
then call
tar("foo.tar", files="foo/");
you would do the right thing.
HOWEVER, looking at the internals of base::tar(), it seems to be
designed for a non-Windows platform, i.e. it will not work on Windows
as it stands (more below). A workaround that also illustrates the
problems is the following patch:
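# PATCH for file.info() such that tar() works on Windows
# Copy tar() into the global environment so that it picks up the
# file.info() defined below instead of base::file.info().
tar <- utils::tar; environment(tar) <- globalenv();
file.info <- function(...) {
  fi <- base::file.info(...);
  # Add any of the columns that are missing on this platform, filled with NA
  fi[setdiff(c("uid", "gid", "uname", "grname"), names(fi))] <- NA;
  fi;
} # file.info()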
Example:
dir.create("foo/");
cat(file="foo/foo.txt", rep(letters, times=100));
tar("foo.tar", files="foo/");
str(file.info("foo.tar"));
'data.frame': 1 obs. of 11 variables:
$ size : num 7680
$ isdir : logi FALSE
$ mode :Class 'octmode' int 438
$ mtime : POSIXct, format: "2010-11-28 20:24:05"
$ ctime : POSIXct, format: "2010-11-28 20:03:56"
$ atime : POSIXct, format: "2010-11-28 20:07:40"
$ exe : chr "no"
$ uid : logi NA
$ gid : logi NA
$ uname : logi NA
$ grname: logi NA
This seems to generate a valid foo.tar file.
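One way to check that claim, as a minimal sketch, is to list the archive's entries with utils::untar(list = TRUE) rather than extracting them. The temporary directory and the variable names are assumptions for illustration, and the sketch assumes a platform (or a patched session) where tar() itself works.

```r
# Build a small archive in a temporary directory, then list its entries.
old <- setwd(tempdir())
dir.create("foo", showWarnings = FALSE)
cat(file = "foo/foo.txt", rep(letters, times = 100))
utils::tar("foo.tar", files = "foo")

# list = TRUE returns the entry names without extracting anything.
entries <- utils::untar("foo.tar", list = TRUE)
setwd(old)
print(entries)
```

If foo/foo.txt shows up among the entries, the archive is at least readable by untar().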
PROBLEMS:
Here are a few problems I have identified with tar().
PROBLEM #1:
The default for argument files=NULL is documented "to archive all
files under the current directory". In reality it gives:
Error in list.files(files, recursive = TRUE, all.files = TRUE,
full.names = TRUE) : invalid 'directory' argument
because list.files(NULL) is invalid. The default should instead be files=".".
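A minimal sketch of that suggested default, written as a wrapper rather than as a change to tar() itself. The name tar_default() is hypothetical and not part of R.

```r
# Hypothetical wrapper; 'tar_default' is not part of R.
tar_default <- function(tarfile, files = NULL, ...) {
  # Fall back to the current directory when no paths are given,
  # which is what the documentation describes for files = NULL.
  if (is.null(files)) files <- "."
  utils::tar(tarfile, files = files, ...)
}
```

Note that the tar file should then be written somewhere outside the current directory; otherwise the archive may end up trying to include itself.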
PROBLEM #2:
If a non-existing path is passed (argument 'files'), then tar()
generates an invalid *.tar file of size 1024 bytes (not empty, as the
OP says). It would be better to assert that each of the requested
paths really exists and is a directory, e.g. using
file.info()$isdir.
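Such an assertion could look roughly like the sketch below. The helper name assert_dirs() is hypothetical and not part of R; it relies only on file.info() returning NA in the 'isdir' column for paths that do not exist.

```r
# Hypothetical helper; 'assert_dirs' is not part of R.
assert_dirs <- function(paths) {
  # 'isdir' is NA for non-existing paths and FALSE for regular files.
  isdir <- file.info(paths)$isdir
  bad <- paths[is.na(isdir) | !isdir]
  if (length(bad) > 0L) {
    stop("Not an existing directory: ", paste(sQuote(bad), collapse = ", "))
  }
  invisible(paths)
}
```

Calling assert_dirs(files) before building the archive would turn the silent 1024-byte file into an error.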
PROBLEM #3:
tar() assumes that file.info() returns a data.frame with fields 'uid',
'gid' and 'uname' (the patch above also fills in 'grname'). That is not
the case for file.info() on Windows.
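To see which of these columns are missing on a given platform, something like the following should do. It is only a sketch; the variable names are mine, and the four column names are the ones the patch earlier in this message fills in.

```r
# Columns that the patch earlier in this message treats as required.
needed <- c("uid", "gid", "uname", "grname")

# Names that file.info() does not report on this platform
# (an empty vector means all of them are present).
missing_cols <- setdiff(needed, names(file.info(".")))
print(missing_cols)
```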
> sessionInfo()
R version 2.12.0 Patched (2010-11-24 r53656)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
My $0.20
/Henrik
On Sun, Nov 28, 2010 at 7:00 PM, Dario Strbenac
<D.Strbenac at garvan.org.au> wrote:
> Hello,
>
> The documentation for the tar command leads me to think there is an internal implementation when the command can't be found in the OS.
>
> However, it doesn't seem to be the case, as I get an empty .tar file generated on a small example I made :
>
>> dir(pattern = "jpg")
> [1] "MA56237502_635.jpg"
>> file.info("MA56237502_635.jpg")
>                      size isdir mode               mtime               ctime               atime exe
> MA56237502_635.jpg 229831 FALSE  666 2010-11-29 13:05:49 2010-11-29 13:00:36 2010-11-29 13:00:36  no
>> tar("example.tar", files = dir(pattern = "jpg"))
>> file.info("example.tar")
>             size isdir mode               mtime               ctime               atime exe
> example.tar 1024 FALSE  666 2010-11-29 13:43:29 2010-11-29 13:42:30 2010-11-29 13:42:30  no
>
> Is this an unimplemented feature ?
>
>> sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
> ... ... ...
>
> Thanks,
> Dario.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>