[Rd] tools::md5sum(directory) behavior different on Windows vs. Unix
Scott Kostyshak
skostysh at princeton.edu
Sun Sep 29 10:16:10 CEST 2013
On Mon, Sep 9, 2013 at 3:00 AM, Scott Kostyshak <skostysh at princeton.edu> wrote:
> tools::md5sum gives a warning if it receives a directory as an
> argument on Unix but not on Windows.
>
> From what I understand, this happens because on Windows a directory is
> not treated as a file, so fopen returns NULL and NA is returned
> without a warning. On Unix, a directory is treated as a file, so fopen
> does not return NULL; md5 is then run and fails, leading to a warning.
>
> This is a good opportunity for me to understand further (in addition
> to [1] and the many places where OS special cases are mentioned) in
> which cases R tries to behave the same on Windows as on Unix and in
> which cases it allows for differences (in this case, a warning vs. no
> warning). For example, it would be straightforward to create a patch
> that would lead to the same behavior in this case. tools::md5sum could
> either issue a warning for each argument that is a directory or it
> could issue no warning (consistent with file.info). Would either patch
> be considered?
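
The difference described above can be checked with a minimal sketch like the following, assuming only base R and the tools package. The helper name md5sum_with_warnings is hypothetical (it is not part of R); it collects any warnings instead of printing them, so the output can be compared across platforms.

```r
# Hypothetical helper (not part of R): run tools::md5sum() and collect
# any warning messages instead of letting them print.
md5sum_with_warnings <- function(paths) {
    msgs <- character()
    value <- withCallingHandlers(
        tools::md5sum(paths),
        warning = function(w) {
            msgs <<- c(msgs, conditionMessage(w))
            invokeRestart("muffleWarning")
        }
    )
    list(value = value, warnings = msgs)
}

# Create a temporary directory and pass it where a file is expected.
d <- tempfile("md5dir")
dir.create(d)
res <- md5sum_with_warnings(d)
res$value      # the checksum slot for the directory
res$warnings   # per the report: a warning on Unix, none on Windows
unlink(d, recursive = TRUE)
```

Whether res$warnings is empty depends on the platform and the R version, which is the inconsistency in question.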
Attached is a patch that gives a warning if an element of the files
argument is not a regular file (e.g. it is a directory or does not
exist). In my opinion, the advantages of this patch are:
(1) the same warning is generated on all platforms when one of the
elements is a directory.
(2) a warning is also given if a file does not exist.
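
The intended behavior can be sketched outside the R sources as a wrapper around the existing function. This is only an illustration: the name md5sum_checked is hypothetical, and the attached patch modifies tools::md5sum itself rather than wrapping it.

```r
# Hypothetical wrapper illustrating the proposed behavior; the attached
# patch changes tools::md5sum directly instead.
md5sum_checked <- function(files) {
    # TRUE only for existing regular files (FALSE for directories
    # and for paths that do not exist).
    is_reg <- utils::file_test("-f", files)
    if (!all(is_reg))
        warning("The following are not regular files: ",
                paste(shQuote(files[!is_reg]), collapse = " "))
    out <- rep(NA_character_, length(files))
    names(out) <- files
    if (any(is_reg))
        out[is_reg] <- tools::md5sum(files[is_reg])
    out
}

# Usage: one regular file, one directory, one non-existent path.
f <- tempfile("md5file"); writeLines("hello", f)
d <- tempfile("md5dir");  dir.create(d)
gone <- tempfile("md5gone")
res <- suppressWarnings(md5sum_checked(c(f, d, gone)))
```

With this sketch the directory and the non-existent path both yield NA and are named in a single warning, on any platform.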
Comments?
Scott
>
> Or is this difference encouraged because the concept of a file is
> different on Unix than on Windows?
>
> Scott
>
> [1] http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University
-------------- next part --------------
Index: trunk/src/library/tools/R/md5.R
===================================================================
--- trunk/src/library/tools/R/md5.R (revision 64011)
+++ trunk/src/library/tools/R/md5.R (working copy)
@@ -17,7 +17,18 @@
 # http://www.r-project.org/Licenses/
 
 md5sum <- function(files)
-    structure(.Call(Rmd5, files), names=files)
+{
+    reg_ <- file_test("-f", files)
+    regFiles <- files[reg_]
+    notReg <- files[!reg_]
+    if(!all(reg_))
+        warning("The following are not regular files: ",
+                paste(shQuote(notReg), collapse = " "))
+    names(files) <- files
+    files[!reg_] <- NA
+    files[reg_] <- .Call(Rmd5, regFiles)
+    files
+}
 
 .installMD5sums <- function(pkgDir, outDir = pkgDir)
 {
Index: trunk/src/library/tools/man/md5sum.Rd
===================================================================
--- trunk/src/library/tools/man/md5sum.Rd (revision 64011)
+++ trunk/src/library/tools/man/md5sum.Rd (working copy)
@@ -18,7 +18,8 @@
 \value{
   A character vector of the same length as \code{files}, with names
   equal to \code{files}. The elements
-  will be \code{NA} for non-existent or unreadable files, otherwise
+  will be \code{NA} for non-existent or unreadable files (in which case
+  a warning will be generated), otherwise
   a 32-character string of hexadecimal digits.
 
   On Windows all files are read in binary mode (as the \code{md5sum}