[Rd] Comments requested on "changedFiles" function
Dr Gregory Jefferis
jefferis at mrc-lmb.cam.ac.uk
Thu Sep 5 18:32:40 CEST 2013
Dear Duncan,
This certainly looks useful. Might you consider adding the ability to
supply an alternative digest function? Details below.
I often use a homemade "make"-type function which starts by looking at
modification times, e.g. in a private package:
https://github.com/jefferis/nat.utils/blob/master/R/make.r
For some of my work, I use hash functions. However, because I typically
work with many large files, I often use a special digest process, e.g.
using the CRC checksum embedded in a gzip file directly, or hashing only
the part of a large file that is (almost) certain to change.
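To make the "hash only part of the file" idea concrete, here is a minimal sketch of such a digest function. The name head_md5 and the nbytes argument are hypothetical, and it assumes that the part most likely to change is the start of the file (e.g. a header); it is not taken from any existing package. It accepts a vector of paths and returns a character vector named by those paths, the same shape that tools::md5sum returns.

```r
# Hypothetical helper: MD5 of only the first `nbytes` bytes of each file.
# Assumption: the leading bytes (e.g. a header) are what changes.
head_md5 <- function(files, nbytes = 1024L) {
    vapply(files, function(f) {
        # Read at most `nbytes` raw bytes from the start of the file
        bytes <- readBin(f, what = "raw", n = nbytes)
        # tools::md5sum works on files, so hash a temporary copy of the bytes
        tmp <- tempfile()
        on.exit(unlink(tmp))
        writeBin(bytes, tmp)
        unname(tools::md5sum(tmp))
    }, character(1))
}
```

Because only a fixed number of bytes is read per file, the cost no longer grows with file size, at the price of missing changes outside the hashed region.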
Perhaps (code unchecked) along the lines of:
changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                         file.info = NULL,
                         digest = FALSE, digestfun = NULL,
                         full.names = FALSE, ...) {
    # ...
    if (digest) {
        # Default to MD5 unless the caller supplies a digest function
        if (is.null(digestfun)) digestfun <- tools::md5sum
        else digestfun <- match.fun(digestfun)
        info <- data.frame(info, digest = digestfun(fullnames))
    }
    # etc
Or, alternatively, using only one argument:
changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                         file.info = NULL,
                         digest = FALSE, full.names = FALSE, ...) {
    # ...
    if (is.logical(digest)) {
        # TRUE selects the default MD5 digest; FALSE means no digest
        if (digest) digestfun <- tools::md5sum
    } else {
        # Assume that digest specifies a function that we want to use
        digestfun <- match.fun(digest)
        digest <- TRUE
    }
    if (digest)
        info <- data.frame(info, digest = digestfun(fullnames))
    # etc
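The one-argument dispatch above can be tried in isolation. The following is only a sketch: resolve_digest is a hypothetical helper name, not part of the proposed changedFiles, and it returns NULL where the code above would simply skip the digest step.

```r
# Hypothetical helper: turn a single `digest` argument into a digest
# function (or NULL when no digest is wanted).
resolve_digest <- function(digest = FALSE) {
    if (is.logical(digest)) {
        # TRUE selects the default MD5 digest; FALSE means no digest
        if (isTRUE(digest)) tools::md5sum else NULL
    } else {
        # Accept either a function or the name of one
        match.fun(digest)
    }
}

# Usage: a custom digest that reports file sizes instead of hashes
size_digest <- function(files) file.info(files)$size
digestfun <- resolve_digest(size_digest)
```

Using match.fun means the caller can pass either a function object or its name as a string, which matches how arguments such as FUN are handled elsewhere in R.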
Many thanks,
Greg.
On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:
> In a number of places internal to R, we need to know which files have
> changed (e.g. after building a vignette). I've just written a general
> purpose function "changedFiles" that I'll probably commit to R-devel.
> Comments on the design (or bug reports) would be appreciated.
>
> The source for the function and the Rd page for it are inline below.
>
> ----- changedFiles.R:
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
> file.info = NULL,
> md5sum = FALSE, full.names = FALSE, ...) {
> dosnapshot <- function(args) {
> fullnames <- do.call(list.files, c(full.names = TRUE, args))
> names <- do.call(list.files, c(full.names = full.names, args))
> if (isTRUE(file.info) || (is.character(file.info) &&
> length(file.info))) {
> info <- file.info(fullnames)
> rownames(info) <- names
> if (isTRUE(file.info))
> file.info <- c("size", "isdir", "mode", "mtime")
> } else
> info <- data.frame(row.names=names)
> if (md5sum)
> info <- data.frame(info, md5sum = tools::md5sum(fullnames))
> list(info = info, timestamp = timestamp, file.info = file.info,
> md5sum = md5sum, full.names = full.names, args = args)
--
Gregory Jefferis, PhD Tel: 01223 267048
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 OQH, UK
http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://flybrain.stanford.edu