[Rd] Comments requested on "changedFiles" function

Dr Gregory Jefferis jefferis at mrc-lmb.cam.ac.uk
Thu Sep 5 18:32:40 CEST 2013


Dear Duncan,

This certainly looks useful. Might you consider adding the ability to 
supply an alternative digest function? Details below.

I often use a homemade "make"-type function that starts by comparing 
modification times, e.g. the one in this private package:

https://github.com/jefferis/nat.utils/blob/master/R/make.r
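
To illustrate the modification-time approach, here is a minimal sketch. It 
is not the code at the link above; the name is_stale and its arguments are 
hypothetical and only show the general idea of rebuilding a target when any 
source is newer.

```r
# Hypothetical helper (not taken from nat.utils): TRUE if `target` is
# missing or is older than at least one of the `sources`.
is_stale <- function(target, sources) {
  # A missing source would give an NA mtime, so fail early instead.
  stopifnot(all(file.exists(sources)))
  if (!file.exists(target)) return(TRUE)
  target_time <- file.info(target)$mtime
  source_times <- file.info(sources)$mtime
  any(source_times > target_time)
}
```

A caller would then regenerate the target only when is_stale() returns 
TRUE, which is cheap but cannot detect a file whose contents changed 
without its timestamp changing.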

For some of my work, I use hash functions. However, because I typically 
work with many large files, I often use a special digest process, e.g. 
reading the CRC checksum embedded in a gzip file directly, or hashing only 
the part of a large file that is (almost) certain to change.
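
As a rough illustration of the second idea, the sketch below hashes only 
the leading bytes of each file. The name partial_md5, the 1e6-byte default, 
and the choice of the start of the file as the region that changes are all 
assumptions made for the example. It copies the chunk to a temporary file 
because tools::md5sum() works on file paths.

```r
# Hypothetical digest function: MD5 of only the first `nbytes` bytes of
# each file. Returns a character vector named by file path, the same
# shape as the result of tools::md5sum().
partial_md5 <- function(files, nbytes = 1e6) {
  vapply(files, function(f) {
    con <- file(f, open = "rb")
    on.exit(close(con))
    chunk <- readBin(con, what = "raw", n = nbytes)
    # Write the chunk to a scratch file so tools::md5sum() can hash it.
    tmp <- tempfile()
    on.exit(unlink(tmp), add = TRUE)
    writeBin(chunk, tmp)
    unname(tools::md5sum(tmp))
  }, character(1))
}
```

Because the result has the same shape as that of tools::md5sum(), a 
function like this could be passed wherever a user-supplied digest function 
is accepted.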

Perhaps (code unchecked) along the lines of:

changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
	file.info = NULL, digest = FALSE, digestfun = NULL,
	full.names = FALSE, ...)

if (digest) {
	# Default to MD5 unless the caller supplied a digest function
	digestfun <- if (is.null(digestfun)) tools::md5sum else match.fun(digestfun)
	info <- data.frame(info, digest = digestfun(fullnames))
}

etc

Or, alternatively, using only one argument:

changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
	file.info = NULL, digest = FALSE, full.names = FALSE, ...)

if (is.logical(digest)) {
	if (digest) digestfun <- tools::md5sum
} else {
	# Assume that digest specifies a function that we want to use
	digestfun <- match.fun(digest)
	digest <- TRUE
}

if (digest)
	info <- data.frame(info, digest = digestfun(fullnames))

etc
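
The one-argument variant can also be isolated into a small helper, which 
makes the dispatch easy to check on its own. This is only a sketch: 
resolve_digest is a hypothetical name, and returning NULL to mean "no 
digest requested" is an assumption of the example rather than part of the 
proposal.

```r
# Hypothetical helper: map the `digest` argument to a digest function,
# or to NULL when no digest is wanted.
resolve_digest <- function(digest = FALSE) {
  if (is.logical(digest)) {
    # TRUE selects the default MD5 digest; FALSE disables digests.
    if (isTRUE(digest)) tools::md5sum else NULL
  } else {
    # Otherwise assume a function, or the name of one.
    match.fun(digest)
  }
}

# Usage: hash a scratch file with the default digest.
f <- tempfile()
writeLines("hello", f)
digestfun <- resolve_digest(TRUE)
if (!is.null(digestfun)) print(digestfun(f))
```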

Many thanks,

Greg.

On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:

> In a number of places internal to R, we need to know which files have 
> changed (e.g. after building a vignette).  I've just written a general 
> purpose function "changedFiles" that I'll probably commit to R-devel.  
> Comments on the design (or bug reports) would be appreciated.
>
> The source for the function and the Rd page for it are inline below.
>
> ----- changedFiles.R:
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"), 
> file.info = NULL,
>   md5sum = FALSE, full.names = FALSE, ...) {
> dosnapshot <- function(args) {
> fullnames <- do.call(list.files, c(full.names = TRUE, args))
> names <- do.call(list.files, c(full.names = full.names, args))
> if (isTRUE(file.info) || (is.character(file.info) && 
> length(file.info))) {
>  info <- file.info(fullnames)
> rownames(info) <- names
>  if (isTRUE(file.info))
>      file.info <- c("size", "isdir", "mode", "mtime")
> } else
>  info <- data.frame(row.names=names)
> if (md5sum)
> info <- data.frame(info, md5sum = tools::md5sum(fullnames))
> list(info = info, timestamp = timestamp, file.info = file.info,
> md5sum = md5sum, full.names = full.names, args = args)


--
Gregory Jefferis, PhD                   Tel: 01223 267048
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 0QH, UK

http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://flybrain.stanford.edu
