[Rd] Comments requested on "changedFiles" function
    Dr Gregory Jefferis 
    jefferis at mrc-lmb.cam.ac.uk
       
    Thu Sep  5 18:32:40 CEST 2013
    
    
  
Dear Duncan,
This certainly looks useful. Might you consider adding the ability to 
supply an alternative digest function? Details below.
I often use a homemade "make" type function which starts by looking at 
modification times e.g. in a private package
https://github.com/jefferis/nat.utils/blob/master/R/make.r
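A minimal sketch of the modification-time check that a "make"-type function starts from. The helper name `needs_update` is hypothetical and is not taken from the linked package; it assumes all source files exist.

```r
# Hypothetical helper: TRUE if `target` is missing or older than any source.
needs_update <- function(target, sources) {
    if (!file.exists(target)) return(TRUE)
    target_time <- file.info(target)$mtime
    source_times <- file.info(sources)$mtime
    # Rebuild when at least one source was modified after the target
    any(source_times > target_time)
}
```

A caller would typically regenerate `target` only when `needs_update(target, sources)` is TRUE.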
For some of my work I use hash functions instead. However, because I
typically work with many large files, I often use a special digest
process, e.g. reading the CRC checksum embedded in a gzip file directly,
or hashing only the part of a large file that is (almost) certain to
change.
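As an illustration of such a special digest process, here is a minimal sketch that hashes only the leading bytes of each file. The name `partial_md5` and the `nbytes` parameter are hypothetical, and the sketch assumes the part that changes sits at the start of the file. It returns a named character vector, the same shape as `tools::md5sum`.

```r
# Hypothetical digest function: hash only the first `nbytes` bytes of each file.
partial_md5 <- function(files, nbytes = 4096L) {
    vapply(files, function(f) {
        # Read at most `nbytes` raw bytes from the start of the file
        head_bytes <- readBin(f, what = "raw", n = nbytes)
        # tools::md5sum works on files, so hash a temporary copy of the prefix
        tmp <- tempfile("partial")
        on.exit(unlink(tmp))
        writeBin(head_bytes, tmp)
        unname(tools::md5sum(tmp))
    }, character(1))
}
```

Two files that share their first `nbytes` bytes get the same digest under this scheme, so it only suits formats where a change is reliably reflected in that region.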
Perhaps (code unchecked) along the lines of:
changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                         file.info = NULL, digest = FALSE, digestfun = NULL,
                         full.names = FALSE, ...) {
    # ...
    if (digest) {
        # Fall back to MD5 unless the caller supplies a digest function
        if (is.null(digestfun)) digestfun <- tools::md5sum
        else digestfun <- match.fun(digestfun)
        info <- data.frame(info, digest = digestfun(fullnames))
    }
    # etc
}
OR alternatively using only one argument:
changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                         file.info = NULL, digest = FALSE,
                         full.names = FALSE, ...) {
    # ...
    if (is.logical(digest)) {
        if (digest) digestfun <- tools::md5sum
    } else {
        # Assume that digest specifies a function that we want to use
        digestfun <- match.fun(digest)
        digest <- TRUE
    }
    if (digest)
        info <- data.frame(info, digest = digestfun(fullnames))
    # etc
}
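The single-argument dispatch can be tried in isolation. This is a minimal sketch; the helper name `resolve_digest` is hypothetical and is not part of the proposed `changedFiles` interface. It returns the digest function to use, or NULL when no digest is requested.

```r
# Hypothetical helper: map the `digest` argument to a function (or NULL).
resolve_digest <- function(digest = FALSE) {
    if (is.logical(digest)) {
        # TRUE selects the default MD5 digest; FALSE (or NA) disables digests
        if (isTRUE(digest)) tools::md5sum else NULL
    } else {
        # Anything else is taken to name or be a digest function
        match.fun(digest)
    }
}

digestfun <- resolve_digest(TRUE)
if (!is.null(digestfun)) {
    f <- tempfile()
    writeLines("example", f)
    print(digestfun(f))
}
```

Overloading one argument keeps the signature short, at the cost of a slightly less obvious meaning for `digest` than the separate `digest`/`digestfun` pair.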
Many thanks,
Greg.
On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:
> In a number of places internal to R, we need to know which files have 
> changed (e.g. after building a vignette).  I've just written a general 
> purpose function "changedFiles" that I'll probably commit to R-devel.  
> Comments on the design (or bug reports) would be appreciated.
>
> The source for the function and the Rd page for it are inline below.
>
> ----- changedFiles.R:
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
>                          file.info = NULL, md5sum = FALSE,
>                          full.names = FALSE, ...) {
>     dosnapshot <- function(args) {
>         fullnames <- do.call(list.files, c(full.names = TRUE, args))
>         names <- do.call(list.files, c(full.names = full.names, args))
>         if (isTRUE(file.info) ||
>             (is.character(file.info) && length(file.info))) {
>             info <- file.info(fullnames)
>             rownames(info) <- names
>             if (isTRUE(file.info))
>                 file.info <- c("size", "isdir", "mode", "mtime")
>         } else
>             info <- data.frame(row.names = names)
>         if (md5sum)
>             info <- data.frame(info, md5sum = tools::md5sum(fullnames))
>         list(info = info, timestamp = timestamp, file.info = file.info,
>              md5sum = md5sum, full.names = full.names, args = args)
--
Gregory Jefferis, PhD                   Tel: 01223 267048
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 0QH, UK
http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://flybrain.stanford.edu
    
    