[Rd] Comments requested on "changedFiles" function

Duncan Murdoch murdoch.duncan at gmail.com
Thu Sep 5 18:34:47 CEST 2013


On 05/09/2013 12:32 PM, Dr Gregory Jefferis wrote:
> Dear Duncan,
>
> This certainly looks useful. Might you consider adding the ability to
> supply an alternative digest function? Details below.

Thanks, that's a good idea.
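
A minimal sketch of how the single-argument form quoted below might work with a user-supplied digest. The names resolveDigest and headDigest are hypothetical and not part of changedFiles; headDigest hashes only the first n bytes of each file, which is one way to approximate the partial hashing described below. It assumes nothing beyond base R and tools::md5sum.

```r
# Hypothetical helper (not part of the posted changedFiles code):
# turn the `digest` argument into a digest function, or NULL for "no digest".
resolveDigest <- function(digest = FALSE) {
    if (is.logical(digest)) {
        if (isTRUE(digest)) tools::md5sum else NULL
    } else {
        # Assume digest names or is a function taking a vector of file paths
        match.fun(digest)
    }
}

# Hypothetical user-supplied digest: MD5 of only the first `n` bytes of each file.
headDigest <- function(files, n = 1024L) {
    vapply(files, function(f) {
        bytes <- readBin(f, what = "raw", n = n)
        tmp <- tempfile()
        on.exit(unlink(tmp))
        writeBin(bytes, tmp)
        unname(tools::md5sum(tmp))
    }, character(1))
}

# Usage: digest a small temporary file with the custom function
f <- tempfile()
writeLines(c("header", "body"), f)
digestfun <- resolveDigest(headDigest)
if (!is.null(digestfun))
    info <- data.frame(digest = unname(digestfun(f)), row.names = basename(f))
```

Passing TRUE would fall back to tools::md5sum, and FALSE would skip the digest column altogether.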

Duncan Murdoch
>
> I often use a homemade "make" type function which starts by looking at
> modification times e.g. in a private package
>
> https://github.com/jefferis/nat.utils/blob/master/R/make.r
>
> For some of my work, I use hash functions. However, because I typically
> work with many large files, I often use a special digest process, e.g.
> using the CRC checksum embedded in a gzip file directly, or hashing only
> the part of a large file that is (almost) certain to change.
>
> Perhaps (code unchecked) along the lines of:
>
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
> 	file.info = NULL, digest = FALSE, digestfun = NULL,
> 	full.names = FALSE, ...)
>
> if (digest) {
> 	# Default to MD5 unless the caller supplies a digest function
> 	if (is.null(digestfun)) digestfun <- tools::md5sum
> 	else digestfun <- match.fun(digestfun)
> 	info <- data.frame(info, digest = digestfun(fullnames))
> }
>
> etc
>
> OR alternatively using only one argument:
>
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
> 	file.info = NULL, digest = FALSE, full.names = FALSE, ...)
>
> if (is.logical(digest)) {
> 	if (digest) digestfun <- tools::md5sum
> } else {
> 	# Assume that digest specifies a function that we want to use
> 	digestfun <- match.fun(digest)
> 	digest <- TRUE
> }
>
> if (digest)
> 	info <- data.frame(info, digest = digestfun(fullnames))
>
> etc
>
> Many thanks,
>
> Greg.
>
> On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:
>
> > In a number of places internal to R, we need to know which files have
> > changed (e.g. after building a vignette).  I've just written a general
> > purpose function "changedFiles" that I'll probably commit to R-devel.
> > Comments on the design (or bug reports) would be appreciated.
> >
> > The source for the function and the Rd page for it are inline below.
> >
> > ----- changedFiles.R:
> > changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
> >                          file.info = NULL, md5sum = FALSE,
> >                          full.names = FALSE, ...) {
> >     dosnapshot <- function(args) {
> >         fullnames <- do.call(list.files, c(full.names = TRUE, args))
> >         names <- do.call(list.files, c(full.names = full.names, args))
> >         if (isTRUE(file.info) ||
> >             (is.character(file.info) && length(file.info))) {
> >             info <- file.info(fullnames)
> >             rownames(info) <- names
> >             if (isTRUE(file.info))
> >                 file.info <- c("size", "isdir", "mode", "mtime")
> >         } else
> >             info <- data.frame(row.names = names)
> >         if (md5sum)
> >             info <- data.frame(info, md5sum = tools::md5sum(fullnames))
> >         list(info = info, timestamp = timestamp, file.info = file.info,
> >              md5sum = md5sum, full.names = full.names, args = args)
>
>
> --
> Gregory Jefferis, PhD                   Tel: 01223 267048
> Division of Neurobiology
> MRC Laboratory of Molecular Biology
> Francis Crick Avenue
> Cambridge Biomedical Campus
> Cambridge, CB2 0QH, UK
>
> http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
> http://jefferislab.org
> http://flybrain.stanford.edu


