[Rd] sprintf("%d", integer(0)) aborts
William Dunlap
wdunlap at tibco.com
Thu Mar 19 16:20:58 CET 2009
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Thursday, March 19, 2009 3:34 AM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] sprintf("%d", integer(0)) aborts
>
> On Wed, 18 Mar 2009, William Dunlap wrote:
>
> > In R's sprintf() if any of the arguments has length 0
> > the function aborts. E.g.,
> >
> > > sprintf("%d", integer(0))
> > Error in sprintf("%d", integer(0)) : zero-length argument
> > > sprintf(character(), integer(0))
> > Error in sprintf(character(), integer(0)) :
> > 'fmt' is not a non-empty character vector
> >
> > This comes up in code like
> > x[nchar(x)==0] <- sprintf("No. %d", seq_along(x)[nchar(x)==0])
> > which works if x contains any empty strings
> > x<-c("One","Two","") # changes "" -> "No. 3"
> > but not if it doesn't
> > x<-c("One","Two","Three") # throws error instead of doing nothing
> >
> > When I wrote S+'s sprintf() I had it act like the binary
> > arithmetic operators, returning a zero long result if any
> > argument were zero long. (Otherwise its result is as long
> > as the longest input.) I think it would be nice if R's
> > sprintf did this also.
> >
> > Currently you must add defensive code (if (any(nchar(x)==0))...)
> > to make functions that use sprintf work in all cases, and that
> > muddies up the code and slows things down.
> >
> > Do you think this is a reasonable thing to do? I've attached
> > a possible patch to src/main/sprintf.c that makes the examples
> > above return character(0).
>
> Yes. It was deliberate that it works (and is documented) the way it
> is, and I've not previously seen any problematic examples.
I was prompted to suggest the change by a note from Jim Holtman
in yesterday's R-help:
> system.time({
+ x <- sample(50000) # test data
+ x[sample(50000,10000)] <- 'asdfasdf' # character strings
+ which.num <- grep("^[ 0-9]+$", x) # find numbers
+ # convert to leading 0
+ x[which.num] <- sprintf("%018.0f", as.numeric(x[which.num]))
+ x[-which.num] <- toupper(x[-which.num])
+ })
This code failed when I converted it to a function to run
through sapply because then which.num was often integer(0).
When used in production it would probably work for a long time
before seeing a sample in which which.num was integer(0).
(Of course, it would then silently mess up on the next line,
x[-which.num]<-...)
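The silent failure on the x[-which.num] line can be reproduced with a small sketch. The two-element test vector and the names y, z and is.num are made up for illustration; only the regular expression and the format come from the code above. Using a logical mask from grepl() instead of positions from grep() is one possible way to avoid the problem, not the only one.

```r
# Illustrative data (made up): no element consists only of digits/spaces.
x <- c("asdf", "qwer")
which.num <- grep("^[ 0-9]+$", x)   # integer(0): nothing matched

# -integer(0) is still integer(0), so x[-which.num] selects nothing
# (not everything) and the assignment silently leaves y unchanged.
y <- x
y[-which.num] <- toupper(y[-which.num])

# A logical mask behaves the same whether or not anything matched.
is.num <- grepl("^[ 0-9]+$", x)
z <- x
if (any(is.num)) {
  # Guard kept because, as discussed in this thread, sprintf() may
  # reject zero-length arguments.
  z[is.num] <- sprintf("%018.0f", as.numeric(z[is.num]))
}
z[!is.num] <- toupper(z[!is.num])
```

Here y keeps its lower-case strings while z is upper-cased as intended.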
> But at
> least for the ... args, allowing zero-length arguments seems very
> reasonable. I'm less convinced by zero-length formats, but the rule
> may be easier to explain if we allow them.
Those were my thoughts as well.
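Until sprintf itself changes, the proposed rule can be approximated at the R level. The following is only a sketch: sprintf0 is a hypothetical name, not part of R and not the attached C patch, and it simply checks the lengths of fmt and the ... arguments before calling sprintf.

```r
# Hypothetical helper (not part of R): return character(0) whenever
# fmt or any of the ... arguments has length zero, as the binary
# arithmetic operators do; otherwise defer to sprintf().
sprintf0 <- function(fmt, ...) {
  args <- list(...)
  lens <- c(length(fmt), vapply(args, length, integer(1)))
  if (any(lens == 0L)) {
    return(character(0))
  }
  sprintf(fmt, ...)
}

# The example from the start of the thread: no empty strings in x,
# so the assignment should do nothing.
x <- c("One", "Two", "Three")
empty <- nchar(x) == 0
x[empty] <- sprintf0("No. %d", seq_along(x)[empty])
```

With this wrapper the caller no longer needs the if (any(...)) guard, at the cost of an extra length check on every call.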
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com