Formatting in formatC and format (PR#129)
Sun, 28 Feb 1999 09:57:50 +0100

[This turned into a bug report which will go to r-devel, so I have taken it
off r-help.]

Bugs reported here:

(1) formatC's help page needs some clarification.

(2) formatC needs to treat modes "double" and "real" as equivalent.

(3) format's help page or (preferably) format itself needs correction
regarding the meaning of `digits'.

On Sun, 28 Feb 1999, Martin Maechler wrote:

> >>>>> "BDR" == Prof Brian Ripley writes:

`should not 5.96 print as 6.0 with 2 significant digits?'

>     BDR> S-PLUS has an option nsmall to format to do precisely this, and
>     BDR> there was a question asking how to do it on s-news today.  I have
>     BDR> thought for a while that we need to implement something like this
>     BDR> in R: maybe I had better do it....
> Is it necessary?

Yes, as I want to do things that C's printf does not do, AFAIK.

> In R, we have always had   formatC(.) as an alternative to format(.),
> allowing more precise formatting specifications.
> However, maybe I didn't understand what you were alluding to..

I find formatC useful (if hard to understand) but it does not in my hands
do what I want. I don't think formatC's help page defines precisely what it
does. In particular, it does not say what format=NULL, the default, does.  
So if I try

formatC(6.0, digits=1)
[1] " 6"

I do not get what I expected. It looks as if I need

formatC(6.0, digits=1, format="f")
[1] "6.0"

My guess is that the default is "g" for doubles. (I also find that what
happens when I give format="g" for an integer and/or give the `wrong' mode
is undefined or even inconsistent:

> formatC(as.integer(12345678), mode="real")
Error: `type' must be "real" for this format
> formatC(as.integer(12345678), mode="double")
[1] "1.235e+07"
> formatC(as.integer(12345678), mode="double", format="d")
[1] "12345678"

What are the rules here?)
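
The difference between the "g" and "f" conversions guessed at above can be
illustrated with the C-style conversions directly. This is a minimal sketch,
assuming an R that provides sprintf() with the usual C conversion
specifications; it says nothing about what formatC itself chooses by default.

```r
# "%g" reads the precision as a count of significant digits and drops
# trailing zeroes; "%f" reads it as a count of decimal places.
g_style <- sprintf("%.1g", 6.0)   # "6"
f_style <- sprintf("%.1f", 6.0)   # "6.0"
print(c(g = g_style, f = f_style))
```

So a precision of 1 only yields "6.0" under the "f" interpretation.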

The help page for formatC references the pre-ANSI C 2nd edition of
Kernighan & Ritchie, but I think the ANSI standard is the same. There is a
reference to leading zeroes, none to trailing zeroes. I believe printf can
be system-dependent (and some of it certainly is: win32 libraries use
n.ddde+nnn and Solaris ones use n.ddde+nn not n.dddenn as your help page
says).  I think, pessimistically, that it is not safe to rely on how the C
runtime library does this.

BTW, there is a trick here:

format(c(5.9, 6.0), digits=2)[-1]
[1] "6.0"

does what Dr Carsten wanted (but see later for the meaning of digits=2).

In any case, 

formatC(6.01, digits=1, format="f")
[1] "6.0"

is not what I am looking for; what I want is (from S-PLUS)

> format(6.0, nsmall=1)
[1] "6.0"
> format(6.01, nsmall=1)
[1] "6.01"
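
To make the wanted behaviour concrete, here is a minimal sketch of "at least
nsmall decimal places, but never fewer than the number needs". The helper
name at_least_decimals and its arguments are hypothetical, not part of R or
S-PLUS; it assumes vapply() and the scientific argument of format() are
available, and it handles fixed notation only.

```r
# Hypothetical helper: show each number with at least `nsmall` decimals.
at_least_decimals <- function(x, nsmall = 1, digits = 7) {
  vapply(x, function(v) {
    # Shortest fixed-notation form of v to `digits` significant digits.
    s <- format(v, digits = digits, scientific = FALSE)
    # Number of decimal places that form already uses (0 if no point).
    dec <- if (grepl(".", s, fixed = TRUE)) nchar(sub("^[^.]*\\.", "", s)) else 0L
    # Pad with zeroes up to nsmall decimals, but never cut decimals off.
    formatC(v, digits = max(dec, nsmall), format = "f")
  }, character(1))
}

at_least_decimals(6.0)    # "6.0"
at_least_decimals(6.01)   # "6.01"
```

Unlike format="f" with a fixed digits, this never rounds 6.01 down to "6.0".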

Here is something else I find useful, and where I think R does not behave as documented:
In S-PLUS I get (by default) alignment on the (implied) decimal point:

> format(c(6.11, 13.1), digits=2)
[1] " 6.1" "13  "

whereas in R

> format(c(6.11, 13.1), digits=2)
[1] " 6.1" "13.1"

Now R says (?format)

  digits: how many significant digits are to be used for `numeric x'.

but that is not what happens.  And I can't see how to do this in formatC:

> formatC(c(6.11, 13.1), digits=2, format="fg")
[1] "6.1" " 13"

as the alignment is wrong.  I think in print/format, digits=2 means that
data are rounded so that the smallest (in magnitude) number has two
significant digits, and then the result is converted dropping trailing
zeroes, as in

> format(c(6.0, 13.1), digits=2)
[1] " 6" "13"

(That is what S-PLUS gives too, aargh.)
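
One rule that reproduces both of the R outputs quoted above can be sketched
as follows. This is only a guess fitted to those two examples, not R's
actual implementation: round each element to `digits' significant digits,
see how many decimals each rounded value needs once trailing zeroes are
dropped, and print every element with the largest such count. The function
name guess_digits_rule is hypothetical.

```r
# Hypothetical reconstruction of the observed behaviour of format(x, digits=).
guess_digits_rule <- function(x, digits = 2) {
  r <- signif(x, digits)
  # Decimals each rounded value needs after trailing zeroes are dropped.
  dec <- vapply(r, function(v) {
    s <- format(v, digits = 15, scientific = FALSE)
    if (grepl(".", s, fixed = TRUE)) nchar(sub("^[^.]*\\.", "", s)) else 0L
  }, integer(1))
  # Print all elements with the common (maximum) number of decimals,
  # then pad on the left to a common width.
  format(formatC(x, digits = max(dec), format = "f"), justify = "right")
}

guess_digits_rule(c(6.11, 13.1))  # " 6.1" "13.1"
guess_digits_rule(c(6.0, 13.1))   # " 6" "13"
```

Under this reading, 13.1 keeps its decimal only because 6.11 needs one.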

Another, minor, point: format is generic and can be applied to lists,
for example, whereas formatC cannot.


Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
