[Rd] suggestion for extending ?as.factor
Martin Maechler
maechler at stat.math.ethz.ch
Mon May 4 10:40:12 CEST 2009
>>>>> "PS" == Petr Savicky <savicky at cs.cas.cz>
>>>>> on Sun, 3 May 2009 22:32:04 +0200 writes:
PS> In R-2.10.0, the development version, the function as.factor() uses 17-digit
PS> precision when converting numeric values to character type. This
PS> is very good for the consistency of the resulting factor; however,
PS> I expect that people will complain about, for example, as.factor(0.3)
PS> being
PS> [1] 0.29999999999999999
PS> Levels: 0.29999999999999999
PS> I suggest extending the "Warning" section of ?as.factor with the following
PS> paragraph.
PS> If as.factor() is used for a numeric vector, then the numbers are
PS> converted to character strings with 17-digit precision using their
PS> machine representation. This guarantees that different numbers are
PS> converted to different levels, but may produce unwanted results if
PS> the numbers are expected to have a limited number of decimal places.
PS> For example, as.factor(c(0.1, 0.2, 0.3)) produces
PS> [1] 0.10000000000000001 0.20000000000000001 0.29999999999999999
PS> Levels: 0.10000000000000001 0.20000000000000001 0.29999999999999999
PS> In order to avoid this, convert the numbers to a character vector
PS> using formatC() or a similar function before using as.factor().
PS> Petr.
Thank you, Petr, for the good suggestion.
I have added a (shorter) paragraph, though to the 'Details' section rather
than the 'Warning' section, and also an example to the 'Examples' section:
## Converting (non-integer) numbers:
as.factor(c(0.1, 0.2, 0.3)) # maybe not what you'd expect, so rather use
factor(format(c(0.1, 0.2, 0.3)))
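
For readers who want to try the two workarounds mentioned in this thread, here is a minimal sketch. It assumes only base R; the variable names (x, f1, f2, y) are illustrative and not from the original messages. The default levels produced by as.factor(x) depend on the R version, so the sketch does not rely on them and instead uses sprintf("%.17g", ...) to show the 17-digit representation discussed above.

```r
x <- c(0.1, 0.2, 0.3)

# 17 significant digits expose the machine representation of each number
sprintf("%.17g", x)

# Workaround 1 (from the reply): format() before building the factor
f1 <- factor(format(x))
levels(f1)   # "0.1" "0.2" "0.3"

# Workaround 2 (from the suggestion): formatC() with an explicit
# number of decimal places (one decimal is an assumption for this data)
f2 <- factor(formatC(x, format = "f", digits = 1))
levels(f2)   # "0.1" "0.2" "0.3"

# Why 17 digits keep distinct numbers apart: 0.1 + 0.2 and 0.3 are
# different doubles, so their 17-digit strings differ as well
y <- c(0.3, 0.1 + 0.2)
nlevels(factor(sprintf("%.17g", y)))   # 2
```

The trade-off is the one described in the suggested paragraph: formatting to a few decimals gives readable levels but may merge numbers that differ only in their last bits, whereas the 17-digit strings keep them distinct.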
Martin Maechler, ETH Zurich