[Rd] type.convert and doubles
Robert.McGehee at geodecapital.com
Thu Apr 17 15:42:20 CEST 2014
As Greg suggested, this new feature in type.convert certainly did surprise one user (me), enough so that I had to downgrade back to 3.0.3 until our code was modified to handle the new behavior.
Here's my use case: I have a function that pulls arbitrary financial data (a stock's industry, price, volume, etc.) from a web service call by reading the web output as a text table. The data may be either character (industry, stock name, etc.) or numeric (price, volume, etc.), and the function generally doesn't know the class in advance. The problem is that we frequently get numeric values represented with more precision than actually exists, for instance a price of "2.6999999999999999" rather than "2.70". That representation has exactly one digit too many for type.convert, which (in R 3.1.0) converts it to character instead of numeric (not what I want). This caused a bunch of "non-numeric argument to binary operator" errors to appear today, as numeric data was now being represented as character.
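To make the use case concrete, here is a minimal sketch of one way calling code could guard against this, assuming it is acceptable to lose trailing digits. The helper name `to_numeric_if_possible` and its `na.strings` argument are hypothetical, not part of base R or of the code described above; only `as.numeric()` and `suppressWarnings()` are real base functions.

```r
# Hypothetical helper (not part of base R): coerce a character vector to
# numeric whenever every non-missing entry parses as a number, accepting
# any loss of trailing digits; otherwise return the input unchanged.
to_numeric_if_possible <- function(x, na.strings = "NA") {
  stopifnot(is.character(x))
  # Treat the configured missing-value markers as real NAs first.
  x[x %in% na.strings] <- NA_character_
  num <- suppressWarnings(as.numeric(x))
  # as.numeric() yields NA for unparseable strings, so an NA that was not
  # already missing in the input means the column is not purely numeric.
  if (any(is.na(num) & !is.na(x))) x else num
}

price    <- to_numeric_if_possible(c("2.6999999999999999", "2.70"))
industry <- to_numeric_if_possible(c("Energy", "Utilities"))
str(price)     # numeric vector, usable in arithmetic
str(industry)  # left as character
```

The design choice here is to decide the class by whether the strings parse at all, rather than by whether they round-trip exactly, which matches the "getting the class right" priority but deliberately gives up the accuracy check.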
I expect this will cause some unwanted RODBC side effects for us as well. IMO, getting the class right is more important than infinite precision. What use is a character representation of a number anyway if you can't perform arithmetic on it? I would favor at least making the new behavior optional, but I think many packages (like RODBC) potentially need to be patched to code around the new feature if it's left in.
(This aside, thanks for all the nice features and bug fixes in the new version!)
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Paul Gilbert
Sent: Friday, April 11, 2014 5:38 PM
To: Simon Urbanek; Gregory R. Warnes
Subject: Re: [Rd] type.convert and doubles
On 04/11/2014 01:43 PM, Simon Urbanek wrote:
> On Apr 11, 2014, at 11:50 AM, Gregory R. Warnes <greg at warnes.net> wrote:
>> Hi All,
>> I see this in the NEWS for R 3.1.0:
>> type.convert() (and hence by default read.table()) returns a
>> character vector or factor when representing a numeric input as a
>> double would lose accuracy. Similarly for complex inputs.
>> This behavior seems likely to surprise users.
> Can you elaborate why that would be surprising? It is consistent with
> the intention of type.convert() to determine the correct type to
> represent the value - it has always used character/factor as a
> fallback where native type doesn't match.
Strictly speaking, I don't think this is true. If it were, it would not
have been necessary to make the change so that it does now fall back to
using character/factor. It may, however, have always been the intent.
I don't really think a warning is necessary, but there are some surprises:
> str(type.convert(format(1/3, digits=17))) # R-3.0.3
 num 0.333
> str(type.convert(format(1/3, digits=17))) # R-3.1.0
 Factor w/ 1 level "0.33333333333333331": 1
Now you could say that one should never do that, and the change is just
flushing out a bug that was always there. But the point is that in
serialization situations there can be some surprises. So, for example,
RODBC talking to PostgreSQL databases is now returning factors rather
than numerics for double precision fields, whereas with RPostgreSQL the
behaviour has not changed.
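The serialization surprise can be sketched as a round trip that checks the class after conversion instead of assuming it. This is only an illustration under stated assumptions: the variable names `txt` and `converted` are made up here, and the class that `type.convert()` returns for this input depends on the R version in use.

```r
# Serialize a double with 17 significant digits, as a database driver or
# text export might do.
txt <- format(1/3, digits = 17)

# as.is = TRUE keeps any non-numeric result as character rather than factor.
converted <- type.convert(txt, as.is = TRUE)

# Defensive step: if the conversion did not yield a number, force one.
# as.character() also covers the case where a factor comes back.
if (!is.numeric(converted)) {
  converted <- as.numeric(as.character(converted))
}

str(converted)
```

Checking `is.numeric()` explicitly keeps downstream arithmetic working whichever class the conversion produced.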
> It has never issued any
> warning in that case historically, so IMHO it would be rather
> surprising if it did now...
> Cheers, Simon
>> Would it be possible to issue a warning when this occurs?
>> Aside: I'm very happy to see the new 's' and 'f' browser (debugger)
>> -Greg
>> R-devel at r-project.org mailing list
> ______________________________________________ R-devel at r-project.org
> mailing list https://stat.ethz.ch/mailman/listinfo/r-devel