[Rd] Please make Pre-3.1 read.csv (type.convert) behavior available
Duncan Murdoch
murdoch.duncan at gmail.com
Sun Apr 27 00:59:54 CEST 2014
On 26/04/2014, 6:40 PM, Tom Kraljevic wrote:
>
> Hi Duncan,
>
>
> This program and output should answer your question regarding java behavior.
>
> Basically, the character representation produced by toHexString() is
> shown to be lossless for this example (in Java).
>
> Please let me know if there is any way I can help further. I’d love for
> this to work! I would be happy to put all this into an R bug report if
> that is convenient for you.
This one has enough attention already that I don't think it will get
lost, so no more bug reports are necessary. Martin Maechler (on another
thread) is describing some changes that should address this. It would
be really helpful if you tested it on your examples after he commits his
changes.
Duncan Murdoch
>
>
> Thanks,
> Tom
>
>
>
>
> $ cat example.java
> class example {
>     public static void main(String[] args) {
>         String value_as_string = "-0x1.fff831c7ffffdp-1";
>         double value = Double.parseDouble(value_as_string);
>         System.out.println("Starting string : " + value_as_string);
>         System.out.println("value toString() : " + Double.toString(value));
>         System.out.println("value toHexString(): " + Double.toHexString(value));
>
>         // Split the raw IEEE 754 bits into sign, exponent and mantissa.
>         long bits = Double.doubleToRawLongBits(value);
>         boolean isNegative = (bits & 0x8000000000000000L) != 0;
>         long biased_exponent = (bits & 0x7ff0000000000000L) >> 52;
>         long exponent = biased_exponent - 1023;
>         long mantissa = bits & 0x000fffffffffffffL;
>         System.out.println("isNegative : " + isNegative);
>         System.out.println("biased exponent : " + biased_exponent);
>         System.out.println("exponent : " + exponent);
>         System.out.println("mantissa : " + mantissa);
>         System.out.println("mantissa as hex : " + Long.toHexString(mantissa));
>     }
> }
>
>
> $ javac example.java
> $ java example
> Starting string : -0x1.fff831c7ffffdp-1
> value toString() : -0.999940448440611
> value toHexString(): -0x1.fff831c7ffffdp-1
> isNegative : true
> biased exponent : 1022
> exponent : -1
> mantissa : 4503063234609149
> mantissa as hex : fff831c7ffffd
>
>
> $ java -version
> java version "1.7.0_51"
> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>
>
>
> On Apr 26, 2014, at 2:18 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
>> On 26/04/2014, 4:12 PM, Tom Kraljevic wrote:
>>>
>>> Hi,
>>>
>>>
>>> One additional follow-up here.
>>>
>>> Unfortunately, I hit what looks like an R parsing bug that makes the
>>> Java Double.toHexString() output unreliable for reading by R. (This is
>>> really unfortunate, because the format is intended to be lossless and
>>> it looks like it’s so close to fully working.)
>>>
>>> You can see the spec for the conversion here:
>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#toHexString(double)
>>>
>>> The last value in the list below is not parsed by R in the way I
>>> expected, and causes the column to flip from numeric to factor.
>>>
>>>
>>> -0x1.8ff831c7ffffdp-1
>>> -0x1.aff831c7ffffdp-1
>>> -0x1.bff831c7ffffdp-1
>>> -0x1.cff831c7ffffdp-1
>>> -0x1.dff831c7ffffdp-1
>>> -0x1.eff831c7ffffdp-1
>>> -0x1.fff831c7ffffdp-1 <<<<< this value is not parsed as a number and flips the column from numeric to factor.
>>
>> That looks like a bug in the conversion code. It uses the same test
>> for lack of accuracy for hex doubles as it uses for decimal ones, but
>> hex doubles can be larger before they lose precision. I believe the
>> largest integer that can be represented exactly is 2^53 - 1, i.e.
>>
>> 0x1.fffffffffffffp52
>>
>> in this notation; can you confirm that your Java code reads it and
>> writes the same string? This is about 1% bigger than the limit at
>> which type.convert switches to strings or factors.
>>
>> Duncan Murdoch
>
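
For reference, the round-trip check asked about in the quoted message could be sketched in Java roughly as follows. This is only an illustration, not code from the thread: the class name HexRoundTrip and the helper roundTrips() are made up for the example, and it tests nothing on the R side. It parses each hex string from the list above, plus 0x1.fffffffffffffp52, and checks whether Double.toHexString() writes back the same string.

```java
class HexRoundTrip {
    // Hypothetical helper: true if parsing the hex string and formatting the
    // result with Double.toHexString() reproduces the input exactly.
    static boolean roundTrips(String hex) {
        double value = Double.parseDouble(hex);
        return Double.toHexString(value).equals(hex);
    }

    public static void main(String[] args) {
        String[] samples = {
            "-0x1.8ff831c7ffffdp-1",
            "-0x1.aff831c7ffffdp-1",
            "-0x1.bff831c7ffffdp-1",
            "-0x1.cff831c7ffffdp-1",
            "-0x1.dff831c7ffffdp-1",
            "-0x1.eff831c7ffffdp-1",
            "-0x1.fff831c7ffffdp-1",
            "0x1.fffffffffffffp52"
        };
        for (String s : samples) {
            System.out.println(s + " round-trips: " + roundTrips(s));
        }

        // Check that the last sample is the integer 2^53 - 1.
        double largest = Double.parseDouble("0x1.fffffffffffffp52");
        double expected = (double) ((1L << 53) - 1);
        System.out.println("equals 2^53 - 1: " + (largest == expected));
    }
}
```

If every line reports true, the Java side reads and writes these strings unchanged, which would leave the parsing on the R side as the remaining place to look.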