[Rd] Issue with NA value and Octave compatibility
David Bateman
adb014 at gmail.com
Sun Jun 8 20:25:05 CEST 2008
Dear R developers,
I'm an Octave developer in the process of implementing a single
precision type in Octave and I have an issue with the NA value. The
choice of NA value in Octave was made a few years back so that the high
word of the NA value was 0x7ff00000 and the low word was 0x000007A2 for
compatibility with R and to ease any possible issue with the exchange of
data files between Octave and R.
However, now that I'm in the process of implementing the single
precision type I have a problem with this choice for the NA value, as the
above value, when cast to a float, loses the 0x7A2 payload and becomes a
positive Infinity in IEEE 754, so conversion of NA values between
double and float does not work with this value.
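
As a rough illustration of the problem described above, the following Python sketch works on raw bit patterns and models the narrowing as a plain truncation of the 52-bit double mantissa to the 23-bit float mantissa. The helper name is hypothetical, and the truncation model is an assumption: a real hardware conversion may instead quieten the NaN, but in either case the 0x7A2 payload does not survive.

```python
import struct

# R-compatible NA: high word 0x7FF00000, low word 0x000007A2
R_NA_BITS = 0x7FF00000000007A2


def truncate_double_bits_to_float_bits(bits: int) -> int:
    """Narrow an Inf/NaN double bit pattern to float bits by dropping
    the low 29 mantissa bits (hypothetical helper, no rounding)."""
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent != 0x7FF:
        raise ValueError("this sketch only handles Inf/NaN bit patterns")
    return (sign << 31) | (0xFF << 23) | (mantissa >> 29)


float_bits = truncate_double_bits_to_float_bits(R_NA_BITS)
as_float = struct.unpack(">f", struct.pack(">I", float_bits))[0]
print(hex(float_bits), as_float)  # prints: 0x7f800000 inf
```

The payload sits entirely in the bits that the narrowing discards, so what remains is the bit pattern of positive Infinity.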
I have several possible choices of how to treat this, but as Octave's
NA value was chosen for compatibility with R, the choice I make might
very well be determined by how the R developers react to any changes
that Octave makes in this direction.
I can't realistically wrap the double and float types in Octave and
overload the assignment operators to handle the assignment of a float to
a double and vice versa, as this would completely replace the underlying
data types in Octave. It's also impossible to trap every place where a
double might be assigned to a float and special-case NA values, as there
are just too many places where that might occur.
I'm therefore assuming that I have to replace Octave's internal
representation of the NA value to allow easy conversion of the NA value
between double and floats. This can be done by replacing the NA value
with one that has zeros in the lower 29 bits of the mantissa of the
double, so that the cast from a double to float and vice versa works
correctly. For example, using 0x7FF840F4 and 0x40000000 for the high and
low words of the double NA value, and 0x7FC207A2 for the float NA value,
works. However, I then have an issue with the exchange of NA values with R and
with older versions of Octave.
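
The round trip for the proposed values can be checked on the bit patterns alone. The sketch below is only an illustration: the helper names are hypothetical, and narrowing is again modelled as dropping the 29 low mantissa bits that a float cannot hold, with widening as the reverse shift.

```python
OLD_NA_DOUBLE = 0x7FF00000000007A2  # current R-compatible NA
NEW_NA_DOUBLE = 0x7FF840F440000000  # high word 0x7FF840F4, low word 0x40000000
NEW_NA_FLOAT = 0x7FC207A2

MANTISSA_SHIFT = 52 - 23  # 29 mantissa bits a float cannot hold


def narrow_nan_bits(bits: int) -> int:
    """Double NaN/Inf bits -> float bits, truncating the mantissa."""
    sign = bits >> 63
    mantissa = bits & ((1 << 52) - 1)
    return (sign << 31) | (0xFF << 23) | (mantissa >> MANTISSA_SHIFT)


def widen_nan_bits(bits: int) -> int:
    """Float NaN/Inf bits -> double bits, zero-filling the low mantissa."""
    sign = bits >> 31
    mantissa = bits & ((1 << 23) - 1)
    return (sign << 63) | (0x7FF << 52) | (mantissa << MANTISSA_SHIFT)


def survives_round_trip(double_bits: int) -> bool:
    return widen_nan_bits(narrow_nan_bits(double_bits)) == double_bits


print(hex(narrow_nan_bits(NEW_NA_DOUBLE)))  # prints: 0x7fc207a2
print(survives_round_trip(NEW_NA_DOUBLE))   # prints: True
print(survives_round_trip(OLD_NA_DOUBLE))   # prints: False
```

Note that the float value keeps 0x7A2 in its low bits, so the original payload is still recognisable in the single precision NA.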
It's easy enough to check for old NA values in files when reading and
alter them to the new NA values. So forward compatibility with older
versions of Octave and from R to Octave would be OK. However, the
reverse is not true. Saving with the old NA value is equally possible
and would allow full compatibility with older versions of Octave and
with R. The downside is that there are many places where I'd have to
make a copy of the data when saving to allow this (for example saving
to HDF files), and so I'd prefer not to have to do this if possible.
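
The asymmetry between loading and saving can be sketched as follows. This is a hedged illustration in Python rather than Octave's actual I/O code: the function names are hypothetical, the data is held as raw 64-bit words, and NA is detected by exact bit-pattern match, which is a simplifying assumption.

```python
from array import array

OLD_NA = 0x7FF00000000007A2  # R-compatible NA
NEW_NA = 0x7FF840F440000000  # proposed NA


def upgrade_na_in_place(words: array) -> None:
    """On load: the buffer is freshly read and owned by the reader,
    so old NA patterns can simply be overwritten."""
    for i, w in enumerate(words):
        if w == OLD_NA:
            words[i] = NEW_NA


def downgraded_copy(words: array) -> array:
    """On save: the caller's data must stay untouched, so the old NA
    pattern can only be written out from a copy."""
    out = array(words.typecode, words)
    for i, w in enumerate(out):
        if w == NEW_NA:
            out[i] = OLD_NA
    return out


# 0x3FF0000000000000 is the bit pattern of 1.0
loaded = array("Q", [0x3FF0000000000000, OLD_NA])
upgrade_na_in_place(loaded)
to_write = downgraded_copy(loaded)
print(hex(loaded[1]), hex(to_write[1]))
# prints: 0x7ff840f440000000 0x7ff00000000007a2
```

The copy in `downgraded_copy` is the cost referred to above: it is needed on every save path that must emit the old value.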
As backwards compatibility is the smaller concern and self-correcting
with time, if R were to also accept an additional possible NA value
such as 0x7FF840F4/0x40000000, at least when loading files, then
compatibility of the NA value between Octave and R could be maintained
and I wouldn't have to pay the penalty of making a copy of the data to
treat the NA values. Would the R developers be willing to make such a
change in R? If not, I will maintain the R-compatible NA value in
Octave's output and pay the performance penalty within Octave for this.
In any case, if R intends at some point to support a single precision
type, you will come across the same issue.
Regards
David