[Rd] 1954 from NA
Tomas Kalibera
tomas.kalibera using gmail.com
Mon May 24 12:31:53 CEST 2021
On 5/24/21 11:46 AM, Adrian Dușa wrote:
> On Sun, May 23, 2021 at 10:14 PM Tomas Kalibera
> <tomas.kalibera using gmail.com> wrote:
>
> [...]
>
> Good, but unfortunately the delineation between computation and
> non-computation is not always transparent. Even if an operation
> doesn't look like "computation" on the high-level, it may
> internally involve computation - so, really, an R NA can become R
> NaN and vice versa, at any point (this is not a "feature", but it
> is how things are now).
>
>
> I see.
> Well, this is a risk we'll have to consider when the time comes. For
> the moment, storing some metadata within the payload seems to work.
>
>> [...]
>
> Ok, then I would probably keep the metadata on the missing values
> on the side in such code, and treat those values explicitly in
> supported operations.
>
> But, in principle, you can use the floating-point NaN payloads,
> and you can pass such values to R. You just need to be prepared
> to lose not only your payloads/tags, but also the
> difference between R NA and R NaN. Thanks to the value semantics of
> R, you would not lose the tags in input values with proper
> reference counts (e.g. marked immutable), because those values
> will not be modified.
>
> NaNs are fine of course, but then some (social science?) users might
> get confused about the difference between NAs and NaNs, and for this
> reason only I would still like to preserve the 1954 payload.
> If at all possible, however, the extra 16 bits from this payload would
> make a whole lot of difference.
>
> Please forgive my persistence, but would it be possible to use an
> unsigned short instead of an unsigned int for the 1954 payload?
> That is, if it doesn't break anything, but I don't really see what it
> could. The corresponding check function seems to work just fine and it
> doesn't need to be changed at all:
>
> int R_IsNA(double x)
> {
>     if (isnan(x)) {
>         ieee_double y;
>         y.value = x;
>         return (y.word[lw] == 1954);
>     }
>     return 0;
> }
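To make concrete what is at stake, here is a minimal, self-contained sketch of the two variants of the check. It is an illustration only, not R's source: the names `make_nan`, `low_word_of`, `is_na_32` and `is_na_16` are hypothetical, and the bit layout assumes an IEEE 754 double whose low 32 bits carry the 1954 marker, as in the function quoted above.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a NaN whose low 32 bits are `low_word`.
 * The exponent bits are all ones; the mantissa must be non-zero,
 * otherwise the bit pattern would encode infinity, not NaN. */
static double make_nan(uint32_t low_word)
{
    assert(low_word != 0);
    uint64_t bits = ((uint64_t)0x7ff00000u << 32) | low_word;
    double d;
    memcpy(&d, &bits, sizeof d);   /* avoids union/endianness concerns */
    return d;
}

/* Hypothetical helper: extract the low 32 bits of a double. */
static uint32_t low_word_of(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xffffffffu);
}

/* Check in the style of the quoted function: the whole low word is 1954. */
static int is_na_32(double x)
{
    return isnan(x) && low_word_of(x) == 1954u;
}

/* Check in the style of the proposal: only the low 16 bits are 1954,
 * leaving the upper 16 bits of the low word free for a tag. */
static int is_na_16(double x)
{
    return isnan(x) && (low_word_of(x) & 0xffffu) == 1954u;
}
```

With these definitions, `make_nan(1954u)` passes both checks, while a tagged value such as `make_nan((7u << 16) | 1954u)` passes only `is_na_16`. None of this protects the payload from being changed by later arithmetic.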
For the reasons I explained, I would be against such a change. Keeping
the data on the side, as also recommended by others on this list, would
allow for a reliable implementation. I don't want to support fragile
package code building on unspecified R internals, and in this case
particularly internals that themselves have not stood the test of time,
so are at high risk of change.
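A minimal sketch of what keeping the data on the side could look like, in plain C rather than against R's API. The type `tagged_vec` and the `tv_*` functions are hypothetical names chosen for illustration; real package code would presumably use R vectors and attributes instead of `malloc`-ed arrays.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical container: values plus a parallel array of tags. */
typedef struct {
    size_t n;
    double *values;       /* missing entries hold an ordinary NaN */
    unsigned char *tags;  /* 0 = not missing, otherwise a reason code */
} tagged_vec;

/* Allocate a vector of n zero-initialised values with no tags set.
 * Returns NULL on allocation failure. */
static tagged_vec *tv_new(size_t n)
{
    tagged_vec *v = malloc(sizeof *v);
    if (!v) return NULL;
    v->n = n;
    v->values = calloc(n, sizeof *v->values);
    v->tags = calloc(n, sizeof *v->tags);
    if (!v->values || !v->tags) {
        free(v->values);
        free(v->tags);
        free(v);
        return NULL;
    }
    return v;
}

static void tv_free(tagged_vec *v)
{
    if (v) {
        free(v->values);
        free(v->tags);
        free(v);
    }
}

/* Mark element i as missing and record why in the side array. */
static void tv_set_missing(tagged_vec *v, size_t i, unsigned char tag)
{
    assert(i < v->n);
    v->values[i] = NAN;
    v->tags[i] = tag;
}

/* The tag is read from the side array, never from the NaN payload. */
static unsigned char tv_tag(const tagged_vec *v, size_t i)
{
    assert(i < v->n);
    return isnan(v->values[i]) ? v->tags[i] : 0;
}
```

Because the tag lives outside the double, it does not matter whether an operation turns the stored value into a different NaN bit pattern: `tv_tag` only asks whether the value is still a NaN.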
Best
Tomas
>
> Best wishes,
> Adrian
>
>
>