[Rd] 1954 from NA
Mark van der Loo
m@rk@v@nder|oo @end|ng |rom gm@||@com
Sun May 23 16:31:43 CEST 2021
I wrote about this once over here:
http://www.markvanderloo.eu/yaRb/2012/07/08/representation-of-numerical-nas-in-r-and-the-1954-enigma/
-M
On Sun, 23 May 2021 at 15:33, brodie gaslam via R-devel <
r-devel using r-project.org> wrote:
> I should add, I don't know that you can rely on this
> particular encoding of R's NA. If I were trying to restore
> an NA from some external format, I would just generate an
> R NA via e.g. NA_real_ in the R session I'm restoring the
> external data into, and not try to hand assemble one.
>
> Best,
>
> B.
>
>
> On Sunday, May 23, 2021, 9:23:54 AM EDT, brodie gaslam via R-devel <
> r-devel using r-project.org> wrote:
>
>
>
>
>
> This is because the NA in question is NA_real_, which
> is encoded in double precision IEEE-754, which uses
> 64 bits. The "1954" is just part of the NA. The NA
> must also conform to the NaN encoding for double precision
> numbers, which requires that the "beginning" portion of
> the number be "0x7ff0" (well, I think it should be "0x7ff8"
> but that's a different story), as you can see here:
>
> x.word[hw] = 0x7ff00000;
> x.word[lw] = 1954;
>
> Both those components are part of the same double precision
> value. They are just accessed this way to make it easy to
> set the high bits (63-32) and the low bits (31-0).
>
> So NA is not just 1954, it's 0x7ff00000 in the high word and 1954
> in the low word (note I'm mixing hex and decimals here).
>
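> In IEEE 754 double precision encoding, numbers whose sign bit
> is followed by eleven set exponent bits (so they start with 0x7ff,
> or 0xfff when the sign bit is set) are NaNs, provided the remaining
> 52 bits are not all zero (that case is Inf). The first of those
> remaining bits designates "quiet" vs "signaling" NaNs, and the
> rest can be anything. R has taken advantage of that to designate
> the R NA by setting the lower bits to be 1954.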
>
> Note I'm being pretty loose about endianness, etc. here, but
> hopefully this conveys the problem.
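
A minimal sketch of the layout described above, assuming a 64-bit IEEE 754 double. The helper names from_words and low_word are made up for illustration and are not part of R. The halves are combined with shifts, so endianness does not come into it:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Build a double from its high (bits 63-32) and low (bits 31-0) halves.
   Hypothetical helper; assumes double is a 64-bit IEEE 754 value. */
static double from_words(uint32_t high, uint32_t low)
{
    uint64_t bits = ((uint64_t) high << 32) | (uint64_t) low;
    double value;
    memcpy(&value, &bits, sizeof value);  /* copy the raw bit pattern */
    return value;
}

/* Return the low 32 bits of a double's bit pattern (hypothetical helper). */
static uint32_t low_word(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return (uint32_t) (bits & 0xffffffffu);
}
```

With these, from_words(0x7ff00000u, 1954u) is a NaN as far as isnan() is concerned, and low_word() on the result gives back 1954. Both halves belong to the same 64-bit value.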
>
> In terms of your proposal, I'm not entirely sure what you gain.
> You're still attempting to generate a 64 bit representation
> in the end. If all you need is to encode the fact that there
> was an NA, and restore it later as a 64 bit NA, then you can do
> whatever you want so long as the end result conforms to the
> expected encoding.
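
As a rough sketch of "do whatever you want" for the external format: the record below stores only a one-byte missingness flag, and the NA is supplied by the receiving session on restore. The ext_record type and restore function are hypothetical names for illustration. In R's C API the value passed as session_na would be NA_REAL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical external record: a flag instead of a hand-assembled NA. */
typedef struct {
    uint8_t is_missing;  /* 1 if the original value was NA */
    double  value;       /* meaningful only when is_missing == 0 */
} ext_record;

/* Restore a value, substituting the NA provided by the receiving session
   rather than rebuilding its bit pattern by hand. */
static double restore(ext_record rec, double session_na)
{
    return rec.is_missing ? session_na : rec.value;
}
```

The external format then never depends on how R happens to encode NA.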
>
> In terms of using 'short' here (which again, I don't see the
> need for as you're using it to generate the final 64 bit encoding),
> I see two possible problems. You're adding the dependency that
> short will be 16 bits. We already have the (implicit) assumption
> in R that double is 64 bits, and explicit that int is 32 bits.
> But I think you'd be going a bit out on a limb assuming that short
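> is 16 bits (not sure). More important, if short is indeed 16 bits,
> then after:
>
> x.word[hw] = 0x7ff0;
> x.word[lw] = 1954;
>
> only two of the four words are set (0x7ff0 itself fits in 16 bits),
> and the two middle words are left uninitialized.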
>
> Best,
>
> B.
>
>
>
> On Sunday, May 23, 2021, 8:56:18 AM EDT, Adrian Dușa <
> dusa.adrian using unibuc.ro> wrote:
>
>
>
>
>
> Dear R devs,
>
> I am probably missing something obvious, but still trying to understand why
> the 1954 from the definition of an NA has to fill 32 bits when it normally
> doesn't need more than 16.
>
> Wouldn't the code below achieve exactly the same thing?
>
> typedef union
> {
> double value;
> unsigned short word[4];
> } ieee_double;
>
>
> #ifdef WORDS_BIGENDIAN
> static CONST int hw = 0;
> static CONST int lw = 3;
> #else /* !WORDS_BIGENDIAN */
> static CONST int hw = 3;
> static CONST int lw = 0;
> #endif /* WORDS_BIGENDIAN */
>
>
> static double R_ValueOfNA(void)
> {
> volatile ieee_double x;
> x.word[hw] = 0x7ff0;
> x.word[lw] = 1954;
> return x.value;
> }
>
> This question has to do with the tagged NA values from package haven, on
> which I want to improve. Every available bit counts, especially if
> multi-byte characters are going to be involved.
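
For comparison, a sketch of the four-word variant. It assumes a 64-bit IEEE 754 double and uses uint16_t rather than unsigned short so the word width is explicit. The names from_words16, bits_of and na_from_words16 are illustrative only. Unlike the union above, the two middle words are set to zero explicitly, so all 64 bits are defined:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble a double from four 16-bit words, most significant first.
   Hypothetical helper; assumes double is a 64-bit IEEE 754 value. */
static double from_words16(const uint16_t w[4])
{
    uint64_t bits = ((uint64_t) w[0] << 48) | ((uint64_t) w[1] << 32)
                  | ((uint64_t) w[2] << 16) |  (uint64_t) w[3];
    double value;
    memcpy(&value, &bits, sizeof value);  /* copy the raw bit pattern */
    return value;
}

/* Return the raw 64-bit pattern of a double (hypothetical helper). */
static uint64_t bits_of(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* NA-style pattern from 16-bit words; the middle words are zeroed explicitly. */
static double na_from_words16(void)
{
    const uint16_t w[4] = { 0x7ff0, 0x0000, 0x0000, 1954 };
    return from_words16(w);
}
```

0x7ff0 is 32752, which is within the 0 to 65535 range of a 16-bit unsigned word. The result is still a 64-bit value whose low 32 bits hold 1954, so no bits are freed compared to the two-word version.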
>
> Best wishes,
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr. 90-92
> 050663 Bucharest sector 5
> Romania
> https://adriandusa.eu
>
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel