[Rd] Possible NA Propagation Failure in RISC-V64 CPU?

Jane He
Fri Feb 24 00:39:08 CET 2023


Hi all,

I am currently compiling R for a RISC-V64 CPU and I think I have discovered
an NA propagation failure.

How R implements NA (not available) and NaN (not-a-number) is explained in
detail here:
https://stat.ethz.ch/pipermail/r-devel/2014-February/068380.html.

In short, according to my understanding of R's convention, any calculation
involving NA but no NaN should result in NA (called NA propagation), and
any calculation involving NaN but no NA should result in NaN. Calculations
involving both NA and NaN can result in either value.
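
As a minimal sketch of how the two values can be told apart at the C level,
assuming 64-bit IEEE 754 doubles: NA is a NaN whose low 32 bits hold 1954
(0x7a2), and every other NaN counts as a plain NaN. The helper names
(bits_of, from_bits, is_na, is_plain_nan) are hypothetical and are not taken
from R's sources.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Hypothetical helper: raw bit pattern of a double. */
static uint64_t bits_of(double x) {
    uint64_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}

/* Hypothetical helper: build a double from a raw bit pattern. */
static double from_bits(uint64_t b) {
    double x;
    memcpy(&x, &b, sizeof x);
    return x;
}

/* NA: a NaN whose low 32 bits are 1954 (0x7a2). */
static int is_na(double x) {
    return isnan(x) && (uint32_t)(bits_of(x) & 0xffffffffu) == 1954u;
}

/* NaN in R's sense: any NaN that is not NA. */
static int is_plain_nan(double x) {
    return isnan(x) && !is_na(x);
}
```

With these helpers, is_na(from_bits(0x7ff00000000007a2ULL)) is true, while
from_bits(0x7ff8000000000000ULL) is classified as a plain NaN.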

While many R functions handle this logic in their source code, basic
arithmetic operations such as +, -, * and / leave it to the hardware.
However, the RISC-V64 CPU does not behave as expected, at least not the CPU
I am using (StarFive JH7100-7110).

Here are the relevant bit patterns. From my understanding, IEEE 754 only
specifies which bit patterns count as NaN (all exponent bits set, non-zero
mantissa), so R picks one of those patterns (the one ending in 07a2, i.e.
1954 in decimal) and declares it to be NA.

# print_hex is a function to print the bit pattern in hex
> print_hex(0.1)
3fb999999999999a # same for RISC-V
> print_hex(NaN)
7ff8000000000000 # same for RISC-V
> print_hex(NA)
7ff00000000007a2 # same for RISC-V
> print_hex(NA+1)
7ff80000000007a2 # 7ff8000000000000 in RISC-V
> print_hex(NA*1)
7ff80000000007a2 # 7ff8000000000000 in RISC-V
> print_hex(NaN*1)
7ff8000000000000 # same for RISC-V
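
For reference, a helper like print_hex could look roughly like the C sketch
below. This is an assumed stand-in, not the function used to produce the
output above; hex_of_double is a hypothetical name, and the code assumes
64-bit IEEE 754 doubles.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Writes the 16 hex digits of x's bit pattern, plus a NUL, into out. */
static void hex_of_double(double x, char out[17]) {
    uint64_t b;
    memcpy(&b, &x, sizeof b);   /* copy the bits without converting the value */
    snprintf(out, 17, "%016" PRIx64, b);
}

/* Hypothetical stand-in for print_hex: prints the bit pattern on one line. */
static void print_hex(double x) {
    char buf[17];
    hex_of_double(x, buf);
    puts(buf);
}
```

Copying the bits with memcpy avoids any floating-point operation on the
value, so the pattern that is printed is the pattern that was stored.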


Therefore, on RISC-V64, all basic arithmetic operations involving NA give
NaN.

> NA+1
[1] NaN
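
The same behaviour can be probed outside of R. The sketch below, assuming
64-bit IEEE 754 doubles, builds the NA bit pattern, adds 1.0 to it at run
time, and returns the bits of the result so that the low word can be
inspected. The function names are hypothetical. The result depends on the
hardware, so no expected output is given.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Adds 1.0 to the NA bit pattern and returns the raw bits of the sum. */
static uint64_t na_plus_one_bits(void) {
    const uint64_t na_bits = 0x7ff00000000007a2ULL;
    double na;
    memcpy(&na, &na_bits, sizeof na);

    /* volatile keeps the compiler from folding the addition at compile time */
    volatile double lhs = na;
    volatile double rhs = 1.0;
    double sum = lhs + rhs;

    uint64_t out;
    memcpy(&out, &sum, sizeof out);
    return out;
}

/* True if the low 32 bits still carry the NA marker 0x7a2 (1954). */
static int payload_survived(uint64_t bits) {
    return (bits & 0xffffffffULL) == 0x7a2ULL;
}
```

Calling payload_survived(na_plus_one_bits()) tells whether the addition kept
the payload. My understanding is that the RISC-V F and D extensions specify
a canonical NaN as the result of arithmetic that produces NaN, which would
match the 7ff8000000000000 reported above, but this should be checked
against the ISA manual.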

This failure in NA propagation may cause many R packages, such as mice, to
not work properly, and it causes the `make check` test in the `stats`
package to fail. Example from the make check test:

xn <- 1:4
yn <- c(1,NA,3:4)
xout <- (1:9)/2
# failure: some values should be NA but turn out to be NaN
data.frame(approx(xn,yn, xout, na.rm=FALSE, rule = 2))

I am reaching out to the R community looking for help in solving this
problem. Does anyone around here have any hints or ideas on how to solve
this issue?

Currently, my hacky workaround is to intercept NA operands before they reach
the hardware and return NA directly. However, this may penalize performance
significantly, so I am looking for alternative ideas.
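
To make the idea concrete, here is a rough C sketch of one variation on that
workaround. It is not the code I am running: instead of testing the operands
before every operation, it lets the hardware add first and inspects the
operands only when the result is NaN, so the common non-NaN path pays for a
single isnan test. All names (na_real, is_na, add_propagating_na) are
hypothetical, and 64-bit IEEE 754 doubles are assumed.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Hypothetical helper: the NA bit pattern as a double. */
static double na_real(void) {
    const uint64_t bits = 0x7ff00000000007a2ULL;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* NA: a NaN whose low 32 bits are 1954 (0x7a2). */
static int is_na(double x) {
    uint64_t b;
    memcpy(&b, &x, sizeof b);
    return isnan(x) && (uint32_t)(b & 0xffffffffu) == 1954u;
}

/* Adds x and y; if the sum is NaN and either operand was NA, returns NA. */
static double add_propagating_na(double x, double y) {
    double r = x + y;
    if (isnan(r) && (is_na(x) || is_na(y)))
        return na_real();   /* restore the payload the hardware dropped */
    return r;               /* ordinary result, or a plain NaN */
}
```

For example, add_propagating_na(na_real(), 1.0) yields NA regardless of what
the hardware does with the payload, while add_propagating_na(NAN, 1.0) stays
a plain NaN. Whether this is cheap enough in R's vectorized loops is
something I have not measured.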

Thank you for your time and consideration!

Best regards,
Jane He

_________
University of California, Irvine
Student of Master of Software Engineering
2022-2023 cohort
