[Rd] NA in doc for options(matprod="default")

Mon Feb 17 19:40:37 CET 2020

On 2/17/20 6:18 PM, Serguei Sokol wrote:
> Le 17/02/2020 à 17:50, Tomas Kalibera a écrit :
>> On 2/17/20 5:36 PM, Serguei Sokol wrote:
>>> Hi,
>>>
>>> A colleague of mine pointed out to me a passage in the doc ?options
>>> that talks about the Inf and NaN check in the 'matprod=default' section:
>>> https://stat.ethz.ch/R-manual/R-devel/library/base/html/options.html
>>>
>>> I am wondering if NA should be mentioned as well, since the check
>>> seems to include this "value" too. As NA is different from Inf and
>>> NaN, it is worth mentioning, isn't it?
>>
>> Yes, NA is handled, too. NA is one of the NaN values for the purpose
>> of this text
> Thanks for the clarification. It was not clear to me from the text itself.

>
>> (and it is also implemented that way, see ?NaN).
>  Indeed, the text of ?NaN says "... systems typically have
>      many different NaN values.  One of these is used for the numeric
>      missing value ‘NA’, and ‘is.nan’ is false for that value."

> However, R can return both NA and NaN symbols, e.g.
>
> > mean(c(1, NA))
> [1] NA
> > mean(c(1, NaN))
> [1] NaN
>
> which does not help to understand their relationship. 

> That's why I continue to think that it would be clearer to mention NA
> explicitly in options(matprod="default"). It could be phrased like
> "... ensure correct propagation of Inf and NaN (including NA) ..."

I've intentionally left that out of this part of the text in ?options.
There is little point in talking about how NA propagates through
computation, because NaNs may become NAs and vice versa (see ?NaN).
Intuitively, it would be nice if NAs were distinct: if evaluating, say,
a pure function resulted in NA iff at least one of its inputs was NA. In
more complicated situations it would be hard to define what the correct
result should be, but even in the simpler cases this no longer works in
R. We lost this with architectural changes in CPUs, which no longer
define the payload of NaNs resulting from elementary floating point
operations, so we would always have to check explicitly, with a lot of
effort and additional performance overhead. Also, we could not hope for
the distinction to survive external code (such as BLAS or LAPACK) that
is not aware of R's notion of NA.
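
As a rough illustration of the NA/NaN relationship described above, here
is a minimal sketch using only base R. The variable names are made up
for the example, and the result of mixing NA and NaN is deliberately
only tested with is.na(), since whether it comes out as NA or NaN may
depend on the platform and compiler.

```r
# NA_real_ is stored as one particular NaN bit pattern, so is.na() is
# TRUE for both NA and NaN, while is.nan() is TRUE only for other NaNs.
x <- c(NA_real_, NaN, Inf, 1)
na_flags  <- is.na(x)    # TRUE  TRUE FALSE FALSE
nan_flags <- is.nan(x)   # FALSE TRUE FALSE FALSE

# Mixing the two: the result is missing in the is.na() sense, but it
# is not guaranteed whether it is reported as NA or as NaN.
mixed <- NA_real_ + NaN
mixed_is_missing <- is.na(mixed)
print(mixed)
```

Code that needs to treat both kinds of value alike can rely on is.na()
rather than on the distinction surviving arithmetic.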

The whole "matprod" entry in ?options is about the propagation of
standard floating point non-finite values (NaN, Inf) through matrix
multiplication. The naive 3-loop algorithm, built with a correct
compiler (one that follows the standard in implementing floating point
operations), is regarded as producing the correct results. Some BLAS
implementations produce different results due to optimizations in their
code (which is not the naive 3-loop algorithm) and likely also due to
aggressive compiler optimizations that violate the standard. R users
can choose based on their preference; the differences in performance
can be significant.
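
To make the "naive 3-loop" reference concrete, here is a small sketch
written at the R level. naive_matprod() is a hypothetical helper
written for this example, not the C code R actually uses; it is
compared with %*% under options(matprod = "internal"), one of the
documented values of that option. The input matrices are made up to
exercise Inf * 0 and NaN.

```r
# Reference implementation: the naive 3-loop matrix product.
naive_matprod <- function(a, b) {
  stopifnot(ncol(a) == nrow(b))
  out <- matrix(0, nrow(a), ncol(b))
  for (i in seq_len(nrow(a)))
    for (j in seq_len(ncol(b)))
      for (k in seq_len(ncol(a)))
        out[i, j] <- out[i, j] + a[i, k] * b[k, j]
  out
}

a <- matrix(c(1, 0, Inf, 2), nrow = 2)   # a[1, 2] is Inf
b <- matrix(c(3, 0, 4, NaN), nrow = 2)   # b[2, 1] is 0, b[2, 2] is NaN

old <- options(matprod = "internal")     # ask for R's own non-BLAS code
internal <- a %*% b
options(old)                             # restore the previous setting

reference <- naive_matprod(a, b)
print(reference)
print(internal)
```

Here Inf * 0 in the first row yields NaN, and the NaN in b spreads to
the second column, so only the [2, 1] entry stays finite. Repeating the
comparison with matprod = "blas" would show whether a given BLAS
propagates these values in the same way.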

Best
Tomas
>
> Best,
> Serguei.
>