[Rd] Recent changes to as.complex(NA_real_)

Sat Sep 23 19:22:35 CEST 2023

It sounds like we need to add arguments (with sensible defaults) to complex(), Re(), Im(), is.na.complex() etc to allow the user to specify the desired behavior.

--  
Change your thoughts and you change the world.
--Dr. Norman Vincent Peale

> On Sep 23, 2023, at 12:37 PM, Mikael Jagan <jaganmn2 using gmail.com> wrote:
> 
> 
> 
> On 2023-09-23 9:43 am, Martin Maechler wrote:
>>>>>>> Hervé Pagès
>>>>>>>     on Fri, 22 Sep 2023 16:55:05 -0700 writes:
>>     > The problem is that you have things that are
>>     > **semantically** different but look exactly the same:
>>     > They look the same:
>>     >> x
>>     > [1] NA
>>     >> y
>>     > [1] NA
>>     >> z
>>     > [1] NA
>>     >> is.na(x)
>>     > [1] TRUE
>>     >> is.na(y)
>>     > [1] TRUE
>>     >> is.na(z)
>>     > [1] TRUE
>>     >> str(x)
>>     >   cplx NA
>>     >> str(y)
>>     >   num NA
>>     >> str(z)
>>     >   cplx NA
>>     > but they are semantically different e.g.
>>     >> Re(x)
>>     > [1] NA
>>     >> Re(y)
>>     > [1] -0.5  # surprise!
>>     >> Im(x)  # surprise!
>>     > [1] 2
>>     >> Im(z)
>>     > [1] NA
>>     > so any expression involving Re() or Im() will produce
>>     > different results on input that look the same on the
>>     > surface.
>>     > You can address this either by normalizing the internal
>>     > representation of complex NA to always be complex(r=NaN,
>>     > i=NA_real_), like for NA_complex_, or by allowing the
>>     > infinite variations that are currently allowed and at the
>>     > same time making sure that both Re() and Im()  always
>>     > return NA_real_ on a complex NA.
>>     > My point is that the behavior of complex NA should be
>>     > predictable. Right now it's not. Once it's predictable
>>     > (with Re() and Im() both returning NA_real_ regardless of
>>     > internal representation), then it no longer matters what
>>     > kind of complex NA is returned by as.complex(NA_real_),
>>     > because they are no onger distinguishable.
>>     > H.
>>     > On 9/22/23 13:43, Duncan Murdoch wrote:
>>     >> Since the result of is.na(x) is the same on each of
>>     >> those, I don't see a problem.  As long as that is
>>     >> consistent, I don't see a problem. You shouldn't be using
>>     >> any other test for NA-ness.  You should never be
>>     >> expecting identical() to treat different types as the
>>     >> same (e.g.  identical(NA, NA_real_) is FALSE, as it
>>     >> should be).  If you are using a different test, that's
>>     >> user error.
>>     >>
>>     >> Duncan Murdoch
>>     >>
>>     >> On 22/09/2023 2:41 p.m., Hervé Pagès wrote:
>>     >>> We could also question the value of having an infinite
>>     >>> number of NA representations in the complex space. For
>>     >>> example all these complex values are displayed the same
>>     >>> way (as NA), are considered NAs by is.na(), but are not
>>     >>> identical or semantically equivalent (from an Re() or
>>     >>> Im() point of view):
>>     >>>
>>     >>>       NA_real_ + 0i
>>     >>>
>>     >>>       complex(r=NA_real_, i=Inf)
>>     >>>
>>     >>>       complex(r=2, i=NA_real_)
>>     >>>
>>     >>>       complex(r=NaN, i=NA_real_)
>>     >>>
>>     >>> In other words, using a single representation for
>>     >>> complex NA (i.e.  complex(r=NA_real_, i=NA_real_)) would
>>     >>> avoid a lot of unnecessary complications and surprises.
>>     >>>
>>     >>> Once you do that, whether as.complex(NA_real_) should
>>     >>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
>>     >>> i=NA_real_) becomes a moot point.
>>     >>>
>>     >>> Best,
>>     >>>
>>     >>> H.
>> Thank you, Hervé.
>> Your proposition is yet another one,
>> to declare that all complex NA's should be treated as identical
>> (almost/fully?) everywhere.
>> This would be a possibility, but I think a drastic one.
>> I think there are too many cases, where I want to keep the
>> information of the real part independent of the values of the
>> imaginary part (e.g. think of the Riemann hypothesis), and
>> typically vice versa.
>> With your proposal, for a (potentially large) vector of complex numbers,
>> after
>>       Re(z)  <-  1/2
>> I could no longer rely on   Re(z) == 1/2,
>> because it would be wrong for those z where (the imaginary part/ the number)
>> was NA/NaN.
>> Also, in a similar case, a
>>       Im(z) <- NA
>> would have to "destroy" all real parts  Re(z);
>> not really typically in memory, but effectively for the user,  Re(z)
>> would be all NA/NaN.
>> And I think there are quite a few other situations
>> where looking at Re() and Im() separately makes a lot of sense.
> 
> Indeed, and there is no way to "tell" BLAS and LAPACK to treat both the real and
> imaginary parts as NA_REAL when either is NA_REAL.  Hence the only reliable way
> to implement such a proposal would be to post-process the result of any
> computation returning a complex type, testing for NA_REAL and setting both parts
> to NA_REAL in that case.  My expectation is that such testing would drastically
> slow down basic arithmetic and algebraic operations ...
> 
> Mikael
> 
>> Spencer also made a remark in this direction.
>> All in all I'd be very reluctant to move in this direction;
>> but yes, I'm just one person ... let's continue musing and
>> considering !
>> Martin
>>     >>> On 9/22/23 03:38, Martin Maechler wrote:
>>     >>>>>>>>> Mikael Jagan       on Thu, 21 Sep 2023 00:47:39
>>     >>>>>>>>> -0400 writes:
>>     >>>>       > Revisiting this thread from April:
>>     >>>>
>>     >>>> >https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
>>     >>>>
>>     >>>>       > where the decision (not yet backported) was
>>     >>>> made for       > as.complex(NA_real_) to give
>>     >>>> NA_complex_ instead of       > complex(r=NA_real_,
>>     >>>> i=0), to be consistent with       > help("as.complex")
>>     >>>> and as.complex(NA) and as.complex(NA_integer_).
>>     >>>>
>>     >>>>       > Was any consideration given to the alternative?
>>     >>>>       > That is, to changing as.complex(NA) and
>>     >>>> as.complex(NA_integer_) to       > give
>>     >>>> complex(r=NA_real_, i=0), consistent with       >
>>     >>>> as.complex(NA_real_), then amending help("as.complex")
>>     >>>>       > accordingly?
>>     >>>>
>>     >>>> Hmm, as, from R-core, mostly I was involved, I admit to
>>     >>>> say "no", to my knowledge the (above) alternative
>>     >>>> wasn't considered.
>>     >>>>
>>     >>>>     > The principle that     >
>>     >>>> Im(as.complex(<real=(double|integer|logical)>)) should
>>     >>>> be zero     > is quite fundamental, in my view, hence
>>     >>>> the "new" behaviour     > seems to really violate the
>>     >>>> principle of least surprise ...
>>     >>>>
>>     >>>> of course "least surprise"  is somewhat subjective.
>>     >>>> Still, I clearly agree that the above would be one
>>     >>>> desirable property.
>>     >>>>
>>     >>>> I think that any solution will lead to *some* surprise
>>     >>>> for some cases, I think primarily because there are
>>     >>>> *many* different values z  for which  is.na(z)  is
>>     >>>> true,  and in any case NA_complex_  is only of the
>>     >>>> many.
>>     >>>>
>>     >>>> I also agree with Mikael that we should reconsider the
>>     >>>> issue that was raised by Davis Vaughan here ("on
>>     >>>> R-devel") last April.
>>     >>>>
>>     >>>>       > Another (but maybe weaker) argument is that
>>     >>>>       > double->complex coercions happen more often
>>     >>>> than       > logical->complex and integer->complex
>>     >>>> ones. Changing the       > behaviour of the more
>>     >>>> frequently performed coercion is       > more likely to
>>     >>>> affect code "out there".
>>     >>>>
>>     >>>>       > Yet another argument is that one expects
>>     >>>>
>>     >>>>       >      identical(as.complex(NA_real_), NA_real_ +
>>     >>>> (0+0i))
>>     >>>>
>>     >>>>       > to be TRUE, i.e., that coercing from double to
>>     >>>> complex is       > equivalent to adding a complex
>>     >>>> zero.  The new behaviour       > makes the above FALSE,
>>     >>>> since NA_real_ + (0+0i) gives       >
>>     >>>> complex(r=NA_real_, i=0).
>>     >>>>
>>     >>>> No!  --- To my own surprise (!) --- in current R-devel
>>     >>>> the above is TRUE, and         NA_real_ + (0+0i)  , the
>>     >>>> same as         NA_real_ + 0i      , really gives
>>     >>>> complex(r=NA, i=NA) :
>>     >>>>
>>     >>>> Using showC() from ?complex
>>     >>>>
>>     >>>>     showC <- function(z) noquote(sprintf("(R = %g, I =
>>     >>>> %g)", Re(z), Im(z)))
>>     >>>>
>>     >>>> we see (in R-devel) quite consistently
>>     >>>>
>>     >>>>> showC(NA_real_ + 0i)
>>     >>>> [1] (R = NA, I = NA)
>>     >>>>> showC(NA       + 0i)  # NA is 'logical'
>>     >>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
>>     >>>> "R-patched" -- *in*consistently
>>     >>>>
>>     >>>>> showC(NA_real_ + 0i)
>>     >>>> [1] (R = NA, I = 0)
>>     >>>>> showC(NA + 0i)
>>     >>>> [1] (R = NA, I = NA) .... and honestly, I do not see
>>     >>>> *where* (and when) we changed the underlying code (in
>>     >>>> arithmetic.c !?)  in R-devel to *also* produce
>>     >>>> NA_complex_  in such complex *arithmetic*
>>     >>>>
>>     >>>>
>>     >>>>       > Having said that, one might also (but more
>>     >>>> naively) expect
>>     >>>>
>>     >>>>       >
>>     >>>> identical(as.complex(as.double(NA_complex_)),
>>     >>>> NA_complex_)
>>     >>>>
>>     >>>>       > to be TRUE.
>>     >>>>
>>     >>>> as in current R-devel
>>     >>>>
>>     >>>>       > Under my proposal it continues to be FALSE.
>>     >>>>
>>     >>>> as in "R-release"
>>     >>>>
>>     >>>>       > Well, I'd prefer if it gave FALSE with a
>>     >>>> warning       > "imaginary parts discarded in
>>     >>>> coercion", but it seems that       >
>>     >>>> as.double(complex(r=a, i=b)) never warns when either of
>>     >>>>       > 'a' and 'b' is NA_real_ or NaN, even where
>>     >>>> "information"       > {nonzero 'b'} is clearly lost ...
>>     >>>>
>>     >>>> The question of *warning* here is related indeed, but I
>>     >>>> think we should try to look at it only *secondary* to
>>     >>>> your first proposal.
>>     >>>>
>>     >>>>       > Whatever decision is made about
>>     >>>> as.complex(NA_real_),       > maybe these points should
>>     >>>> be weighed before it becomes part of       > R-release
>>     >>>> ...
>>     >>>>
>>     >>>>       > Mikael
>>     >>>>
>>     >>>> Indeed.
>>     >>>>
>>     >>>> Can we please get other opinions / ideas here?
>>     >>>>
>>     >>>> Thank you in advance for your thoughts!  Martin
>>     >>>>
>>     >>>> ---
>>     >>>>
>>     >>>> PS:
>>     >>>>
>>     >>>>    Our *print()*ing  of complex NA's ("NA" here meaning
>>     >>>> NA or NaN)    is also unsatisfactory, e.g. in the case
>>     >>>> where all entries of a    vector are NA in the sense of
>>     >>>> is.na(.), but their    Re() and Im() are not all NA:
>>     >>>>     showC <- function(z) noquote(sprintf("(R = %g, I =
>>     >>>> %g)", Re(z), Im(z)))     z <- complex(, c(11, NA, NA),
>>     >>>> c(NA, 99, NA))     z     showC(z)
>>     >>>>
>>     >>>> gives
>>     >>>>
>>     >>>>     > z     [1] NA NA NA     > showC(z)     [1] (R =
>>     >>>> 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
>>     >>>>
>>     >>>> but that (printing of complex) *is* another issue, in
>>     >>>> which we have the re-opened bugzilla PR#16752
>>     >>>> ==>https://bugs.r-project.org/show_bug.cgi?id=16752
>>     >>>>
>>     >>>> on which we also worked during the R Sprint in Warwick
>>     >>>> three weeks ago, and where I want to commit changes in
>>     >>>> any case {but think we should change even a bit more
>>     >>>> than we got to during the Sprint}.
>>     >>>>
>>     >>>> ______________________________________________
>>     >>>> R-devel using r-project.org  mailing list
>>     >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>     >>>
>>     >>
>>     > --
>>     > Hervé Pagès
>>     > Bioconductor Core Team hpages.on.github using gmail.com
>> 
> 
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

	[[alternative HTML version deleted]]