[Rd] Recent changes to as.complex(NA_real_)
Mikael Jagan
j@g@nmn2 @end|ng |rom gm@||@com
Sat Sep 23 18:36:55 CEST 2023
On 2023-09-23 9:43 am, Martin Maechler wrote:
>>>>>> Hervé Pagès
>>>>>> on Fri, 22 Sep 2023 16:55:05 -0700 writes:
>
> > The problem is that you have things that are
> > **semantically** different but look exactly the same:
>
> > They look the same:
>
> >> x
> > [1] NA
> >> y
> > [1] NA
> >> z
> > [1] NA
>
> >> is.na(x)
> > [1] TRUE
> >> is.na(y)
> > [1] TRUE
> >> is.na(z)
> > [1] TRUE
>
> >> str(x)
> > cplx NA
> >> str(y)
> > num NA
> >> str(z)
> > cplx NA
>
> > but they are semantically different e.g.
>
> >> Re(x)
> > [1] NA
> >> Re(y)
> > [1] -0.5 # surprise!
>
> >> Im(x) # surprise!
> > [1] 2
> >> Im(z)
> > [1] NA
>
> > so any expression involving Re() or Im() will produce
> > different results on input that look the same on the
> > surface.
>
> > You can address this either by normalizing the internal
> > representation of complex NA to always be complex(r=NaN,
> > i=NA_real_), like for NA_complex_, or by allowing the
> > infinite variations that are currently allowed and at the
> > same time making sure that both Re() and Im() always
> > return NA_real_ on a complex NA.
>
> > My point is that the behavior of complex NA should be
> > predictable. Right now it's not. Once it's predictable
> > (with Re() and Im() both returning NA_real_ regardless of
> > internal representation), then it no longer matters what
> > kind of complex NA is returned by as.complex(NA_real_),
> > because they are no onger distinguishable.
>
> > H.
>
> > On 9/22/23 13:43, Duncan Murdoch wrote:
> >> Since the result of is.na(x) is the same on each of
> >> those, I don't see a problem. As long as that is
> >> consistent, I don't see a problem. You shouldn't be using
> >> any other test for NA-ness. You should never be
> >> expecting identical() to treat different types as the
> >> same (e.g. identical(NA, NA_real_) is FALSE, as it
> >> should be). If you are using a different test, that's
> >> user error.
> >>
> >> Duncan Murdoch
> >>
> >> On 22/09/2023 2:41 p.m., Hervé Pagès wrote:
> >>> We could also question the value of having an infinite
> >>> number of NA representations in the complex space. For
> >>> example all these complex values are displayed the same
> >>> way (as NA), are considered NAs by is.na(), but are not
> >>> identical or semantically equivalent (from an Re() or
> >>> Im() point of view):
> >>>
> >>> NA_real_ + 0i
> >>>
> >>> complex(r=NA_real_, i=Inf)
> >>>
> >>> complex(r=2, i=NA_real_)
> >>>
> >>> complex(r=NaN, i=NA_real_)
> >>>
> >>> In other words, using a single representation for
> >>> complex NA (i.e. complex(r=NA_real_, i=NA_real_)) would
> >>> avoid a lot of unnecessary complications and surprises.
> >>>
> >>> Once you do that, whether as.complex(NA_real_) should
> >>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
> >>> i=NA_real_) becomes a moot point.
> >>>
> >>> Best,
> >>>
> >>> H.
>
> Thank you, Hervé.
> Your proposition is yet another one,
> to declare that all complex NA's should be treated as identical
> (almost/fully?) everywhere.
>
> This would be a possibility, but I think a drastic one.
>
> I think there are too many cases, where I want to keep the
> information of the real part independent of the values of the
> imaginary part (e.g. think of the Riemann hypothesis), and
> typically vice versa.
>
> With your proposal, for a (potentially large) vector of complex numbers,
> after
> Re(z) <- 1/2
>
> I could no longer rely on Re(z) == 1/2,
> because it would be wrong for those z where (the imaginary part/ the number)
> was NA/NaN.
> Also, in a similar case, a
>
> Im(z) <- NA
>
> would have to "destroy" all real parts Re(z);
> not really typically in memory, but effectively for the user, Re(z)
> would be all NA/NaN.
>
> And I think there are quite a few other situations
> where looking at Re() and Im() separately makes a lot of sense.
Indeed, and there is no way to "tell" BLAS and LAPACK to treat both the real and
imaginary parts as NA_REAL when either is NA_REAL. Hence the only reliable way
to implement such a proposal would be to post-process the result of any
computation returning a complex type, testing for NA_REAL and setting both parts
to NA_REAL in that case. My expectation is that such testing would drastically
slow down basic arithmetic and algebraic operations ...
Mikael
>
> Spencer also made a remark in this direction.
>
> All in all I'd be very reluctant to move in this direction;
> but yes, I'm just one person ... let's continue musing and
> considering !
>
> Martin
>
> >>> On 9/22/23 03:38, Martin Maechler wrote:
> >>>>>>>>> Mikael Jagan on Thu, 21 Sep 2023 00:47:39
> >>>>>>>>> -0400 writes:
> >>>> > Revisiting this thread from April:
> >>>>
> >>>> >https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
> >>>>
> >>>> > where the decision (not yet backported) was
> >>>> made for > as.complex(NA_real_) to give
> >>>> NA_complex_ instead of > complex(r=NA_real_,
> >>>> i=0), to be consistent with > help("as.complex")
> >>>> and as.complex(NA) and as.complex(NA_integer_).
> >>>>
> >>>> > Was any consideration given to the alternative?
> >>>> > That is, to changing as.complex(NA) and
> >>>> as.complex(NA_integer_) to > give
> >>>> complex(r=NA_real_, i=0), consistent with >
> >>>> as.complex(NA_real_), then amending help("as.complex")
> >>>> > accordingly?
> >>>>
> >>>> Hmm, as, from R-core, mostly I was involved, I admit to
> >>>> say "no", to my knowledge the (above) alternative
> >>>> wasn't considered.
> >>>>
> >>>> > The principle that >
> >>>> Im(as.complex(<real=(double|integer|logical)>)) should
> >>>> be zero > is quite fundamental, in my view, hence
> >>>> the "new" behaviour > seems to really violate the
> >>>> principle of least surprise ...
> >>>>
> >>>> of course "least surprise" is somewhat subjective.
> >>>> Still, I clearly agree that the above would be one
> >>>> desirable property.
> >>>>
> >>>> I think that any solution will lead to *some* surprise
> >>>> for some cases, I think primarily because there are
> >>>> *many* different values z for which is.na(z) is
> >>>> true, and in any case NA_complex_ is only of the
> >>>> many.
> >>>>
> >>>> I also agree with Mikael that we should reconsider the
> >>>> issue that was raised by Davis Vaughan here ("on
> >>>> R-devel") last April.
> >>>>
> >>>> > Another (but maybe weaker) argument is that
> >>>> > double->complex coercions happen more often
> >>>> than > logical->complex and integer->complex
> >>>> ones. Changing the > behaviour of the more
> >>>> frequently performed coercion is > more likely to
> >>>> affect code "out there".
> >>>>
> >>>> > Yet another argument is that one expects
> >>>>
> >>>> > identical(as.complex(NA_real_), NA_real_ +
> >>>> (0+0i))
> >>>>
> >>>> > to be TRUE, i.e., that coercing from double to
> >>>> complex is > equivalent to adding a complex
> >>>> zero. The new behaviour > makes the above FALSE,
> >>>> since NA_real_ + (0+0i) gives >
> >>>> complex(r=NA_real_, i=0).
> >>>>
> >>>> No! --- To my own surprise (!) --- in current R-devel
> >>>> the above is TRUE, and NA_real_ + (0+0i) , the
> >>>> same as NA_real_ + 0i , really gives
> >>>> complex(r=NA, i=NA) :
> >>>>
> >>>> Using showC() from ?complex
> >>>>
> >>>> showC <- function(z) noquote(sprintf("(R = %g, I =
> >>>> %g)", Re(z), Im(z)))
> >>>>
> >>>> we see (in R-devel) quite consistently
> >>>>
> >>>>> showC(NA_real_ + 0i)
> >>>> [1] (R = NA, I = NA)
> >>>>> showC(NA + 0i) # NA is 'logical'
> >>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
> >>>> "R-patched" -- *in*consistently
> >>>>
> >>>>> showC(NA_real_ + 0i)
> >>>> [1] (R = NA, I = 0)
> >>>>> showC(NA + 0i)
> >>>> [1] (R = NA, I = NA) .... and honestly, I do not see
> >>>> *where* (and when) we changed the underlying code (in
> >>>> arithmetic.c !?) in R-devel to *also* produce
> >>>> NA_complex_ in such complex *arithmetic*
> >>>>
> >>>>
> >>>> > Having said that, one might also (but more
> >>>> naively) expect
> >>>>
> >>>> >
> >>>> identical(as.complex(as.double(NA_complex_)),
> >>>> NA_complex_)
> >>>>
> >>>> > to be TRUE.
> >>>>
> >>>> as in current R-devel
> >>>>
> >>>> > Under my proposal it continues to be FALSE.
> >>>>
> >>>> as in "R-release"
> >>>>
> >>>> > Well, I'd prefer if it gave FALSE with a
> >>>> warning > "imaginary parts discarded in
> >>>> coercion", but it seems that >
> >>>> as.double(complex(r=a, i=b)) never warns when either of
> >>>> > 'a' and 'b' is NA_real_ or NaN, even where
> >>>> "information" > {nonzero 'b'} is clearly lost ...
> >>>>
> >>>> The question of *warning* here is related indeed, but I
> >>>> think we should try to look at it only *secondary* to
> >>>> your first proposal.
> >>>>
> >>>> > Whatever decision is made about
> >>>> as.complex(NA_real_), > maybe these points should
> >>>> be weighed before it becomes part of > R-release
> >>>> ...
> >>>>
> >>>> > Mikael
> >>>>
> >>>> Indeed.
> >>>>
> >>>> Can we please get other opinions / ideas here?
> >>>>
> >>>> Thank you in advance for your thoughts! Martin
> >>>>
> >>>> ---
> >>>>
> >>>> PS:
> >>>>
> >>>> Our *print()*ing of complex NA's ("NA" here meaning
> >>>> NA or NaN) is also unsatisfactory, e.g. in the case
> >>>> where all entries of a vector are NA in the sense of
> >>>> is.na(.), but their Re() and Im() are not all NA:
> >>>> showC <- function(z) noquote(sprintf("(R = %g, I =
> >>>> %g)", Re(z), Im(z))) z <- complex(, c(11, NA, NA),
> >>>> c(NA, 99, NA)) z showC(z)
> >>>>
> >>>> gives
> >>>>
> >>>> > z [1] NA NA NA > showC(z) [1] (R =
> >>>> 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
> >>>>
> >>>> but that (printing of complex) *is* another issue, in
> >>>> which we have the re-opened bugzilla PR#16752
> >>>> ==>https://bugs.r-project.org/show_bug.cgi?id=16752
> >>>>
> >>>> on which we also worked during the R Sprint in Warwick
> >>>> three weeks ago, and where I want to commit changes in
> >>>> any case {but think we should change even a bit more
> >>>> than we got to during the Sprint}.
> >>>>
> >>>> ______________________________________________
> >>>> R-devel using r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
> > --
> > Hervé Pagès
>
> > Bioconductor Core Team hpages.on.github using gmail.com
>
>
>
More information about the R-devel
mailing list