[Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444?

Dirk Eddelbuettel edd @end|ng |rom deb|@n@org
Tue Aug 26 00:09:18 CEST 2025


On 25 August 2025 at 17:56, Martin Maechler wrote:
| >>>>> Dirk Eddelbuettel 
| >>>>>     on Mon, 25 Aug 2025 05:46:56 -0500 writes:
| I really think a solution must happen in C code.

I tend to agree here. Had I seen an easy and obvious path to a fix, I would
have proposed a patch from the outset. As I did not, I started this email
thread to open a discussion instead.

The only solution I can think of right now is the 'obvious' one: pass both
the real and the imaginary parts down, so that the mean of the imaginary part
can skip the entries where the real part is NA. A new one-off function just
for complex data seems like overkill, but maybe it is the way to go (a single
pass that computes the real and imaginary parts together).
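
To make the one-pass idea concrete, here is a rough standalone C sketch. It
is not taken from the R sources: the `cplx` struct and the `col_means_cplx`
name are made up for illustration, plain isnan() stands in for R's NA/NaN
checks, and treating an element as missing whenever either part is NaN is an
assumption about the semantics we would want.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for a complex element; not R's own type. */
typedef struct { double r; double i; } cplx;

/* Column means of an nrow x ncol column-major complex matrix, one pass.
 * An element counts as missing when either part is NaN (an assumption).
 * With na_rm != 0 missing elements are skipped, so the real and the
 * imaginary sums always share the same divisor. */
void col_means_cplx(const cplx *x, size_t nrow, size_t ncol,
                    int na_rm, cplx *out)
{
    for (size_t j = 0; j < ncol; j++) {
        const cplx *col = x + j * nrow;
        double sr = 0.0, si = 0.0;
        size_t cnt = 0;
        for (size_t i = 0; i < nrow; i++) {
            double re = col[i].r, im = col[i].i;
            if (isnan(re) || isnan(im)) {
                if (na_rm)
                    continue;   /* drop the whole element */
                re = NAN;       /* otherwise both parts of the mean */
                im = NAN;       /* become missing */
            }
            sr += re;
            si += im;
            cnt++;
        }
        if (cnt > 0) {
            out[j].r = sr / (double)cnt;
            out[j].i = si / (double)cnt;
        } else {
            /* nothing left to average, as with 0/0 */
            out[j].r = NAN;
            out[j].i = NAN;
        }
    }
}
```

The point of the design is only that one counter serves both sums; a row
version would differ just in how the elements are indexed.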

| Further, note that the
| 
|    if(is.complex(x))
| 	    .Internal(colMeans(Re(x), n, prod(dn), na.rm)) +
|        1i * .Internal(colMeans(Im(x), n, prod(dn), na.rm))
| 
| with na.rm = TRUE  has been *wrong* always, for all cases where
| is.na(Re(x))  and
| is.na(Im(x))  have differed ...

Prior to r88444, assigning NA to complex data set both the real and the
imaginary part, making it unlikely that the two differed. As best as I can
tell, a user would have had to set that up explicitly.

| So this problem is really a new one, only tangentially related to the
| original PR #18918  and also only loosely related Mikael's / my
| patch to fix PR #18918.

I see this differently.

Prior to r88444 / #18918 we could reliably call rowMeans() and colMeans() on
complex-valued matrices containing NA values. A unit test for Rcpp relied on
it (to validate its own row/col means method). Following r88444 this test in
Rcpp now breaks. And it breaks because the R side is now incorrect: the
imaginary part is passed down on its own, with a finite value of zero where
the real part is NA, so the divisor used for its mean is wrong. It really
looks to me like a bug that is a consequence of r88444. Note that R also did
not have a test case for row or column means of complex-valued data with NA,
or else this would have bubbled up sooner. So apart from fixing the issue we
should probably add a test too.
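
A small worked case may help show the divisor mismatch. The following is a
standalone C sketch, not R code and not the Rcpp test: the helper names
`mean_skip_nan` and `mean_im_joint` are hypothetical, and isnan() stands in
for the NA check. Take a column holding 1+2i, NA+0i and 3+4i. Averaging the
parts separately with NAs removed gives 4/2 = 2 for the real part but
6/3 = 2 for the imaginary part, whereas dropping the whole missing element
gives 6/2 = 3 for the imaginary part.

```c
#include <math.h>
#include <stddef.h>

/* Mean of v[0..n-1], skipping NaN entries (stand-in for na.rm = TRUE). */
double mean_skip_nan(const double *v, size_t n)
{
    double sum = 0.0;
    size_t cnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (!isnan(v[i])) {
            sum += v[i];
            cnt++;
        }
    }
    return cnt ? sum / (double)cnt : NAN;
}

/* Mean of im[0..n-1], skipping every entry where either re[] or im[]
 * is NaN, i.e. treating the complex element as a unit. */
double mean_im_joint(const double *re, const double *im, size_t n)
{
    double sum = 0.0;
    size_t cnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (!isnan(re[i]) && !isnan(im[i])) {
            sum += im[i];
            cnt++;
        }
    }
    return cnt ? sum / (double)cnt : NAN;
}
```

A regression test on the R side could presumably be built around the same
three values.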
 
| As you have unearthed it, I'm fine if you open an R bugzilla
| issue for it;  otherwise, I'd do it myself.

It doesn't matter to me. Maybe it is easier if you do it.

r88444 was a good change. Everybody seems to agree that assigning NA only to
the real part is cleaner. But it has a small side effect here, so let's see
about fixing that too.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org


