[Rd] max.col oddity

John Zedlewski jzedlewski at hbs.edu
Sat Dec 16 01:09:42 CET 2006


I've noticed that the max.col function with the default "random"
option often gives unexpected results. For instance, in this test, it
seems clear what the answer should be:

> # second col should always be max
> x1 = cbind(1:10, 2:11, -Inf)
>
> # this works fine
> max.col(x1, "first")
 [1] 2 2 2 2 2 2 2 2 2 2
>
> # this gives random answers
> max.col(x1)
> [1] 3 1 1 2 3 3 1 3 1 1

Ouch! max.col is randomizing across all values.
Even without infinite values, something similar can happen:

> # test 2 --- tolerance problems
>
> # clearly column 3 is the max
> x1 = cbind(-1e9 * 1:10, 1:10, 2:11)
>
> # again, first method works:
> max.col(x1, "first")
 [1] 3 3 3 3 3 3 3 3 3 3
>
> # but random doesn't
> max.col(x1)
 [1] 2 3 2 3 3 2 2 2 3 2
>

The max.col docs say " there is a relative tolerance of 1e-5, relative
to the largest entry in the row", but it's really using the maximum
absolute value entry in the row (appl/maxcol.c, line 35 in R 2.4.0).
Is this necessary for some sort of S-plus compatibility? If so, I
think it would be good to make this absolute value property very clear
in the docs, since it can cause subtle bugs (and did for me).

Personally, I think the behavior is much nicer with the following patch:

--- rplain/R-2.4.0/src/appl/maxcol.c    2006-04-09 18:19:58.000000000 -0400
+++ R-2.4.0/src/appl/maxcol.c   2006-12-14 15:30:56.000000000 -0500
@@ -26,13 +26,14 @@
       do_rand = *ties_meth == 1;

    for (r = 0; r < n_r; r++) {
-       /* first check row for any NAs and find the largest abs(entry) */
+       /* first check row for any NAs and find the largest entry */
       large = 0.0;
       isna = FALSE;
       for (c = 0; c < *nc; c++) {
           a = matrix[r + c * n_r];
           if (ISNAN(a)) { isna = TRUE; break; }
-           if (do_rand) large = fmax2(large, fabs(a));
+           if (!R_FINITE(a)) continue;
+           if (do_rand) large = fmax2(large, a);
       }
       if (isna) { maxes[r] = NA_INTEGER; continue; }

---------------- END   ----------------------

This gives the expected behavior in the two examples above.

(Sorry to crosspost to both this list and R-help, but it was suggested
that R-devel would be a more appropriate forum for this.)



More information about the R-devel mailing list