[Rd] choose(n,k) when n is almost integer

Wed Dec 23 23:42:23 CET 2009

In a previous email, i suggested two patches A and B to choose(n, k), which
solve some of its problems, but keep some of the inaccuracies of the original
implementation. I would like to suggest another patch, which i will call C
(patch-C.txt in an attachment), which eliminates the warnings obtained sometimes
from the original implementation and which is more accurate in all ranges of
the output.

For testing patch C a simpler script is sufficient, since we need not to take
care of the warnings. Namely
  http://www.cs.cas.cz/~savicky/R-devel/test_choose_1.R
which produces the output (the error is the relative error)

  > source("test_choose_1.R")
  k <= 0 max err = 0 
  k <= 10 max err = 1.332268e-15 
  k <= 20 max err = 2.442491e-15 
  k <= 30 max err = 3.774758e-15 
  k <= 40 max err = 2.553513e-14 
  k <= 50 max err = 2.88658e-14 
  k <= 60 max err = 3.197442e-14 
  k <= 70 max err = 4.396483e-14 
  k <= 80 max err = 4.685141e-14 
  k <= 90 max err = 4.907186e-14 
  k <= 100 max err = 5.084821e-14 
  k <= 110 max err = 5.373479e-14 
  k <= 120 max err = 5.551115e-14 
  k <= 130 max err = 7.41629e-14 
  k <= 140 max err = 9.592327e-14 
  k <= 150 max err = 9.636736e-14 
  k <= 160 max err = 9.725554e-14 
  k <= 170 max err = 9.947598e-14 
  k <= 180 max err = 1.04583e-13 
  k <= 190 max err = 1.088019e-13 
  k <= 200 max err = 1.090239e-13 
  minimum log2() of a wrong result for integer n : 53.32468 
  maximum error for real n : 1.090239e-13 

Increasing accuracy of choose(n, k) for n almost an integer needed to use
additional transformations of it to those already used in the code. I will
work out a description of these transformations and send a link to it.
Similarly as patches A and B, patch C also does not modify lchoose().

It should be pointed out that choose(n, k) for non-integer n is mainly
needed if n is a rational number like 1/2, 1/3, 2/3, .... However, making
choose(n, k) accurate for all inputs seems to be not too complex as the
patch C and its test results show.

I appreciate comments on patch C.

Petr Savicky.

-------------- next part --------------

--- R-devel-orig-intel/src/nmath/choose.c	2009-12-17 17:52:39.000000000 +0100
+++ R-devel-work-copy-3-intel/src/nmath/choose.c	2009-12-23 21:59:40.000000000 +0100
@@ -93,13 +93,14 @@
     return lfastchoose(n, k);
 }
 
+#define IS_INT(x)	  ((x) == floor(x))
 #define k_small_max 30
 /* 30 is somewhat arbitrary: it is on the *safe* side:
  * both speed and precision are clearly improved for k < 30.
 */
 double choose(double n, double k)
 {
-    double r, k0 = k;
+    double r, k0 = k, l, aux;
     k = floor(k + 0.5);
 #ifdef IEEE_754
     /* NaNs propagated correctly */
@@ -109,36 +110,37 @@
 	MATHLIB_WARNING2(_("'k' (%.2f) must be integer, rounded to %.0f"), k0, k);
     if (k < k_small_max) {
 	int j;
-	if(n-k < k && n >= 0 && R_IS_INT(n)) k = n-k; /* <- Symmetry */
+	if(n-k < k && n >= 0 && IS_INT(n)) k = n-k; /* <- Symmetry */
 	if (k <	 0) return 0.;
 	if (k == 0) return 1.;
 	/* else: k >= 1 */
 	r = n;
 	for(j = 2; j <= k; j++)
-	    r *= (n-j+1)/j;
-	return R_IS_INT(n) ? floor(r + 0.5) : r;
+	    r *= (n-(j-1))/j;
+	return IS_INT(n) ? floor(r + 0.5) : r;
 	/* might have got rounding errors */
     }
     /* else: k >= k_small_max */
     if (n < 0) {
-	r = choose(-n+ k-1, k);
-	if (ODD(k)) r = -r;
+	r = n / k * choose(k - 1.0 - n, k - 1.0);
+	if (ODD(k - 1.0)) r = -r;
 	return r;
     }
-    else if (R_IS_INT(n)) {
+    else if (IS_INT(n)) {
 	if(n < k) return 0.;
 	if(n - k < k_small_max) return choose(n, n-k); /* <- Symmetry */
 	return floor(exp(lfastchoose(n, k)) + 0.5);
     }
     /* else non-integer n >= 0 : */
-    if (n < k-1) {
-	int s_choose;
-	r = lfastchoose2(n, k, /* -> */ &s_choose);
-	return s_choose * exp(r);
+    l = floor(n + 0.5);
+    if (l <= k-1) {
+	aux = lfastchoose(n, l) + lfastchoose(k - 1.0 - n, k - l - 1.0) - lfastchoose(k, l);
+	return exp(aux) * (n - l) / (k - l) * (ODD(k - l) ? 1.0 : - 1.0);
     }
     return exp(lfastchoose(n, k));
 }
 
 #undef ODD
+#undef IS_INT
 #undef R_IS_INT
 #undef k_small_max