[R] choose(n, k) as n approaches k
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Tue Jan 14 17:02:20 CET 2020
On 14/01/2020 10:50 a.m., peter dalgaard wrote:
>
>
>> On 14 Jan 2020, at 16:21 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>
>> On 14/01/2020 10:07 a.m., peter dalgaard wrote:
>>> Yep, that looks wrong (probably want to continue discussion over on R-devel)
>>> I think the culprit is here (in src/nmath/choose.c)
>>> if (k < k_small_max) {
>>> int j;
>>> if(n-k < k && n >= 0 && R_IS_INT(n)) k = n-k; /* <- Symmetry */
>>> if (k < 0) return 0.;
>>> if (k == 0) return 1.;
>>> /* else: k >= 1 */
>>> if n is a near-integer, then k can become non-integer and negative. In your case,
>>> n == 4 - 1e-7
>>> k == 4
>>> n - k == -1e-7 < 4
>>> n >= 0
>>> R_IS_INT(n) = TRUE (relative diff < 1e-7 is allowed)
>>> so k gets set to
>>> n - k == -1e-7
>>> which is less than 0, so we return 0. However, as you point out, 1 would be more reasonable and in accordance with the limit as n -> 4, e.g.
>>>> factorial(4 - 1e-10)/factorial(1e-10)/factorial(4) -1
>>> [1] -9.289025e-11
>>> I guess that the fix could be as simple as replacing n by R_forceint(n) in the k = n - k step.
>>
>> I think that would break symmetry: you want choose(n, k) to equal choose(n, n-k) when n is very close to an integer. So I'd suggest the replacement whenever R_IS_INT(n) is true.
>>
>
> But choose() very deliberately ensures that k is integer, so choose(n, n-k) is ill-defined for non-integer n.
That's only true if there's a big difference. I'd be worried about
cases where n and k are close to integers (within 1e-7). In those
cases, k is silently rounded to integer. As I read your suggestion, n
would only be rounded to integer if k > n-k. I think both n and k
should be rounded to integer in this near-integer situation, regardless
of the value of k.
I believe that lchoose(n, k) already does this.
Duncan Murdoch
>
> double r, k0 = k;
> k = R_forceint(k);
> ...
> if (fabs(k - k0) > 1e-7)
> MATHLIB_WARNING2(_("'k' (%.2f) must be integer, rounded to %.0f"), k0, k);
>
>
>> Duncan Murdoch
>>
>>> -pd
>>>> On 14 Jan 2020, at 00:33 , Wright, Erik Scott <ESWRIGHT using pitt.edu> wrote:
>>>>
>>>> This struck me as incorrect:
>>>>
>>>>> choose(3.999999, 4)
>>>> [1] 0.9999979
>>>>> choose(3.9999999, 4)
>>>> [1] 0
>>>>> choose(4, 4)
>>>> [1] 1
>>>>> choose(4.0000001, 4)
>>>> [1] 4
>>>>> choose(4.000001, 4)
>>>> [1] 1.000002
>>>>
>>>> Should base::choose(n, k) check whether n is within machine precision of k and return 1?
>>>>
>>>> Thanks,
>>>> Erik
>>>>
>>>> ***
>>>> sessionInfo()
>>>> R version 3.6.0 beta (2019-04-15 r76395)
>>>> Platform: x86_64-apple-darwin15.6.0 (64-bit)
>>>> Running under: macOS High Sierra 10.13.6
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list