[Rd] pbinom( ) function (PR#8700)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Mar 22 16:08:41 CET 2006
Duncan Murdoch <murdoch at stats.uwo.ca> writes:
> On 3/22/2006 3:52 AM, maechler at stat.math.ethz.ch wrote:
> >>>>>> "cspark" == cspark <cspark at clemson.edu>
> >>>>>> on Wed, 22 Mar 2006 05:52:13 +0100 (CET) writes:
> >
> > cspark> Full_Name: Chanseok Park Version: R 2.2.1 OS: RedHat
> > cspark> EL4 Submission from: (NULL) (130.127.112.89)
> >
> >
> >
> > cspark> pbinom(any negative value, size, prob) should be
> > cspark> zero. But I got the following results. That is, if
> > cspark> a negative value is close to zero, then pbinom()
> > cspark> returns pbinom(0, size, prob).
> >
> > >> pbinom( -2.220446e-22, 3,.1)
> > [1] 0.729
> > >> pbinom( -2.220446e-8, 3,.1)
> > [1] 0.729
> > >> pbinom( -2.220446e-7, 3,.1)
> > [1] 0
> >
> > Yes, all the [dp]* functions which are discrete with mass on the
> > integers only, do *round* their 'x' to integers.
> >
> > I could well argue that the current behavior is *not* a bug,
> > since we do treat "x close to integer" as integer, and hence
> > pbinom(eps, size, prob) with eps "very close to 0" should give
> > pbinom(0, size, prob)
> > as it now does.
> >
> > However, for aesthetic reasons,
> > I agree that we should test for "< 0" first (and give 0 then) and only
> > round otherwise. I'll change this for R-devel (i.e. R 2.3.0 in
> > about a month).
> >
> > cspark> dbinom() also behaves similarly.
> >
> > yes, similarly, but differently.
> > I have changed it (for R-devel) as well, to behave the same as
> > others d*() , e.g., dpois(), dnbinom() do.
>
> Martin, your description makes it sound as though dbinom(0.3, size,
> prob) would give the same answer as dbinom(0, size, prob), whereas it
> actually gives 0 with a warning, as documented in ?dbinom. The d*
> functions only round near-integers to integers, where it looks as though
> near means within 1E-7. The p* functions round near integers to
> integers, and truncate others to the integer below.
Well, the p-functions are constant on the intervals between
integers... (Or did you refer to the lack of a warning? One point
could be that cumulative distribution functions extend naturally to
non-integers, whereas densities don't really extend, since they are
defined with respect to counting measure on the integers.)
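
To make the rule under discussion concrete, here is a minimal R sketch of a
step-function CDF that tests for x < 0 first and only then snaps
near-integers. The name pbinom_sketch and its fuzz argument are hypothetical,
the 1e-7 tolerance is only the value guessed at above ("within 1E-7"), and
this is not R's internal code.

```r
# Hypothetical illustration, not R's implementation of pbinom().
# Handles a single (scalar) x only.
pbinom_sketch <- function(x, size, prob, fuzz = 1e-7) {
  if (x < 0) return(0)        # any negative x gives 0, before any rounding
  k <- floor(x + fuzz)        # x within fuzz below an integer snaps up to it;
                              # everything else truncates to the integer below
  if (k >= size) return(1)    # all mass lies on 0..size
  j <- 0:k
  sum(choose(size, j) * prob^j * (1 - prob)^(size - j))
}

pbinom_sketch(-2.220446e-8, 3, 0.1)  # 0 under this rule
pbinom_sketch(0.3, 3, 0.1)           # same value as at x = 0
pbinom_sketch(1 - 1e-9, 3, 0.1)      # treated as x = 1
```

With the sign test placed first, a tiny negative x can no longer be snapped
up to 0, while non-negative near-integers are still treated as integers.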
> I suppose the reason for this behaviour is to protect against rounding
> error giving nonsense results; I'm not sure that's a great idea, but if
> we do it, should we really be handling 0 differently?
Most of these round-near-integer issues were spurred by real
programming problems. It is somewhat hard to come up with a problem
that leads you to generate a binomial variate value with "floating point
noise", but I'm quite sure that we'll be reminded if we try to change
it... (One potential issue is back-calculation of counts from relative
frequencies.)
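
As a rough sketch of the back-calculation case: a count recovered as
(relative frequency) * n need not be bit-for-bit an integer in double
precision. The helper snap_near_integer and its tol argument are hypothetical
names used only for illustration, and the 1e-7 tolerance is an assumption.

```r
# Hypothetical helper: return the nearest integer when x lies within tol
# of it, otherwise return x unchanged.
snap_near_integer <- function(x, tol = 1e-7) {
  r <- round(x)
  ifelse(abs(x - r) <= tol, r, x)
}

n        <- 49
rel_freq <- 1 / n         # relative frequency of one success in 49 trials
count    <- rel_freq * n  # back-calculated count; may differ from 1 in the
                          # last bits, depending on floating-point rounding

count == 1                # not guaranteed to be TRUE
snap_near_integer(count)  # the intended count
dbinom(snap_near_integer(count), n, 0.1)
```

A value such as 0.3 is left alone by the helper, so genuinely non-integer
arguments are still visible to the caller.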
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907