[Rd] Suggested change to integrate.Rd

Spencer Graves spencer.graves at structuremonitoring.com
Wed Dec 8 14:45:21 CET 2010


Hi, John:


Maybe change it to something like "gives wrong answer without warning on 
many systems (see 'Note' above)", as the 'Note' does provide more detail.


Thanks,
Spencer


On 12/7/2010 8:08 PM, John Nolan wrote:
> R developers understand intimately how things work, and terse
> descriptions are sufficient.  However, most typical R users
> would benefit from clearer documentation.  In multiple places
> I've found the R documentation to be correct and understandable
> AFTER I've figured a function out.
>
> And to be fair, this problem with integrate( ) isn't really R's
> fault: the QUADPACK routines that R uses are very good algorithms,
> but neither they nor any other package can handle all cases.
>
> I would support reasonable changes in the documentation for
> integrate( ).   Just saying it "gives wrong answer without
> warning on many systems" seems misleading (it works fine in
> many cases) and it doesn't help a user understand how to use
> integrate( ) correctly/carefully.  IMO a simple example like
> this one w/ dnorm would catch peoples attention and a couple
> lines of explanation/warning would then make more sense.
>
> John Nolan, American U
>
>
> -----Spencer Graves<spencer.graves at structuremonitoring.com>  wrote: -----
> To: John Nolan<jpnolan at american.edu>
> From: Spencer Graves<spencer.graves at structuremonitoring.com>
> Date: 12/07/2010 07:58PM
> Cc: pchausse at uwaterloo.ca, r-devel at r-project.org
> Subject: Suggested change to integrate.Rd (was: Re: [Rd] 0.5 != integrate(dnorm,0,20000) = 0)
>
>         What do you think about changing the verbiage with that example
> in "integrate.Rd" from "fails on many systems" to something like
> "gives wrong answer without warning on many systems"?
>
>
>         If I had write access to the core R code, I'd change this
> myself:  I'm probably not the only user who might think that saying
> something "fails" suggest it gives an error message.  Many contributions
> on this thread make it clear that it will never be possible to write an
> integrate function that won't give a "wrong answer without warning" in
> some cases.
>
>
>         Thanks,
>         Spencer
>
>
> #############################
> On 12/7/2010 7:02 AM, John Nolan wrote:
>> Putting in Inf for the upper bound does not work in general:
>> all 3 of the following should give 0.5
>>
>>> integrate( dnorm, 0, Inf )
>> 0.5 with absolute error<   4.7e-05
>>
>>> integrate( dnorm, 0, Inf, sd=100000 )
>> Error in integrate(dnorm, 0, Inf, sd = 1e+05) :
>>     the integral is probably divergent
>>
>>> integrate( dnorm, 0, Inf, sd=10000000 )
>> 5.570087e-05 with absolute error<   0.00010
>>
>> Numerical quadrature methods look at a finite number of
>> points, and you can find examples that will confuse any
>> algorithm.  Rather than hope a general method will solve
>> all problems, users should look at their integrand and
>> pick an appropriate region of integration.
>>
>> John Nolan, American U.
>>
>>
>> -----r-devel-bounces at r-project.org wrote: -----
>> To: r-devel at r-project.org
>> From: Pierre Chausse
>> Sent by: r-devel-bounces at r-project.org
>> Date: 12/07/2010 09:46AM
>> Subject: Re: [Rd] 0.5 != integrate(dnorm,0,20000) = 0
>>
>>     The warning about "absolute error == 0" would not be sufficient
>> because if you do
>>    >   integrate(dnorm, 0, 5000)
>> 2.326323e-06 with absolute error<   4.6e-06
>>
>> We get reasonable absolute error and wrong answer. For very high upper
>> bound, it seems more stable to use "Inf". In that case, another
>> .External is used which seems to be optimized for high or low bounds:
>>
>>    >   integrate(dnorm, 0,Inf)
>> 0.5 with absolute error<   4.7e-05
>>
>>
>> On 10-12-07 8:38 AM, John Nolan wrote:
>>> I have wrestled with this problem before.  I think correcting
>>> the warning to "absolute error ~<= 0" is a good idea, and printing
>>> a warning if subdivisions==1 is also helpful.  Also, including
>>> a simple example like the one that started this thread on the
>>> help page for integrate might make the issue more clear to users.
>>>
>>> But min.subdivisions is probably not.  On the example with dnorm( ),
>>> I doubt 3 subdivisions would work.  The problem isn't that
>>> we aren't sudividing enough, the problem is that the integrand
>>> is 0 (in double precision) on most of the region and the
>>> algorithm isn't designed to handle this.  There is no way to
>>> determine how many subdivisions are necessary to get a reasonable
>>> answer without a detailed analysis of the integrand.
>>>
>>> I've gotten useful results with integrands that are monotonic on
>>> the tail with a "self-triming integration" routine
>>> like the following:
>>>
>>>> right.trimmed.integrate<- function( f, lower, upper, epsilon=1e-100, min.width=1e-10, ... ) {
>>> + # trim the region of integration on the right until f(x)>    epsilon
>>> +
>>> + a<- lower; b<- upper
>>> + while ( (b-a>min.width)&&    (f(b)<epsilon) ) { b<- (a+b)/2 }
>>> +
>>> + return( integrate(f,a,b,...) ) }
>>>
>>>> right.trimmed.integrate( dnorm, 0, 20000 )  # test
>>> 0.5 with absolute error<    9.2e-05
>>>
>>> This can be adapted to left trim or (left and right) trim, abs(f(x)-c)>epsilon,
>>> etc.  Setting the tolerances epsilon and min.width is an issue,
>>> but an explicit discussion of these values could encourage people to
>>> think about the problem in their specific case.  And of course, none
>>> of this guarantees a correct answer, especially if someone tries this
>>> on non-monotonic integrands with complicated 0 sets.  One could write
>>> a somewhat more user-friendly version where the user has to specify
>>> some property (or set of properties) of the integrand, e.g. "right-tail
>>> decreasing to 0", etc. and have the algorithm try to do smart
>>> trimming based on this.  But perhaps this getting too involved.
>>>
>>> In the end, there is no general solution because any solution
>>> depends on the specific nature of the integrand.  Clearer messages,
>>> warnings in suspicious cases like subdivisions==1, and a simple
>>> example explaining what the issue is in the help page would help
>>> some people.
>>>
>>> John
>>>
>>>     ...........................................................................
>>>
>>>     John P. Nolan
>>>     Math/Stat Department
>>>     227 Gray Hall
>>>     American University
>>>     4400 Massachusetts Avenue, NW
>>>     Washington, DC 20016-8050
>>>
>>>     jpnolan at american.edu
>>>     202.885.3140 voice
>>>     202.885.3155 fax
>>>     http://academic2.american.edu/~jpnolan
>>>     ...........................................................................
>>>
>>> -----r-devel-bounces at r-project.org wrote: -----
>>> To: r-devel at r-project.org, Prof Brian Ripley<ripley at stats.ox.ac.uk>
>>> From: Martin Maechler
>>> Sent by: r-devel-bounces at r-project.org
>>> Date: 12/07/2010 03:29AM
>>> Subject: Re: [Rd] 0.5 != integrate(dnorm,0,20000) = 0
>>>
>>>>>>>> Prof Brian Ripley<ripley at stats.ox.ac.uk>
>>>>>>>>        on Tue, 7 Dec 2010 07:41:16 +0000 (GMT) writes:
>>>        >    On Mon, 6 Dec 2010, Spencer Graves wrote:
>>>        >>    Hello:
>>>        >>
>>>        >>
>>>        >>    The example "integrate(dnorm,0,20000)" says it "fails on many systems".
>>>        >>    I just got 0 from it, when I should have gotten either an error or something
>>>        >>    close to 0.5.  I got this with R 2.12.0 under both Windows Vista_x64 and
>>>        >>    Linux (Fedora 13);  see the results from Windows below.  I thought you might
>>>        >>    want to know.
>>>
>>>        >    Well, isn't that exactly what the help page says happens?  That
>>>        >    example is part of a section entitled
>>>
>>>        >    ## integrate can fail if misused
>>>
>>>        >    and is part of the illustration of
>>>
>>>        >    If the function is
>>>        >    approximately constant (in particular, zero) over nearly all its
>>>        >    range it is possible that the result and error estimate may be
>>>        >    seriously wrong.
>>>
>>> yes, of course,
>>> and the issue has been known for ``ages''  ..
>>> ..........
>>> ..........
>>> but it seems that too many useRs are not reading the help
>>> page carefully, but only browse it quickly.
>>> I think we (R developers) have to live with this fact
>>> and should consider adapting to it a bit more, particularly in
>>> this case (see below)
>>>
>>>        >>
>>>        >>    Thanks for all your work in creating and maintaining R.
>>>        >>
>>>        >>
>>>        >>    Best Wishes,
>>>        >>    Spencer Graves
>>>        >>    ###############################
>>>
>>>        >>
>>>        >>    integrate(dnorm,0,20000) ## fails on many systems
>>>
>>>        >>    0 with absolute error<    0
>>>
>>> and this is particularly unsatisfactory for another reason:
>>>
>>> "absolute error<    0"
>>> is *always* incorrect, so I think we should change *some*thing.
>>>
>>> We could just use "<=" (and probably should in any case, or
>>> "<    ~= x" which would convey ``is less than about x'' which I
>>> think is all we can say),
>>> but could consider giving a different message when the integral
>>> evaluates to 0 or, rather actually,
>>> only when the error bound ('abs.error') is 0 *and* 'subdivisions == 1'
>>> as the latter indicates that the algorithm treated the integrand
>>> f(.) as if f() was a linear function.
>>>
>>> But in my quick experiments, even for linear (incl. constant)
>>> functions, the 'abs.error' returned is never 0.
>>>
>>> If we want to be cautious,
>>> such a warning could be made explicitly suppressable by an argument
>>>          .warn.if.doubtful = TRUE
>>>
>>> An additional possibility I'd like to try, is a new argument
>>>       'min.subdivisions = 3' which specifies the *minimal* number
>>> of subdivisions to be used in addition to the already present
>>>       'subdivisions = 100' (= the maximum number of subintervals.)
>>>
>>> Martin Maechler,
>>> ETH Zurich
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>



More information about the R-devel mailing list