[R] Integral of PDF
(Ted Harding)
ted.harding at wlandres.net
Fri Dec 3 00:26:47 CET 2010
Albyn's reply is in line with an hypothesis I was beginning to
formulate (without looking at the underlying FoRTRAN code),
prompted by the hint in '?integrate':
Details:
If one or both limits are infinite, the infinite range
is mapped onto a finite interval.
For a finite interval, globally adaptive interval
subdivision is used in connection with extrapolation
by the Epsilon algorithm.
as a result of some numerical experiments. Following up on
Harold Doran's original dnorm(x,mean.mean/10) pattern (and
quoting a small subset of the experiments ... ):
integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
# 1 with absolute error < 0.00012
integrate(function(x) dnorm(x, 200,20), -Inf, Inf)
# 1.429508e-08 with absolute error < 2.8e-08
integrate(function(x) dnorm(x, 140,14), -Inf, Inf)
# 1 with absolute error < 2.2e-06
integrate(function(x) dnorm(x, 150,15), -Inf, Inf)
# 2.261582e-05 with absolute error < 4.4e-05
integrate(function(x) dnorm(x, 144,14.4), -Inf, Inf)
# 1 with absolute error < 1.7e-06
integrate(function(x) dnorm(x, 145,14.5), -Inf, Inf)
# 5.447138e-05 with absolute error < 0.00011
integrate(function(x) dnorm(x, 150,15), -33000, 33300)
# 1 with absolute error < 0.00012
integrate(function(x) dnorm(x, 150,15), -34000, 34300)
# 5.239671e-05 with absolute error < 0.00010
integrate(function(x) dnorm(x, 150,15),
-33768.260234277, 34068.2602334277)
# 0.5000307 with absolute error < 6.1e-05
integrate(function(x) dnorm(x, 150,15),
-33768.260234278, 34068.2602334278)
# 6.139415e-05 with absolute error < 0.00012
So it seems that, depending on how integrate() "maps to a finite
interval", and on how the algorithm goes about its "adaptive
interval subdivision", there are critical points where integrate()
switches from one to another of the following:
[A] The whole of a finite interval containing all but the extreme
outer tails of dnorm() is integrated over;
[B] The whole of a finite interval containing one half of the
distribution of dnorm() is integrated over;
[C] The whole of a finite interval lying in the extreme of one
tail of the dnorm distribution is integrated over.
with result: [A] close to 1; [B] close to 0.5; [C] close to 0.
This is compatible with Albyn's conclusion that the integral is
split into the sum of (-Inf,0) and (0,Inf), with the algorithm
ignoring one, or the other, or both, or neither!
This must be one of the nastiest exemplar's of "rounding error" ever!!!
Ted.
On 02-Dec-10 22:39:37, Albyn Jones wrote:
> To really understaand it you will have to look at the fortran code
> underlying integrate. I tracked it back through a couple of layers
> (dqagi, dqagie, ... just use google, these are old netlib
> subroutines) then decided I ought to get back to grading papers :-)
>
> It looks like the integral is split into the sum of two integrals,
> (-Inf,0] and [0,Inf).
>
>
>> integrate(function(x) dnorm(x, 350,50), 0, Inf)
> 1 with absolute error < 1.5e-05
>> integrate(function(x) dnorm(x, 400,50), 0, Inf)
> 1.068444e-05 with absolute error < 2.1e-05
>> integrate(function(x) dnorm(x, 500,50), 0, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
>> integrate(function(x) dnorm(x, 500,50), 0, 1000)
> 1 with absolute error < 7.4e-05
>
> The integral from 0 to Inf is the lim_{x -> Inf} of the integral
> over [0,x]. I suspect that the algorithm is picking an interval
> [0,x], evaluating the integral over that interval, and then performing
> some test to decide whether to expand the interval. When the initial
> interval doesn't contain much, expanding a little won't add enough to
> trigger another iteration.
>
> albyn
>
> On Thu, Dec 02, 2010 at 04:21:34PM -0500, Doran, Harold wrote:
>> The integral of any probability density from -Inf to Inf should equal
>> 1, correct? I don't understand last result below.
>>
>> > integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
>> 1 with absolute error < 9.4e-05
>>
>> > integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
>> 1 with absolute error < 0.00012
>>
>> > integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
>> 8.410947e-11 with absolute error < 1.6e-10
>>
>> > all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value,
>> > 0)
>> [1] TRUE
>>
>> > sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
>> States.1252
>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] statmod_1.4.6 mlmRev_0.99875-1 lme4_0.999375-35
>> Matrix_0.999375-33 lattice_0.17-26
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.10.1 nlme_3.1-96 stats4_2.10.1 tools_2.10.1
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Albyn Jones
> Reed College
> jones at reed.edu
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 02-Dec-10 Time: 23:26:44
------------------------------ XFMail ------------------------------
More information about the R-help
mailing list