[R] R numerical integration

peter dalgaard pdalgd at gmail.com
Sat Mar 24 12:00:53 CET 2012


On Mar 24, 2012, at 09:46 , Petr Savicky wrote:

> Integrating with infinite limits is necessarily a heuristic.

...as is numerical integration in general. In the present case, the infinite limits are actually only half the problem. The integrate() function is usually quite good at dealing with infinite limits, but it sort of assumes that "most" of the functions behavior is played out relatively close to the origin. This is not really the case for a normal density with an sd of 17000+, so I tried rescaling:

> integrate(function(x) dnorm(x,mu/sigma)/(1+exp(-x*sigma)),-Inf,Inf)
0.4999626 with absolute error < 4.7e-05

Nice. I then thought I was smart and recentered it as well:

> integrate(function(x) dnorm(x)/(1+exp(-mu-x*sigma)),-Inf,Inf,rel.tol=1e-10)
0.5 with absolute error < 1.2e-11

and is sticks to 0.5 unless you reduce sigma to 100 or so. Oops.

As you probably noticed, the two integrands are the same function shifted left/right by mu/sigma, i.e. by roughly 1e-4, so the integral over the whole line is of course supposed to be the same. 

This is where the second heuristic of numerical integration shows up: It is assumed that the integral is continuous, not just in the pure math sense but also in the pragmatic sense of being interpolatable from a handful of function values. 

Now, the function 1/(1+exp(-x*sigma)) is, for large sigma, in principle continuous, but in practice pretty darn close to a Heaviside step function:

> f <- function(x)1/(1+exp(-x*sigma))
> f(0)
[1] 0.5
> f(0.001)
[1] 1
> f(-0.001)
[1] 2.424004e-08

this means that you have to be pretty lucky with the selection of knot points for integration, unless you give the algorithm a hint that something might be happening very quickly in certain regions. E.g.

> integrate(function(x) dnorm(x)/(1+exp(-mu-x*sigma)),-Inf,0,rel.tol=1e-10)
2.208542e-20 with absolute error < 4.4e-20
> integrate(function(x) dnorm(x)/(1+exp(-mu-x*sigma)),-.01,0,rel.tol=1e-10)
4.014838e-06 with absolute error < 2.1e-14

The first one of those is obviously wrong since it is suppose to be larger than the other one. However, what is presumably happening is that the integration algorithm looks at values "reasonably close" to the interval endpoint and gets, like,

> f2
function(x) dnorm(x)/(1+exp(-mu-x*sigma))
> f2(-.01)
[1] 5.392315e-78

and concludes that the integrand is essentially zero over the domain of integration.

One might consider splitting the domain like this:

> a <- 8/sigma
> I1 <- integrate(function(x) dnorm(x,mu/sigma)/(1+exp(-x*sigma)),-Inf,-a,rel.tol=1e-10)$value
> I2 <- integrate(function(x) dnorm(x,mu/sigma)/(1+exp(-x*sigma)),-a,a,rel.tol=1e-10)$value
> I3 <- integrate(function(x) dnorm(x,mu/sigma)/(1+exp(-x*sigma)),a,Inf,rel.tol=1e-10)$value
> I1+I2+I3
[1] 0.4999626

(I'm still not quite happy with the fact that I1 gets essentially set to zero in the above, but the essential point is that you need to be prepared to do a little analytical work to deal with tricky integrands.)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



More information about the R-help mailing list