[Rd] problem with integrate

Tue Nov 4 22:17:13 CET 2008

I am puzzled by some results I am getting from integrate(), which provides inconsistent 
results in some circumstances.

Basically, when integrating a specific continuous function, 3 different answers are 
possible, varying by much more than the error bound reported by integrate().

The following script demonstrates this:

# ---- start of script
par(mfrow=1:2)

# Define a continuous, almost everywhere differentiable, function
G <- function(y) pnorm(pmin(2.241, y), lower=F, log=T)*dnorm(y)

# define its integral as a sum of two integrals
FUN <- function(x) integrate(G, -Inf, x)$value + integrate(G, x, Inf)$value

# plot this over a small range
curve(Vectorize(FUN)(x), -1.901, -1.899)
# Note the error is about 0.01, but integrate() reports error < 5e-5

# Check two other ways of calculating the same integral, one wrong:
abline(h = integrate(G, -Inf, Inf)$value, co l= 2)

# and one "correct":
abline(h = (1 - pnorm(2.241))*log(1 - pnorm(2.241)) +
  integrate(G, -Inf, 2.241)$value, col = 3)

# Investigation reveals that the problem is with the second integral (in FUN()):
FUN2 <- function(x) integrate(G, x, Inf)$value

# plot this over the same range
curve(Vectorize(FUN2)(x), -1.901, -1.899)

# which should be the same as:
abline(h = (1 - pnorm(2.241))*log(1 - pnorm(2.241)) +
  integrate(G, x, 2.241)$value, col = 3)
# ---- end of script

My actual problem is a superset of this, i.e. the constant 2.241 is only one value from a 
continuous range in (-4, 4).  A similar problem occurs at more than a few different such 
values.

I noticed this under version 2.7.1 (NetBSD), but have checked that it still occurs under a 
recently patched 2.8.0 (Windows).

Ray Brownrigg
Mathematics Statistics and Computer Science
Victoria University of Wellington