[Rd] problem with integrate
Ray Brownrigg
Ray.Brownrigg at mcs.vuw.ac.nz
Tue Nov 4 22:17:13 CET 2008
I am puzzled by some results I am getting from integrate(), which provides inconsistent
results in some circumstances.
Basically, when integrating a specific continuous function, 3 different answers are
possible, varying by much more than the error bound reported by integrate().
The following script demonstrates this:
# ---- start of script
par(mfrow=1:2)
# Define a continuous, almost everywhere differentiable, function
G <- function(y) pnorm(pmin(2.241, y), lower=F, log=T)*dnorm(y)
# define its integral as a sum of two integrals
FUN <- function(x) integrate(G, -Inf, x)$value + integrate(G, x, Inf)$value
# plot this over a small range
curve(Vectorize(FUN)(x), -1.901, -1.899)
# Note the error is about 0.01, but integrate() reports error < 5e-5
# Check two other ways of calculating the same integral, one wrong:
abline(h = integrate(G, -Inf, Inf)$value, co l= 2)
# and one "correct":
abline(h = (1 - pnorm(2.241))*log(1 - pnorm(2.241)) +
integrate(G, -Inf, 2.241)$value, col = 3)
# Investigation reveals that the problem is with the second integral (in FUN()):
FUN2 <- function(x) integrate(G, x, Inf)$value
# plot this over the same range
curve(Vectorize(FUN2)(x), -1.901, -1.899)
# which should be the same as:
abline(h = (1 - pnorm(2.241))*log(1 - pnorm(2.241)) +
integrate(G, x, 2.241)$value, col = 3)
# ---- end of script
My actual problem is a superset of this, i.e. the constant 2.241 is only one value from a
continuous range in (-4, 4). A similar problem occurs at more than a few different such
values.
I noticed this under version 2.7.1 (NetBSD), but have checked that it still occurs under a
recently patched 2.8.0 (Windows).
Ray Brownrigg
Mathematics Statistics and Computer Science
Victoria University of Wellington
More information about the R-devel
mailing list