Hello,

A solution is to use package cubature: both the unscaled and the scaled versions return the correct result, 3.575294. The performance penalty is big, though, because the number of function evaluations is much larger.

library(cubature)

cubintegrate(f, -Inf, 0, method = "hcubature")
#$integral
#[1] 3.575294
#
#$error
#[1] 1.494242e-07
#
#$neval
#[1] 375
#
#$returnCode
#[1] 0

cubintegrate(f, -Inf, 0, method = "hcubature", numstab = sc)
#$integral
#[1] 3.575294
#
#$error
#[1] 1.064195e-05
#
#$neval
#[1] 255
#
#$returnCode
#[1] 0



library(microbenchmark)

microbenchmark(
 base1 = integrate(f, -Inf, 0),
 base2 = integrate(f, -Inf, 0, numstab = sc),
 cuba1 = cubintegrate(f, -Inf, 0, method = "hcubature"),
 cuba2 = cubintegrate(f, -Inf, 0, method = "hcubature", numstab = sc)
)
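
If the extra evaluations are too costly to pay on every call, one compromise (just a sketch; integrate_checked, its arguments and the tolerance are invented here, and it assumes f takes a numstab argument as in the MWE) is to run base integrate() on both the raw and the rescaled problem and fall back to hcubature only when the two disagree:

integrate_checked <- function(f, lower, upper, s, tol = 1e-4) {
 v1 <- integrate(f, lower, upper)$value                # raw problem
 v2 <- integrate(f, lower, upper, numstab = s)$value   # rescaled problem
 # If the two base results agree to within a relative tolerance, trust them;
 # otherwise fall back to the slower but more robust adaptive cubature.
 if (abs(v1 - v2) <= tol * max(abs(v1), abs(v2))) return(v1)
 cubintegrate(f, lower, upper, method = "hcubature")$integral
}

integrate_checked(f, -Inf, 0, sc)  # for this MWE the two disagree, so hcubature runs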


Hope this helps,

Rui Barradas


At 15:52 on 12/04/19, Andreï V. Kostyrka wrote:
Dear all,

This is the first time I am posting to the r-devel list. On StackOverflow, it was suggested that this strange behaviour of integrate() looked more like a bug. I am providing a short version of the question (the full one, with plots: https://stackoverflow.com/q/55639401).

Suppose one wants to integrate a function that is just a product of two density functions (like gamma). The support of the random variable is (-Inf, 0]. The scale parameter of the distribution is quite small (around 0.01), so the standard integration routine would often fail on a function that is non-zero only on a very small section of the negative half-line (like [-0.02, -0.01], where it takes huge values, and is almost 0 everywhere else). R’s integrate would often return the machine epsilon as a result. So I stretch the function around zero by an inverse of the scale parameter, compute the integral, and then divide it by the scale. Sometimes this re-scaling also failed, so I computed both if the first result was very small.
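
In symbols: with the substitution u = x / s, the integral of g(x) dx over (-Inf, 0] equals the integral of g(s * u) * s du over the same interval. A minimal sketch of that change of variables, with an illustrative integrand and scale rather than the ones from the MWE below:

s <- 0.01                                            # illustrative scale, not the MWE's sc
g <- function(x) dgamma(-x, shape = 2, scale = s)    # sharp spike just left of 0
integrate(g, -Inf, 0)$value                          # direct integral
integrate(function(u) g(s * u) * s, -Inf, 0)$value   # stretched by 1/s, weighted by s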

Today, integration of the rescaled function suddenly yielded a value of 1.5 instead of 3.5 (not even zero). The MWE is below.

cons <- -0.020374721416129591
sc <- 0.00271245601724757383
sh <- 5.704
f <- function(x, numstab = 1) dgamma(cons - x * numstab, shape = sh, scale = sc) * dgamma(-x * numstab, shape = sh, scale = sc) * numstab

curve(f, -0.06, 0, n = 501, main = "Unscaled f", bty = "n")
curve(f(x, sc), -0.06 / sc, 0, n = 501, main = "Scaled f", bty = "n")

sum(f(seq(-0.08, 0, 1e-6))) * 1e-6 #  Checking by summation: 3.575294
sum(f(seq(-30, 0, 1e-4), numstab = sc)) * 1e-4 # True value, 3.575294
str(integrate(f, -Inf, 0)) # Gives 3.575294
# $ value       : num 3.58
# $ abs.error   : num 1.71e-06
# $ subdivisions: int 10
str(integrate(f, -Inf, 0, numstab = sc))
# $ value       : num 1.5 # What?!
# $ abs.error   : num 0.000145 # What?!
# $ subdivisions: int 2

It stops at just two subdivisions! The problem is, I cannot try various stabilising multipliers for the function, because I have to compute this integral thousands of times for thousands of parameter values on thousands of sample windows for hundreds of models, so even on the super-computer cluster this takes weeks. Besides that, reducing rel.tol to 1e-5 or 1e-6 helped a bit, but I am not sure whether this guarantees success (and reducing it to 1e-7 slowed down the computations in some cases). I have also looked at the Fortran code of the quadrature just to see the integration rule, and was left wondering.
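
For reference, the rel.tol workaround mentioned above looks like this (the tolerances are the ones tried; there is no guarantee they always help):

str(integrate(f, -Inf, 0, numstab = sc, rel.tol = 1e-6))
str(integrate(f, -Inf, 0, numstab = sc, rel.tol = 1e-5))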

How can I make sure that the integration routine will not produce such wrong results for such a function, while still keeping the integration fast?

Yours sincerely,
Andreï V. Kostyrka

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
