On Tue, 26 Oct 2021 at 13:54, Gilles Sadowski <gillese...@gmail.com> wrote:
<-- SNIP -->

> There is also
>   https://issues.apache.org/jira/browse/NUMBERS-167
> that definitely should entail a performance gain not achievable with the
> current API (a single "static" function).

I will keep this in mind when porting the Boost gamma function. This
enhancement was to avoid repeated computation of log(x) or logGamma(a).
The Boost implementation of the gamma function has many computation
paths depending on the parameters. On some paths these factors are not
required, but there may be other constant expressions on alternate
branches that we can cache.

I plan to keep the Boost implementation in separate package-private
classes. The existing RegularizedGamma class can delegate to these. We
can then expose a new API that treats the incomplete gamma function as
a univariate function (with one parameter fixed).
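
As a rough illustration of the idea (hypothetical names, series form
only, not a final design), fixing the parameter lets us hoist
logGamma(a) out of the per-x evaluation:

    import java.util.function.DoubleUnaryOperator;
    import org.apache.commons.numbers.gamma.LogGamma;

    // Hypothetical sketch: class and method names are illustrative.
    // With 'a' fixed, constants that depend only on 'a' (here
    // logGamma(a)) are computed once and reused for every x.
    public final class RegularizedGammaPFunction implements DoubleUnaryOperator {
        private final double a;
        private final double logGammaA;  // cached across calls

        public RegularizedGammaPFunction(double a) {
            this.a = a;
            this.logGammaA = LogGamma.value(a);
        }

        /** P(a, x) by its lower power series; adequate for roughly
         *  x < a + 1. A full implementation would switch to the
         *  continued fraction for Q(a, x) elsewhere. */
        @Override
        public double applyAsDouble(double x) {
            if (x <= 0) {
                return 0;
            }
            double term = 1 / a;
            double sum = term;
            for (int n = 1; n < 1000 && Math.abs(term) > Math.abs(sum) * 1e-16; n++) {
                term *= x / (a + n);
                sum += term;
            }
            // Prefactor x^a * exp(-x) / Gamma(a), using the cached logGamma(a).
            return sum * Math.exp(a * Math.log(x) - x - logGammaA);
        }
    }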

The Boost library has implementations of all the gamma functions
currently in the gamma package. Most should be drop-in replacements
for the current functions. The Boost versions should be used if they
increase accuracy. This can be easily tested, as the Boost source
provides high-accuracy reference data.

The following functions would be additions to the library:

upper incomplete gamma : tgamma(a, z)
lower incomplete gamma : tgamma_lower(a, z)

(tgamma == true gamma)

These are computed using the same routines as the regularized
incomplete gamma functions P and Q. It would make sense to expose them
in a public API.
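
For reference, these relate to P and Q by tgamma_lower(a, z) =
Gamma(a) * P(a, z) and tgamma(a, z) = Gamma(a) * Q(a, z). A minimal
illustration of that relation (not how Boost evaluates them internally;
a direct computation avoids the intermediate overflow of Gamma(a)):

    import org.apache.commons.numbers.gamma.Gamma;
    import org.apache.commons.numbers.gamma.RegularizedGamma;

    // Relation only; a real implementation computes these directly.
    final class IncompleteGamma {
        /** Upper incomplete gamma: tgamma(a, z) = Gamma(a) * Q(a, z). */
        static double tgamma(double a, double z) {
            return Gamma.value(a) * RegularizedGamma.Q.value(a, z);
        }

        /** Lower incomplete gamma: tgamma_lower(a, z) = Gamma(a) * P(a, z). */
        static double tgammaLower(double a, double z) {
            return Gamma.value(a) * RegularizedGamma.P.value(a, z);
        }
    }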

There is also the derivative of the regularized incomplete gamma
function. Exposing this reveals some of the internal working of the
incomplete gamma computation, but it is also the density function of
the gamma distribution. Exposing it would solve the issue with
computation of the GammaDistribution PDF, which currently attempts to
perform this via the Lanczos approximation and is not accurate for all
parameters.
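
For reference, the quantity in question is d/dx P(a, x) =
exp(-x) * x^(a - 1) / Gamma(a), i.e. the gamma density with shape a and
scale 1. A minimal sketch (the method name is hypothetical):

    import org.apache.commons.numbers.gamma.LogGamma;

    // Hypothetical name; evaluated in log space to delay overflow and
    // underflow. Edge cases (e.g. x == 0 with a < 1) are ignored here.
    static double gammaPDerivative(double a, double x) {
        return Math.exp((a - 1) * Math.log(x) - x - LogGamma.value(a));
    }

A GammaDistribution with shape k and scale theta could then compute its
PDF as gammaPDerivative(k, x / theta) / theta.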

One functionally breaking change is a different Lanczos
implementation. It uses a different Lanczos constant g from the
implementation currently in the LanczosApproximation class. There is a
large section in the Boost documentation on how they chose the optimal
constant g for each floating-point precision [1]. Their work extends
the work of Godfrey, which is the basis for the numbers implementation.
The Boost functions use g = 6.02 for a double; the current
implementation in numbers uses 4.74. However, this constant is exposed
via a method, and any use of the approximation's value must obtain g
through that method. Thus, if the Lanczos computation and g are updated
together, dependent code should continue to function correctly,
provided it accesses g through the provided method.
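
For example, the gamma function can be assembled from the sum and g as
below (a sketch assuming the relation documented for the current
LanczosApproximation class, Gamma(x) = sqrt(2*pi)/x * (x+g+0.5)^(x+0.5)
* exp(-(x+g+0.5)) * L(x)); note g is never hard-coded:

    import org.apache.commons.numbers.gamma.LanczosApproximation;

    // Sketch: Gamma(x) for moderate positive x, reading g through the
    // accessor so the code keeps working if the approximation (and g)
    // are ever changed together.
    static double gamma(double x) {
        final double g = LanczosApproximation.g();  // currently ~4.74
        final double t = x + g + 0.5;
        return Math.sqrt(2 * Math.PI) / x
                * Math.pow(t, x + 0.5)
                * Math.exp(-t)
                * LanczosApproximation.value(x);
    }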

One advantage of using the Boost Lanczos approximation is the
provision of a value that is pre-divided by exp(g). This is useful when
computing logGamma. A comparison of the current and Boost
implementations for accuracy and speed should be done.

Alex

[1] 
https://www.boost.org/doc/libs/1_77_0/libs/math/doc/html/math_toolkit/lanczos.html
