The proposed computation of $\ln{x \choose y}$ via the logarithm of the beta function does not give rise to catastrophic cancellation, except when $y=0$ or $y=x$; these are easily handled as special cases that return zero.
However, the computation of $\ln\mathrm{B}(a,b)=\ln\left(\frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)}\right)$ involves cases of subtractive cancellation, not all of which are easily addressed. I was unable to find clearly stated error bounds for some of the existing implementations of the logarithm of the beta function, but spot checks suggest that errors equivalent to a loss of four to five least significant bits are not uncommon.
Programming languages in the C/C++ family typically offer a standard math function lgamma which can be used to compute $\ln\mathrm{B}(a,b)$ as (lgamma (b) - lgamma (a + b)) + lgamma (a). This creates an obvious issue with subtractive cancellation. For computation with gamma functions of large arguments, a function $\Gamma^{\star}$ was introduced in N. M. Temme, "A set of algorithms for the incomplete gamma functions", Probability in the Engineering and Informational Sciences, Vol. 8, No. 2, April 1994, pp. 291-307, and called "tempered" gamma function, with
$\Gamma(x) = \sqrt{2\pi}e^{-x}x^{x-\frac{1}{2}}\Gamma^{\star}(x), \ \ x \gt 0.$ This also appears in Amparo Gil, Javier Segura, and Nico M. Temme, Numerical Methods for Special Functions, SIAM 2007, p. 243, defined as $\Gamma^{\star}(a)=\sqrt{\frac{a}{2\pi}}e^{a}a^{-a}\Gamma(a),\ a\gt 0$.
More recently, the name of $\Gamma^{\star}$ was changed to "regulated" gamma function in Amparo Gil, Javier Segura, and Nico M. Temme, "GammaCHI: a package for the inversion and computation of the gamma and chi-square cumulative distribution functions (central and noncentral)," Computer Physics Communications, Vol. 191, June 2015, pp. 132-139. In the authors' GNSTLIB library this function is included as gammastar. With this in place, we have:
$$\ln\mathrm{B}(a,b) = \ln\left(\frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}\right) = \ln\left(\frac{\Gamma^{\star}(a)\Gamma^{\star}(b)}{\Gamma^{\star}(a+b)}\cdot\frac{\sqrt{2\pi}}{a^{-a}b^{-b}\sqrt{ab}(a+b)^{(a+b-\frac{1}{2})}}\right)$$
The right-hand side gives us many opportunities for re-arranging the corresponding floating-point computation such that the subtractive cancellation is largely avoided, and in particular avoided for the portion of the domain exercised by the computation of the logarithm of binomial coefficients, which is $a \ge 2, b \ge 2$.
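For reference, one such arrangement, obtained by expanding the logarithm term by term (assuming $a \le b$ and writing $s = a+b$), is:

$$\ln\mathrm{B}(a,b) = \left(a-\tfrac{1}{2}\right)\ln\frac{a}{s} + b\,\ln\!\left(1-\frac{a}{s}\right) - \tfrac{1}{2}\ln b + \ln\!\left(\frac{\sqrt{2\pi}\,\Gamma^{\star}(a)\,\Gamma^{\star}(b)}{\Gamma^{\star}(s)}\right)$$

Here the final logarithm stays close to $\ln\sqrt{2\pi}$, since $\Gamma^{\star}$ is close to 1 for arguments $\ge 1$, and the remaining terms are individually well-conditioned.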
Not being a numerical analyst, I used a brute force approach of generating (using Wolfram Alpha for assistance) dozens of possible arrangements and evaluating each with millions of test vectors to identify the most suitable candidates which I then subjected to further tests. I also created an implementation of the regulated gamma function $\Gamma^{\star}(a)$ that has a maximum error of a little more than half an ulp for $a \ge 1$. To keep the time for development and test reasonable, I focused on a single-precision implementation, but there is no reason to believe the results would not transfer to a corresponding double-precision implementation.
Based on testing with a billion random test vectors, with $x \in [1, 2^{24}]$ and $y \in [0, x]$, the maximum observed error in the exemplary ISO-C99 implementation logbinomialf (x,y) shown below for the computation of $\ln{x \choose y}$ is $\lt 2.5$ ulp. I used an Intel C/C++ compiler with its highly accurate standard math library.
#include <math.h>

/* Compute regulated gamma function, G*(a) = sqrt(a/(2*pi))*exp(a)*pow(a,-a)*G(a) */
float gammastarf (float x)
{
    float r, s;
    if (x <= 0.0f) {
        r = INFINITY / INFINITY; // NaN
    } else if (x < 1.0f) {
        /* (0, 1): maximum error 4.17 ulp */
        if (x < 3.5e-9f) {
            const float oosqrt2pi = 0.39894228040143267794f; // 1/sqrt(2*pi)
            r = oosqrt2pi / sqrtf (x);
        } else {
            r = 1.0f / x;
            r = gammastarf (x + 1.0f) * expf (fmaf (x, log1pf (r), -1.0f)) *
                sqrtf (1.0f + r);
        }
    } else {
        /* [1, INF]: maximum error 0.56 ulp */
        r = 1.0f / x;
        s =              1.24156475e-4f;  //  0x1.046000p-13
        s = fmaf (s, r, -5.94461802e-4f); // -0x1.37ab50p-11
        s = fmaf (s, r,  1.07185589e-3f); //  0x1.18fb08p-10
        s = fmaf (s, r, -2.95249280e-4f); // -0x1.359760p-12
        s = fmaf (s, r, -2.67400569e-3f); // -0x1.5e7cbep-9
        s = fmaf (s, r,  3.47192190e-3f); //  0x1.c7125ep-9
        s = fmaf (s, r,  8.33333358e-2f); //  0x1.555556p-4
        r = fmaf (s, r,  1.00000000e+0f); //  0x1.000000p+0
    }
    return r;
}
/* Compute the logarithm of the beta function, ln(B(x,y)) */
float logbetaf (float x, float y)
{
    const float sqrt_2pi = 2.506628274631000502416f;
    float mn, mx, sum, r;
    mx = fmaxf (x, y);
    mn = fminf (x, y);
    sum = x + y;
    if (isnan (x) || isnan (y)) {
        r = sum;
    } else if ((mn <= 0) || ((x == INFINITY) && (y == INFINITY))) {
        r = INFINITY / INFINITY; // NaN
    } else if (mx == INFINITY) {
        r = mx;
    } else if (mx >= 1.7265625f) {
        r = fmaf (mn - 0.5f, logf (mn / sum),
                  fmaf (mx, log1pf (-(mn / sum)),
                        fmaf (-0.5f, logf (mx),
                              logf (sqrt_2pi * gammastarf (mn) *
                                    gammastarf (mx) / gammastarf (sum)))));
    } else {
        r = (lgammaf (mn) - lgammaf (sum)) + lgammaf (mx);
    }
    return r;
}
/* Compute logarithm of the binomial coefficient, ln(x choose y) */
float logbinomialf (int x, int y)
{
    if ((y == 0) || (x == y)) {
        return 0.0f;
    } else {
        return -logbetaf (y + 1, x - y + 1) - log1pf (x);
    }
}
The corresponding double-precision implementation is structurally identical. The only obstacle I faced was creating an accurate approximation of $\Gamma^{\star}$ on $[1, \infty]$: this is easy in single precision, where the approximation can simply be checked exhaustively, but a major challenge in double precision. Based on tests with 400M test vectors, the maximum error found in the gammastar() implementation below for arguments in $[1, \infty]$ is $\lt 0.676$ ulp. The maximum error encountered in logbinomial(x, y) was $\lt 2.7$ ulp for $x \in [1, 2^{52}]$ and $y \in [0, x]$.
/* Compute regulated gamma function, G*(a) = sqrt(a/(2*pi))*exp(a)*pow(a,-a)*G(a) */
double gammastar (double x)
{
    double r, p, q;
    if (x <= 0) {
        r = INFINITY / INFINITY; // NaN
    } else if (x < 1.0) {
        /* (0, 1): maximum error 4.27 ulp */
        if (x < 6e-18) {
            const double oosqrt2pi = 0.39894228040143267794; // 1/sqrt(2*pi)
            r = oosqrt2pi / sqrt (x);
        } else {
            r = (gammastar (x + 1.0) * exp (fma (x, log1p (1.0 / x), -1.0)) *
                 sqrt (1.0 + 1.0 / x));
        }
    } else {
        /* [1, INF]: maximum error 0.676 ulp */
        r = 1.0 / x;
        p =             3.6824880875937455e-1;  //  0x1.79163739a65f1p-2
        p = fma (p, r, -2.9148700584831680e+1); // -0x1.d26113dd4bf86p+4
        p = fma (p, r, -6.4230334440567276e+2); // -0x1.4126d3fd4ee50p+9
        p = fma (p, r, -2.3417416308098223e+3); // -0x1.24b7bb70893c9p+11
        p = fma (p, r, -4.0779740667249484e+3); // -0x1.fdbf2b8dfaf8dp+11
        p = fma (p, r, -3.9854803054899076e+3); // -0x1.f22f5ea99e67cp+11
        p = fma (p, r, -2.3942993206051601e+3); // -0x1.2b499408ce45dp+11
        p = fma (p, r, -8.1990473234481556e+2); // -0x1.99f3ce44fc4eep+9
        p = fma (p, r, -1.5557398039212819e+2); // -0x1.3725e0c20978cp+7
        q =         r - 6.4292649037414617e+2;  // -0x1.4176973c90822p+9
        q = fma (q, r, -7.9662250277109015e+3); // -0x1.f1e399b6a8304p+12
        q = fma (q, r, -2.7600820950887806e+4); // -0x1.af4348a7597b0p+14
        q = fma (q, r, -4.7901553699890872e+4); // -0x1.763b1b7e8d563p+15
        q = fma (q, r, -4.6962269682919628e+4); // -0x1.6ee48a13e1303p+15
        q = fma (q, r, -2.8384949448086998e+4); // -0x1.bb83cc3c1e8bap+14
        q = fma (q, r, -9.7610697979418601e+3); // -0x1.31088ef2392cfp+13
        q = fma (-1.8668877647055378e+3, x, q); // -0x1.d2b8d1230e350p+10
        r = fma (p, 1.0 / q, 1.0);
    }
    return r;
}
/* Compute the logarithm of the beta function, ln(B(x,y)) */
double logbeta (double x, double y)
{
    const double sqrt_2pi = 2.506628274631000502416;
    double mn, mx, sum, r;
    mx = fmax (x, y);
    mn = fmin (x, y);
    sum = x + y;
    if (isnan (x) || isnan (y)) {
        r = sum;
    } else if ((mn <= 0) || ((x == INFINITY) && (y == INFINITY))) {
        r = INFINITY / INFINITY; // NaN
    } else if (mx == INFINITY) {
        r = mx;
    } else if (mx >= 1.75) {
        r = fma (mn - 0.5, log (mn / sum),
                 fma (mx, log1p (-(mn / sum)),
                      fma (-0.5, log (mx),
                           log (sqrt_2pi * gammastar (mn) *
                                gammastar (mx) / gammastar (sum)))));
    } else {
        r = (lgamma (mn) - lgamma (sum)) + lgamma (mx);
    }
    return r;
}
/* Compute logarithm of the binomial coefficient, ln(x choose y) */
double logbinomial (long long int x, long long int y)
{
    if ((y == 0) || (x == y)) {
        return 0.0;
    } else {
        return -logbeta (y + 1, x - y + 1) - log1p (x);
    }
}