I am interested in estimating the integral $\int f(x) P(x) dx$, where $f(x)$ is an expensive function and $P(x)$ has an analytic form. I would like to evaluate this with as few evaluations of $f(x)$ as possible, so I am exploring Bayesian quadrature.
If $P(x)$ is a uniform distribution, the paradigm of Bayesian quadrature makes perfect sense: we want to reduce the uncertainty of a Gaussian process (GP) model of $f$, and we choose the next point to accomplish that. For more general $P(x)$, however, it seems like the best next point should depend on the form of $P(x)$.
Is there a way to perform this Bayesian Quadrature that is influenced by the form of $P(x)$?
I have included my best attempt below.
In the example,
$f(x)=\sum_{i=1}^{\mathrm{dim}(x)}x_i^2$,
and $P(x)$ is a Gaussian distribution centered at $(1,1)$ with unit variance,
$P(x)=\prod_{i=1}^{\mathrm{dim}(x)}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(x_i-1)^2}{2}\right)$.
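For this toy example the integral also has a closed form to check against: with $x_i \sim \mathcal{N}(1,1)$ we have $\mathbb{E}[x_i^2] = \mu^2 + \sigma^2 = 2$, so $\int f(x)P(x)\,dx = 2\,\mathrm{dim}(x) = 4$ in the two-dimensional case below.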
I used the emukit package to perform vanilla Bayesian quadrature. I modified the prediction of the GP to multiply the predicted mean by $P(x)$, but I have since realized that the Bayesian quadrature acquisition relies only on the variance of the output, not on the predicted mean.
Emukit provides a method to quickly compute the mean and variance of the integral, but I cannot make it work with $P(x)$ represented outside of the GP. So instead I run a large Monte Carlo simulation of the GP model output, drawing the sample locations from $P(x)$, to estimate the value of $\int f(x) P(x) dx$.
My issue is that the point-selection scheme does not depend at all on $P(x)$. It really seems like there must be a way to tell the Bayesian quadrature to home in on samples near $(1,1)$, since this is where the majority of the contribution to the integral comes from. However, I see no way to do that beyond training the GP model directly on $P(x)f(x)$, which would throw away the information about the value of $P(x)$ between the sampled points when performing the integration. (I sketch that alternative, and a $P$-weighted acquisition heuristic, after the code and output below.)
from emukit.quadrature.methods import VanillaBayesianQuadrature
from scipy.stats import norm
import matplotlib.pyplot as plt
from emukit.quadrature.acquisitions import IntegralVarianceReduction
from emukit.core.optimization import GradientAcquisitionOptimizer
from emukit.core.parameter_space import ParameterSpace
import numpy as np
import GPy
from emukit.model_wrappers.gpy_quadrature_wrappers import BaseGaussianProcessGPy, RBFGPy
from emukit.quadrature.kernels import QuadratureRBFLebesgueMeasure
'''
goal is to find \int f(x) P(x) dx
- f(x) is function to sample
- P(x) is known probability function
'''
# define a custom regressor class
# - this redirects the prediction of the GPy model,
#   multiplying the predicted f(x) value by P(x)
#   (unfortunately, this doesn't change the behavior of the BQ method)
class myRegressor:
    def __init__(self, X, Y):
        # instantiate GPy model with an RBF kernel
        self.gpy_model = GPy.models.GPRegression(X=X, Y=Y, kernel=GPy.kern.RBF(
            input_dim=X.shape[1], lengthscale=1, variance=1.0))
        # fit the RBF kernel to the data
        self.gpy_model.constrain_bounded(0.1, 1.)
        self.gpy_model.optimize()
        # provide connections for emukit to access
        self.kern = self.gpy_model.kern
        self.Gaussian_noise = self.gpy_model.Gaussian_noise
        self.X = self.gpy_model.X
        self.posterior = self.gpy_model.posterior

    # prediction function, which multiplies the mean of the GPy prediction by P(x)
    def predict(self, x, full_cov=False, prob=True, Y_metadata=None, kern=None,
                likelihood=None, include_likelihood=True):
        m, cov = self.gpy_model.predict(x, full_cov, Y_metadata, kern,
                                        likelihood, include_likelihood)
        p = np.atleast_2d(P(x).prod(1)).T
        if prob:
            m *= p
        return m, cov

    # set function that emukit calls to update the training data
    def set_XY(self, X, Y):
        ret = self.gpy_model.set_XY(X, Y)
        self.X = self.gpy_model.X
        self.posterior = self.gpy_model.posterior
        return ret
DIM = 2
PLOT = False

# define P(x) and f(x)
'''
specific goal is to find \int \sum x**2 N(x - 1) dx
where
- the sum runs across all components of x
- N(x) is the standard normal density
'''
def f(x): return np.atleast_2d(np.sum(x ** 2, 1)).T
def P(x): return norm.pdf(x - 1)
# emukit requires integral bounds; these were selected to be far away
# from the significant contributions of P(x)
lb, ub = (-10, 10)

# sample initial x and f(x) points
np.random.seed(1234)
X_init = np.atleast_2d(np.random.uniform(lb, ub, (DIM, 10))).T
Y_init = f(X_init)
for ii in range(100):
    # fit GP model
    gpy_model = myRegressor(X=X_init, Y=Y_init)

    # instantiate BQ method
    emukit_rbf = RBFGPy(gpy_model.kern)
    emukit_qrbf = QuadratureRBFLebesgueMeasure(
        emukit_rbf, integral_bounds=np.array([[lb, ub] for rr in range(DIM)]))
    emukit_model = BaseGaussianProcessGPy(kern=emukit_qrbf, gpy_model=gpy_model)
    emukit_method = VanillaBayesianQuadrature(base_gp=emukit_model, X=X_init, Y=Y_init)
    ivr_acquisition = IntegralVarianceReduction(emukit_method)
    space = ParameterSpace(emukit_method.reasonable_box_bounds.convert_to_list_of_continuous_parameters())
    optimizer = GradientAcquisitionOptimizer(space)
    x_new, _ = optimizer.optimize(ivr_acquisition)

    # unfortunately, this fast computation does not seem available for \int P(x) f(x) dx
    # initial_integral_mean, initial_integral_variance = emukit_method.integrate()
    # optional plotting routine
    if PLOT:
        NX = 25
        x = y = np.linspace(lb, ub, NX)
        X, Y = np.meshgrid(x, y)
        x_plot = np.array([X, Y]).reshape(2, -1).T
        Z, unc = gpy_model.predict(x_plot, prob=False, full_cov=True)
        Z = Z.reshape(NX, NX)
        S = np.diagonal(unc).reshape(NX, NX)
        A = ivr_acquisition.evaluate(x_plot).reshape(NX, NX)
        fig, ax = plt.subplots(1, 3, figsize=(6, 3))
        ax[0].contourf(X, Y, Z, 1000)
        ax[1].contourf(X, Y, S, 1000)
        ax[2].contourf(X, Y, A, 1000)
        for aa in range(2):
            ax[aa].scatter(X_init[:, 0], X_init[:, 1], c='pink')
            ax[aa].set_xlim(lb - .1, ub + .1)
            ax[aa].set_ylim(lb - .1, ub + .1)
        plt.savefig('figs/%i' % ii)
        plt.clf()

    # add the new point to the x and f(x) samples
    X_init = np.append(X_init, x_new, 0)
    Y_init = np.append(Y_init, f(x_new), 0)
# estimate the integral by MC simulation of the GP model
estimate = np.mean(gpy_model.predict(np.random.normal(1, 1, (100000, DIM)), prob=False)[0])
print('BQ estimate ', estimate)
truth = np.mean(f(np.random.normal(1, 1, (100000, DIM))))
print('truth ', truth)
This prints:
BQ estimate 4.168972307688939
truth 3.9995291930258525
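For completeness, here is a minimal sketch of the alternative I mentioned: fitting the GP to $g(x) = P(x)f(x)$ directly and integrating $g$ against the Lebesgue measure, reusing the names and bounds from the script above. The fast closed-form integrate() then works, but the GP has to learn the shape of $P$ itself rather than exploiting its known analytic form, and the information about $P(x)$ between the sampled points is lost.

# sketch: fit the GP to g(x) = P(x) f(x) and integrate g with the
# plain Lebesgue-measure BQ machinery from the script above
def g(x):
    return np.atleast_2d(P(x).prod(1)).T * f(x)

G_init = g(X_init)
gpy_g = GPy.models.GPRegression(
    X=X_init, Y=G_init,
    kernel=GPy.kern.RBF(input_dim=DIM, lengthscale=1, variance=1.0))
gpy_g.optimize()

emukit_rbf_g = RBFGPy(gpy_g.kern)
emukit_qrbf_g = QuadratureRBFLebesgueMeasure(
    emukit_rbf_g, integral_bounds=np.array([[lb, ub] for rr in range(DIM)]))
emukit_model_g = BaseGaussianProcessGPy(kern=emukit_qrbf_g, gpy_model=gpy_g)
emukit_method_g = VanillaBayesianQuadrature(base_gp=emukit_model_g, X=X_init, Y=G_init)

# the closed-form integral is available here because the measure is Lebesgue
integral_mean, integral_variance = emukit_method_g.integrate()
print('GP-on-g estimate ', integral_mean)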
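And here is a sketch of the kind of $P$-aware point selection I have in mind, though I am not sure it is principled: weight the GP predictive variance by $P(x)^2$ and pick the maximizer, so that exploration concentrates where the density contributes most. The random-search optimization over uniform candidates and the `n_cand` value are arbitrary choices of mine.

# heuristic sketch: choose the next point by maximizing P(x)^2 * Var[f(x)],
# i.e. uncertainty sampling weighted by the (squared) known density
def weighted_variance(x, model):
    _, var = model.predict(x, prob=False)  # GP predictive variance
    p = np.atleast_2d(P(x).prod(1)).T      # known density values at x
    return p ** 2 * var

n_cand = 10000  # arbitrary candidate count for random-search optimization
candidates = np.random.uniform(lb, ub, (n_cand, DIM))
scores = weighted_variance(candidates, gpy_model)
x_new = candidates[np.argmax(scores)][None, :]  # candidate with largest weighted variance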