When solving best subset regression as a mixed integer program, how do you decide how tightly to bound the range of the $X$ variables? When the box is tight, the solver finds a solution quickly, but potential solutions outside the box are cut off. How quickly should I expand the box?
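For concreteness, one expansion policy I could imagine (my own sketch, not an established rule, and the names are hypothetical): start from a modest symmetric box $[-M_j, M_j]$ and, after each solve, widen only the bounds the incumbent actually touches.

```python
# A sketch of one expansion policy (an assumption, not from the question):
# after each solve, double M_j only for coordinates whose incumbent value
# sits at a bound.  If no bound is active, the box did not constrain the
# incumbent and the loop can stop.
def expand_box(x_opt, M, factor=2.0, tol=1e-6):
    """Double M_j wherever the incumbent x_j is pressed against +/-M_j."""
    return [M_j * factor if abs(x_j) >= M_j - tol else M_j
            for x_j, M_j in zip(x_opt, M)]

M = [1.0, 1.0, 1.0]
x_opt = [1.0, -0.3, -1.0]        # incumbent touching bounds 0 and 2
print(expand_box(x_opt, M))      # -> [2.0, 1.0, 2.0]
```

Geometric growth keeps the number of re-solves logarithmic in the final box size, at the cost of possibly overshooting the bound actually needed.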
I built my constraints to estimate variance using the following approach:
Rather than apply cuts on the covariance matrix directly, the cuts are applied on the principal components.
I want to minimize the variance of a linear combination with coefficient vector $X$ ($1\times N$):
$$\text{Variance} = X \Sigma X^\top= X(Q\Lambda Q^\top)X^\top.$$
Let
$E_i$: eigenvalue $i$ ($1\times1$),
$V_i$: eigenvector $i$ ($N\times1$). Then
$$\text{Variance} = \sum_{i} E_i (XV_i)^2 = \sum_{i}\text{ComponentContribution}_i.$$
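A quick numeric check of this decomposition (a sketch with made-up data, not from the post): the quadratic form $X\Sigma X^\top$ should equal the sum of the per-component contributions $E_i (XV_i)^2$.

```python
# Verify Variance = sum_i E_i * (X @ V_i)**2 on a random PSD covariance.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
Sigma = A @ A.T                      # a PSD covariance matrix (5x5)
X = rng.standard_normal((1, 5))      # row vector of coefficients

E, V = np.linalg.eigh(Sigma)         # eigenvalues E[i], eigenvectors V[:, i]
contributions = E * (X @ V).ravel()**2

print(float(X @ Sigma @ X.T))        # direct quadratic form
print(contributions.sum())           # sum of component contributions
```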
Near a reference vector $X_0$:
$$\text{ComponentContribution}_i \ge E_i (X_0V_i)^2+ 2E_i (X_0V_i)(XV_i-X_0V_i).$$
Note that many different $X_0$ share the same value of $X_0V_i$. The above lower bound on the component's contribution to variance depends on $X_0$ only through the loading $X_0V_i$, so a single cut is useful near many different $X_0$: it is a tangent approximation at a loading value. Substituting:
\begin{align}L &= X_0V_i\quad(1\times1)\\\text{ComponentContribution}_i &\ge E_iL^2 + 2E_iL(XV_i - L)\\\text{ComponentContribution}_i &\ge E_i(2LXV_i - L^2).\end{align}
This allows each component's contribution to be estimated independently. One benefit is that more tangent constraints (more $L$ values) can be spent on the large-eigenvalue components than on the smallest.
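Since each cut is the tangent line to the convex function $E_i y^2$ at $y = L$, it never overestimates the contribution when $E_i \ge 0$, and more tangents tighten the piecewise-linear lower bound. A numeric check (all values here are my own illustrative choices):

```python
# Sketch: tangent cuts E_i*(2*L*y - L**2) under-estimate E_i*y**2 for every
# y when E_i >= 0; the pointwise max over a family of tangents is the
# piecewise-linear lower bound the MIP sees for this component.
import numpy as np

E_i = 2.5                                  # a nonnegative eigenvalue
L_values = np.linspace(-2.0, 2.0, 9)       # tangent points for this component
y = np.linspace(-3.0, 3.0, 601)            # y stands for the loading X @ V_i

true_contrib = E_i * y**2
cuts = np.array([E_i * (2 * L * y - L**2) for L in L_values])
lower_bound = cuts.max(axis=0)             # best cut at each y

print(bool(np.all(lower_bound <= true_contrib + 1e-12)))  # never overestimates
max_gap = float((true_contrib - lower_bound).max())
print(max_gap)                             # worst-case gap over this grid
```

The gap of a single tangent at $L$ is $E_i (y - L)^2$, so the worst error grows quadratically with the distance from the nearest tangent point; that is why the grid of $L$ values matters most for the largest eigenvalues.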
However, this leads to a second question: when evaluating the sources of error in an intermediate solution, how should I decide which components contribute enough error to justify adding a constraint, and which should be lumped together into a single constraint?
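One heuristic I can imagine measuring this with (my own assumption, not from the post): at the incumbent $X$, compute each component's violation, the true contribution $E_i (XV_i)^2$ minus the value its current cuts give, and add the next cut where the violation is largest.

```python
# Per-component violation at an incumbent X (a sketch; names hypothetical).
import numpy as np

def cut_value(E_i, L_list, y):
    """Best existing tangent cut for one component at loading y.

    With no cuts the bound is 0, which is still valid since E_i*y**2 >= 0.
    """
    if not L_list:
        return 0.0
    return max(E_i * (2 * L * y - L**2) for L in L_list)

def violations(E, V, X, cuts_per_component):
    y = (X @ V).ravel()                        # loadings X @ V_i
    true = E * y**2
    est = np.array([cut_value(E[i], cuts_per_component.get(i, []), y[i])
                    for i in range(len(E))])
    return true - est                          # error each component leaves

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T
E, V = np.linalg.eigh(Sigma)
X = rng.standard_normal((1, 4))

gaps = violations(E, V, X, {len(E) - 1: [0.5]})  # only top component has a cut
worst = int(np.argmax(gaps))                     # candidate for the next cut
print(gaps.round(4), worst)
```

Components whose individual violations stay small could then be lumped together, e.g. replaced by one aggregate cut on their combined contribution, though whether that pays off presumably depends on the eigenvalue tail of $\Sigma$.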
I posted a working test to GitHub.