
$X$ is a discrete random variable taking value $x_n$ with probability $1/N$ for $n=1, \ldots,N$. I would like to set the $x_n$ values in an optimization problem. My objective is to minimize the variance while satisfying a set of constraints.

So the problem is:

$$\begin{array}{ll} \min\limits_{\{x_n\}_{n=1}^N} & \operatorname{Var}(X) \\ \text{s.t.} & \ldots \end{array}$$

Denote $x = \begin{pmatrix} x_1 & \ldots & x_N\end{pmatrix}^\top$. So in this discrete distribution, the variance is: $$\operatorname E[X^2] -\operatorname E[X]^2 = \frac{1}{N} \sum_{n=1}^N x_n^2 - \frac{1}{N^2} \left( \sum_{n=1}^N x_n \right)^2 = \frac{x^\top x}{N} - \frac{(\mathbf{e}^\top x)(x^\top \mathbf{e})}{N^2}.$$

My questions are:

  1. The variance is not convex, so minimization is hard, right? Is there a common method for this?
  2. Is there a convex function whose minimization results in a low variance?

Edit: Based on the comments, the variance may in fact be convex. We can write $$\operatorname E[X^2] -\operatorname E[X]^2 = \frac{x^\top \left( I - \tfrac{\mathbf{e}\mathbf{e}^\top}{N}\right) x}{N},$$ so to show convexity it suffices to show that $I - \tfrac{\mathbf{e}\mathbf{e}^\top}{N}$ is a p.s.d. matrix, which is itself a challenge. I would like to see a proof that this matrix is p.s.d., or a proper derivation of the closed-form norm expression for the variance mentioned in the comments.
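A quick numerical sanity check of the p.s.d. claim (a sketch, not a proof; $N=5$ and the random $x$ are arbitrary choices):

```python
import numpy as np

# Centering matrix C = I - ee'/N. Its eigenvalues are 0 (once, with
# eigenvector e) and 1 (N-1 times), so it is positive semidefinite.
N = 5
e = np.ones((N, 1))
C = np.eye(N) - e @ e.T / N

eigvals = np.linalg.eigvalsh(C)  # all in {0, 1} up to rounding

# The quadratic form x'Cx/N reproduces the population variance of x.
rng = np.random.default_rng(0)
x = rng.normal(size=(N, 1))
quad_form = float(x.T @ C @ x) / N
```

In fact $C = I - \tfrac{\mathbf{e}\mathbf{e}^\top}{N}$ is the orthogonal projection onto $\mathbf{e}^\perp$, so its eigenvalues are exactly $0$ and $1$, which proves p.s.d.-ness directly.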

To show the variance is $\ell^2$-norm-representable we need a matrix $A$ with $A^\top A = I - \tfrac{\mathbf{e}\mathbf{e}^\top}{N}$, since then $x^\top A^\top A x = \|Ax\|^2$. Exhibiting such an $A$ would also settle the question.
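One concrete choice works immediately: the centering matrix $C = I - \tfrac{\mathbf{e}\mathbf{e}^\top}{N}$ is symmetric and idempotent ($C^2 = C$), so $C = C^\top C$ and $A = C$ does the job, giving $x^\top C x = \|Cx\|^2$ with $Cx = x - \tfrac{\mathbf{e}^\top x}{N}\mathbf{e}$. A small numerical check (a sketch; the size and data are arbitrary):

```python
import numpy as np

N = 6
e = np.ones((N, 1))
C = np.eye(N) - e @ e.T / N

# C is symmetric and idempotent, so C = C.T @ C: take A = C.
idempotent = np.allclose(C @ C, C)

rng = np.random.default_rng(1)
x = rng.normal(size=(N, 1))
# Hence x'Cx = ||Cx||^2, the norm representation of N * Var(X).
lhs = float(x.T @ C @ x)
rhs = float(np.sum((C @ x) ** 2))
```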

Edit-2: OK, I see it now. The comment uses the representation $\operatorname{Var}(X) = \operatorname E[(X - \operatorname E[X])^2]$, from which convexity is obvious.

independentvariable
  • To me it looks like $\operatorname V(x)$ is a convex quadratic function, since $0\le\operatorname V(x) = x'Hx$ where $H$ is the Hessian of $\operatorname V(x)$. – ErlingMOSEK Sep 11 '19 at 17:46
  • Also $\operatorname V(x)=\frac1N\left\|x-ee^\top x/N\right\|^2$ which clearly is convex. – ErlingMOSEK Sep 11 '19 at 17:54
  • Minimizing $\left\|x-ee^\top x/N\right\|$ is easy using SOCP/conic quadratic optimization. See https://docs.mosek.com/modeling-cookbook/index.html – ErlingMOSEK Sep 11 '19 at 18:04
  • @ErlingMOSEK Is it $\operatorname V(X) =\frac1N \left\|x - \frac{ee^\top x}N\right\|_2^2$ or $\operatorname V(X) = \frac1N\left\|\frac{x - ee^\top x}N \right\|_2^2$ ? Because I can not derive this. Thanks for the answer! – independentvariable Sep 11 '19 at 20:00
  • @ErlingMOSEK I mean, I can see that $V(x) = \frac{x^\top\left(I - \frac{ee^\top}{N}\right)x}{N}$, but from there, how can we get a proper norm? – independentvariable Sep 11 '19 at 20:38
  • To show why $A = I - \left( \frac{e e^T}{n}\right)$ is psd, show that $v^T A v \geq 0 , \forall v$, where $v \in \mathbb{R}^n$. An inequality that will come in handy is $n\left(\sum\limits_{i=1}^{n} v_i^2\right) \geq \left(\sum\limits_{i=1}^{n} v_i \right)^2 $, where $v_i$ is the $i^{th}$ component of $v$. – batwing Sep 11 '19 at 22:41

1 Answer


It holds (using $e^\top e = N$) $$ \begin{array}{rcl} \operatorname V(x) &= &\dfrac1N\left\| x-\dfrac{e^\top x}{N} e \right\|^2 \\ & = & \dfrac1N\left(x^\top x+\dfrac{(e^\top x)^2 e^\top e}{N^2}-2\dfrac{(e^\top x)^2}N\right) \\ & = & \dfrac{x^\top x}{N} - \dfrac{(e^\top x)^2}{N^2}. \end{array} $$ So you are minimizing the squared $\ell^2$-norm of an affine expression in $x$, which is known to be convex.
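A numerical spot-check of this identity (a sketch; $N$ and the random $x$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 7
x = rng.normal(size=N)
e = np.ones(N)

# ||x - (e'x/N) e||^2 / N  should equal  x'x/N - (e'x)^2/N^2,
# and both should equal the population variance of x.
lhs = np.sum((x - (e @ x / N) * e) ** 2) / N
rhs = x @ x / N - (e @ x) ** 2 / N**2
```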

The problem $$ \begin{array}{lcl} \min & \dfrac{\|x-e u\|}{N} & \\ \mbox{s.t.} & \dfrac{e^\top x}{N} - u &= 0 \\ \end{array} $$ provides a nice interpretation, since $u$ is the average. Note that the problem tries to make all entries of $x$ equal to the average value.

Alternatively, the last problem can be stated as $$ \begin{array}{lcl} \min & \dfrac{s}{N}&\\ \mbox{s.t.} & \dfrac{e^\top x}{N} - u &=0 \\ &(s;x-e u) &\in Q, \\ \end{array} $$ where $Q$ is the quadratic (second-order) cone $\{(s; y) : s \ge \|y\|\}$. This provides another proof of convexity, because the quadratic cone is a convex set. Hence, the problem can be solved using SOCP, also known as conic quadratic optimization.
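To illustrate the original optimization problem end to end (a sketch with made-up constraints; a general-purpose smooth solver stands in for a dedicated SOCP code): pin $x_1 = 0$ and $x_2 = 2$ with $N = 4$, and minimize $\operatorname{Var}(X)$ over the two free entries. The optimizer drives both free values to the overall average.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical constraints: the first two outcomes are pinned to 0 and 2;
# we minimize Var(X) over the two remaining free values (N = 4).
fixed = np.array([0.0, 2.0])

def variance(free):
    x = np.concatenate([fixed, free])
    return x.var()  # population variance, i.e. Var(X) for uniform weights

res = minimize(variance, x0=np.zeros(2))
# Analytically: both free entries go to the overall mean 1, Var = 0.5.
```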

ErlingMOSEK