8

Consider the following optimization problem: \begin{equation} \label{eq:1} \min_{x\in\mathcal X} \max_{i\in\mathcal I}\sum_{j\in\mathcal J} f_i(x_{(j)}), \end{equation} where $\mathcal{I}$ and $\mathcal{J}$ are discrete and finite sets, $\mathcal X\subset \mathbb{R}^N$ is a compact set, $(f_i)_{i\in\mathcal J}$ are convex and differentiable functions, and $x_{(j)}$ is a subvector of the global variable $x$. Note that $x_{(j)}$ and $x_{(j')}$ for $j,j'\in\mathcal J$ may overlap.

For example $x = (x_1,x_2,x_3)$, $x_{(1)} = (x_1,x_2)$ and $x_{(2)} = (x_2,x_3)$.

I am investigating whether there is a way to solve this problem in a distributed manner. Note that the problem, without the term "$\max_{i}$", is known as the general form consensus problem and is solved nicely by ADMM (see Boyd et al. Sec. 7.2).

Unfortunately, the term "$\max_{i}$" seemingly makes the problem non-separable, and the active index, i.e., the index $i^\star(x)\in\mathcal I$ which attains the maximum, cannot be computed in a decentralized way.

Is there an algorithm to separate the variable update steps for the problem above and apply an ADMM-like (or similar) algorithm in a decentralized way over $j\in\mathcal J$?

Update: I have found this monograph (Sec. 7.5) in which there is a problem of the type $$ \min_{x\in\mathcal X} \varphi(f_1(x),\dots,f_J(x)), $$ where $\varphi(\cdot)$ is a convex non-decreasing function.

Then they formulate the problem in epigraph form and propose a method to solve it. Any idea on whether this method will work in my case?

Apprentice
  • 215
  • 1
  • 9

1 Answers1

2

The problem could be written as $$ \begin{array}{rll} \mbox{minimize} &u\\ \mbox{subject to} & u \geq \sum_{j\in J}f_i(x_{(j)}) &i\in I \end{array} $$ which is much more similar to the consensus form. Introducing $u_i$ and $x_i$, we got $$ \begin{array}{rll} \mbox{minimize} &u\\ \mbox{subject to} & u_i \geq \sum_{j\in J}f_i(x_{i(j)}) &i\in I\\ &u_i=u &i\in I\\ &x_i=x &i\in I \end{array} $$ Then we are ready to use ADMM. The argumented Lagrange is $$ L(x,x_i,u,u_i,y_i,v_i)=u+\sum_i\left[v_i(u_i-u)+\rho/2(u-u_i)^2+y_i^T(x_i-x)+\rho/2\|x-x_i\|_2^2+\mathbb{I}\left(u_i\geq \sum_{j\in J}f_i(x_{i(j)})\right)\right] $$

The ADMM algorithm is $$ \begin{align} u_i^{k+1},x_i^{k+1}&= \arg\min\{v^k_i(u_i-u^k)+\rho/2(u^k-u_i)^2+y_i^{kT}(x_i-x^k)+\rho/2\|x^k-x_i\|_2^2,\;\mathrm{s.t.}\; u_i\geq \sum_{j\in J}f_i(x_{i(j)})\}\\ u_{k+1}&=\frac{\sum_i(\rho u_i^{k+1}+v_i^k)-1}{N\rho}\\ x_{k+1}&=\frac 1 N \sum_i(x_i^{k+1}+(1/\rho)y_i^k)\\ v_i^{k+1}&=v_i^k+\rho(u_i^{k+1}-u^{k+1})\\ y_i^{k+1}&=y_i^k+\rho(x_i^{k+1}-x^{k+1}) \end{align} $$

xd y
  • 1,186
  • 3
  • 9
  • Thanks for your answer @xd y. I noticed however that the update of the $x_k$ should not involve all the components of x but only the ones that are shared among the other clusters (similarly to the reference Boyd et al. Sec. 7.2 mentioned in the question) – Apprentice Mar 09 '22 at 12:33
  • Also, what do you mean by $x_{i(j)}$? The problem should be distributed over $j\in\mathcal J$ and not $i\in \mathcal I$ since those are the nodes involved are $j\in\mathcal J$. – Apprentice Mar 09 '22 at 17:55
  • Also the function $\mathbb I$ is the function that is infinity when the argument is false and zero when it's true? – Apprentice Mar 09 '22 at 19:23
  • $x_i$ is the "local variable", and by $x_{i(j)}$, I mean the $j$'s subvector of the local variable $x_i$. My answer is a algorithm distributed over $i\in\mathcal{I}$, so I think I've misunderstood your question. The $\mathbb{I}$ function does mean that. Sorry for the unstandard notations and poor language. – xd y Mar 10 '22 at 02:00
  • Still I don't get whar you mean by "j's subvector of the local variable". Can you explain it with an example? – Apprentice Mar 10 '22 at 07:28
  • For example $x=(x_1,x_2,x_3)$, $x_{(1)}=(x_1,x_2)$ and $x_{(2)}=(x_2,x_3)$. For all $i\in\mathcal{I}$, there is a decision variable $x_i \in \mathbb{R}^3$, let's say $x_i=(a_i,b_i,c_i)$, and by $x_{i(1)}$ I mean $(a_i,b_i)$, by $x_{i(2)}$ I mean $(b_i, c_i)$. Again, my answer is an algorithm distributed on $i\in\mathcal{I}$, which may not be what you want. – xd y Mar 13 '22 at 09:40