2

I have a $m$ by $n$ matrix $X$ of binary variables in my MIP which represents a list of $m$ items each belonging to one of $n$ categories. $m$ is usually around $1,000$ while $n$ is much lower at around $10$. If $X_{i,j}=1$, then it means that item $i$ belongs to category $j$ (and 0 would mean that the item doesn't belong to that category).

In addition to this, I have a list of costs $w_{i,j}$ which contribute to a total cost $C$ if and only if item $i$ belongs to the same category as item $j$. For example, if $w_{300,400}=999$, then $C$ will be increased by 999 if items 300 and 400 belong to the same category (regardless of what category it is). Even though there are potentially up to a million possible pairs, the number of pairs in $w$ is usually at most around $10,000$.

What's the best way to represent this total cost $C$ in my MIP? My mathematical intuition would be to take the dot product of $X_i$ and $X_j$ and multiply by $w_{i,j}$ for all the pairs, but multiplying two variables is invalid in MIP.

RobPratt
  • 32,006
  • 1
  • 44
  • 84

1 Answers1

5

The general approach for linearizing a product of binary variables is here: How to linearize the product of two binary variables?

But you can exploit some special structure in your setting and avoid most products. For each pair of items $i$ and $j$ (with $i<j$) that have positive cost $w_{ij}$, introduce binary (or nonnegative) variable $Y_{ij}$ to represent the inner product $\sum_k X_{ik}X_{jk}$. The problem is to minimize $\sum_{i<j} w_{ij} Y_{ij}$ subject to \begin{align} \sum_k X_{ik} &= 1 &&\text{for all $i$}\\ X_{ik} + X_{jk} - 1 &\le Y_{ij} &&\text{for all $k$ and all $i<j$ with $w_{ij}>0$} \end{align} Contrast this with the usual linearization that would instead introduce four-indexed variables $Y_{ikj\ell}$ to represent the product $X_{ik} X_{j\ell}$.

RobPratt
  • 32,006
  • 1
  • 44
  • 84