10

I am wondering if the following problem has a name, or any results related to it.

Let $G = (V,w)$ be a weighted graph where $w(u,v)$ denotes the weight of the edge between $u$ and $v$, and for all $u,v \in V$, $w(u,v) \in [-1,1]$. The problem is to find a subset of vertices that maximizes the sum of the weights of the edges adjacent to them: $$\max_{S \subseteq V} \sum_{(u,v) : u \in S\ \textrm{or}\ v\in S} w(u,v)$$ Note that I am counting edges both that are inside the subset and that are outside the subset, which is what distinguishes this problem from max-cut. However, even if both $u$ and $v$ are in $S$, I only want to count the edge $(u,v)$ once (rather than twice), which is what distinguishes the objective from merely being the sum of the degrees.

Note that the problem is trivial if all edge weights are non-negative -- simply take the whole graph!

Andrew D. King
  • 1,573
  • 10
  • 19
Aaron Roth
  • 9,870
  • 1
  • 43
  • 68
  • Your definition doesn't match your note later on about not counting duplicate edges. Are you summing over ordered pairs or 2-element subsets ? (the latter would give you the property you need, I think) – Suresh Venkat Apr 28 '11 at 21:28
  • 1
    Another note: the only edge weights NOT counted are those inside V \ S. Are you interested in hardness results or approximations, because in the former case, minimizing the sum of edge weights inside S' = V \ S might be the more natural problem. – Suresh Venkat Apr 28 '11 at 21:31
  • @Suresh: The formal definition in the question is correct as long as approximation ratio is concerned. It just gives exactly the twice of what one expects from the words. – Tsuyoshi Ito Apr 28 '11 at 22:08
  • @TsuyoshiIto: oh I see, because edges across the cut are also counted twice. – Suresh Venkat Apr 28 '11 at 22:09
  • Yes, I mean to sum over ordered pairs of elements $(u,v)$, counting each distinct edge $e = $(u,v)$ only once in the sum. I am more interested in approximations, but hardness results would also be enlightening. – Aaron Roth Apr 28 '11 at 22:20
  • 1
    The exact problem is NP-hard because, as Suresh wrote in his comment, the problem is equivalent to the unrestricted {0,1} quadratic programming, which is NP-hard. – Tsuyoshi Ito May 01 '11 at 13:49
  • Related to Suresh and Tsuyoshi's comments: the problem can be written as $\max_{x \in {0, 1}^n}{e^TAe - x^TAx}$, where $e = (1, \ldots, 1)$ and $A$ is the weighted adjacency of the graph. So the problem is trivial when $A$ is PSD (not just when $A$ has all entries positive): in that case $x = (0, \ldots, 0)$ is optimal (equivalent to $S = V$). Also notice that if you can find $x \in [0, 1]^n$ then you can round the vector without losing anything. I would try the case when $e^TAe \geq 0$ and $A$ is negative semidefinite (so you need to solve $\max_{x \in [0, 1]^n}{|Ux|^2}$, $U^TU = -A$). – Sasho Nikolov May 02 '11 at 15:37

2 Answers2

6

FWIW, your problem is hard to approximate within a multiplicative factor of $n^{1-\epsilon}$ for any $\epsilon>0$.

We show that below by giving an approximation-preserving reduction from Independent Set, for which the hardness of approximation is known.

Reduction from Independent Set

Let undirected graph $G=(V,E)$ be an instance of Independent Set. Let $d_v$ denote the degree of vertex $v$ in $G$. Let $n$ be the number of vertices in $G$.

Construct edge-weighted graph $G'=(V',E')$ from $G$ as follows. Give each edge in $E$ weight 1. For each non-isolated vertex $v\in V$, add $d_v-1$ new edges, each with weight $-1$, to $d_v-1$ new vertices. For each isolated vertex $v\in V$, add one new edge of weight 1 to a new vertex.

(Note: each new vertex (in $G'$ but not $G$) has exactly one neighbor, which is in $G$.)

Lemma. $G$ has an independent set of size $k$ iff $G'$ (as an instance of your problem) has a solution of value at least $k$.

Proof. Let $S$ be any independent set in $G$. Then, since the vertices in $S$ are independent in $G'$, the value of $S$ in $G'$ (by your objective) is $$\sum_{v\in S} d_v - (d_v-1) ~=~ |S|.$$

Conversely, let $S$ be a solution for $G'$ of value at least $k$. Without loss of generality, assume $S$ contains no new vertices. (Each new vertex $v'$ is on a single edge $(v',v)$. If $v$ was not isolated in $G$, then the weight of the edge is $-1$, so removing $v'$ from $S$ increases the value of $S$. If $v$ was isolated, then the weight of the edge is 1, so removing $v'$ from $S$ and adding $v$ maintains the value of $S$.)

Without loss of generality, assume that $S$ is an independent set in $G$. (Otherwise, let $(u,v)$ be an edge such that $u$ and $v$ are in $S$. The total weight of $v$'s incident edges in $G'$ is $d_v - (d_v-1) = 1$, so the total weight of $v$'s incident edges other than $(u,v)$ is at most zero. Thus, removing $v$ from $S$ would not increase the value of $S$.)

Now, by the same calculation as at the start of the proof, the value of $S$ is $|S|$. It follows that $|S| \ge k$. QED

As an aside, you might ask instead for an additive approximation, of, say, $O(n)$ or $\epsilon m$.

It seems possible to me that for your problem even deciding whether there is a positive-value solution could be NP-hard.

Neal Young
  • 10,546
  • 33
  • 60
3

Not really a solution but some observations.

This is a special case of the following problem: given a universe $U = \{1, \ldots, m\}$, and a collection of sets $S_1, \ldots, S_n \subseteq U$, and a weight function $w:U \rightarrow [-1, 1]$, find the set $I \subseteq [n]$ such that $w(\bigcup_{i \in I}{S_i})$ is maximized (the weight of a set is the total weight of its elements). Your problem corresponds to the case where each element of $U$ appears in exactly two sets (but I am not sure how to exploit this restriction, although it might help).

This is a coverage problem: similar to Max-k-Set-Cover, but without the restriction to use $k$ sets and with negative weights allowed. The greedy approximation of Max-k-Set-Cover (at each step output the set that has the largest weight of uncovered elements so far) outputs a sequence of sets such that the first $k$ sets are a $1 + 1/e$ approximation to the optimum (so this is a simultaneous approximation for all $k$). Unfortunately, as usual, there is a problem with analyzing it when weights could be negative. The basic observation of the analysis of the greedy algorithm is that if $S_1$ is the first set that is output, then $w(S_1) \geq OPT_k/k$ ($OPT_k$ being the maximum weight covered by $k$ sets), because $OPT_k$ is less than the sum of the weights of the $k$ sets in the optimal solution, and each of them has weight less than $w(S_1)$. However, with negative weights it's no longer true that $OPT_k$ is less than the sum of the weights in the optimal solution. In general, a union bound is no longer true.

Sasho Nikolov
  • 18,189
  • 2
  • 67
  • 115