12

I'm looking for a way to generate graphs so that the optimal vertex cover is known. There are no restrictions on the number of nodes or edges, only that the graph is completely connected.

the idea is to generate a graph that's not easy to find the optimal vertex cover, to be able to test different heuristics on it

I found the paper Arthur, J. & Frendeway, J. Generating Travelling-Salesman Problems with Known Optimal Tours, The Journal of the Operational Research Society, Vol. 39, No. 2 (Feb., 1988), pp. 153-159 for generating TSP with known optimal, alas I cannot access it.

Is there a known algorithm?

AndresQ
  • 199
  • 9
  • 6
    "There are no restrictions on the number of nodes or edges, only that the graph is completely connected." You need more restrictions that this. Otherwise, I generate the set of complete graphs and know optimal vertex covers for each one. – Tyson Williams Aug 14 '12 at 12:45
  • 3
    Here is a very simple approach: First, construct a matching $M$. For each edge $e \in M$, add one endpoint to your vertex cover $C$. Then add any number of additional nodes. Then add any number of edges, so that each edge has at least one endpoint in $C$. Necessarily $C$ will be a minimum vertex cover. (Unfortunately, there are lots of graphs that cannot be generated this way: e.g., $K_3$.) – Jukka Suomela Aug 14 '12 at 12:59
  • @JukkaSuomela I've been doing some tests, but I must be doing something wrong. When you say "add one endpoint to your vertex cover" does it mean any endpoint of the edge? Because I tried this and sometimes the cover doesn't cover all edges, and if I add both nodes then it isn't minimal – AndresQ Aug 14 '12 at 15:27
  • 1
    @Shaga, Jukka's construction works, you have a set of independent vertices $I$ with a matching to a set of vertices $C$. You can stick any edges in that maintain $I$'s independence (assuming you have the matching as a base). In this case $|I| = |C|$, you can push it a little further and have $|I|\geq|C|$, as long as $C$ is matched into $I$ (i.e. there's a matching between $C$ and $I$ such that all the vertices in $C$ are matched). – Luke Mathieson Aug 15 '12 at 01:06
  • I'm sorry but I still can't figure it out. On this image is a generated graph that I made with Jukka's method.

    The yellow edges are the matching, and the yellow vertices are the vertex cover, if I add both vertices for each edge of the matching

    However, if I'm not mistaken, the left vertex of the upper right matching edge isn't necessary, and it can be removed from the cover set, hence the cover isn't minimal.

    On the other hand if I add to C only one vertex for each M, then it isn't a proper vertex cover... what am I missing?

    – AndresQ Aug 15 '12 at 15:43
  • 1
    @Shaga: You start with an empty graph $G$. You first construct $M$; you add these edges and their endpoints to $G$. Then you choose your $C$, one endpoint of each edge in $M$; now your $C$ is fixed. Then you can add any number of nodes and edges to $G$, but every time you add an edge, you have to make sure that at least one endpoint of the edge is in $C$. – Jukka Suomela Aug 16 '12 at 15:48
  • 5
    I suppose "generate a random bipartite graph and compute its vertex cover" doesn't count as a useful answer... – David Eppstein Aug 16 '12 at 19:00
  • @JukkaSuomela I understand now you algorithm, and it works, the problem is that the kind of graphs it generates are almost always (I think always) solvable with a simple max degree heuristic. I'm wondering if there's another way to generate harder graphs. If not I'll give you the answer and the bounty :) – AndresQ Aug 17 '12 at 00:12
  • 1
    I doubt max degree works for Jukka's graphs, but I think they can still be solved in polynomial time. Find and remove the edges that cannot be part of a perfect matching (they are all the C-C edges, and possibly some others); then the remaining graph is bipartite and the optimal cover is one side of the bipartition in each of its connected components. To choose which side to use, solve a 2-sat problem with a variable for each component and a constraint for each of the removed edges. – David Eppstein Aug 17 '12 at 01:20
  • 1
    On further searching: Jukka's graphs appear to be exactly the König-Egerváry graphs (graphs in which vertex cover = maximum matching). And polynomial time algorithms for finding vertex covers in these graphs (not necessarily the same as the one I outlined in my comment) have been long known. – David Eppstein Aug 17 '12 at 07:19
  • 2
    there are many strategies for creating "hard" SAT instances & also repositories of archived "hard" instances if you're willing to go that route-- ie converting an instance of SAT to vertex cover. also theres much research studying SAT from an empirical pov which naturally translates into all other NP complete problems eg transition point etc. many refs on all of this... – vzn Aug 17 '12 at 17:57
  • 2
    Even more generally than the polynomial time solvability of Vertex cover on Koning graphs as noted by David, the following result is known from the area of parameterized complexity: there is a constant c such that for every fixed integer k, there is an O(n^c) time algorithm to test whether a graph has a vertex cover that exceeds its maximum matching size by at most k. Konig graphs are the special case when k=0. – Bart Jansen Aug 19 '12 at 12:31

2 Answers2

10

Expanding vzn's comment into an answer: The standard reduction from CNF-SAT to vertex cover is pretty easy: make a vertex for each term (variable or its negation), connect each variable to its negation by an edge, make a clique for each clause, and connect each vertex in the clique to the vertex for one of the terms in the clause. If you start with a satisfiability problem with a known satisfying assignment, this will give you a vertex cover problem with a known optimal solution (choose the term vertices given by the assignment, and in each clause clique choose all but one vertex, so that the clause vertex that is not chosen is adjacent to a term vertex that is chosen).

So now you need to find satisfiability problems that have a known satisfying assignment but where the solution is hard to find. There are many known ways of generating hard satisfiability problems (e.g. generate random k-SAT instances close to the satisfiability threshold) but the extra requirement that you know the satisfying assignment restricts the possibilities. One thing you can do here is go through another level of reduction, from a cryptographically hard problem such as factorization. I.e. choose two large primes p and q, set up a Boolean circuit for multiplying p and q as binary numbers, and translate it into a CNF formula in which there is a variable for each input (p and q) and for each intermediate value on a wire in the circuit, a clause for each output forcing it to have the right value, and a clause for each gate forcing the inputs and outputs of the gate to be consistent with each other. Then translate this CNF formula into vertex cover.

For a simpler strategy, choose the satisfying assignment to a 3CNF formula first, and then generate clauses at random, keeping only the clauses that are consistent with the assignment, and then convert to vertex cover. If the clauses have uniform probability this will be vulnerable to a degree-based heuristic (the term vertices that match the chosen assignment will have lower degree than the term vertices that do not) but this shortcoming can be avoided by adjusting the probabilities of the clauses according to how many of the clause's terms agree with the chosen assignment. Probably this is vulnerable to some kind of polynomial time attack but it might not be one that's natural for vertex cover, so it might make a good set of test instances despite not having much guarantee of hardness.

David Eppstein
  • 50,990
  • 3
  • 170
  • 278
  • 2
    Relying on the hardness of factorisation is a very good idea. I think something along those lines is the best that we can expect if we want an efficient algorithm for generating vertex cover instances. Informally, this requirement forces us to focus on co-NP $\cap$ NP type problems, instead of NP-hard problems. – Jukka Suomela Aug 20 '12 at 22:31
2

closest ref I found was-- On hard instances of approximate vertex cover by Sundar Vishwanathan. did not see refs for looking at hard instances of the exact problem.

as in my comment, there is massive amt of research into this corresponding approach for SAT which is reducible to vertex cover.

re DEs comment, the idea of generating random instances and just choosing those instances that are hard for a standard algorithm seems totally reasonable to me esp with an empirical/experimental research approach[1], its a standard operating procedure for similar research into the SAT transition point.[2]

which by the way does have something to say where the "hard" region is for any other NP complete problem[3,4,5] which relates roughly to a critical point in the "density" of 1s in random instances specified in binary. for vertex cover this would probably correspond to edge density.

note that proving one can construct a set of hard instances, and only hard instances, is basically equivalent to the P vs NP problem. a more formal analysis of this equivalence is in the Razborov/Rudich Natural Proofs paper.

[1] experimental algorithmics

[2] SAT phase transition research

[3] Phase Transitions in NP Hard Problems

[4] Phase transitions in NP-complete problems: a challenge for probability, combinatorics, and computer science by Moore

[5] Phase transition behavior by Walsh

vzn
  • 11,014
  • 2
  • 31
  • 64