40

I came upon wait-for graphs and I wonder, are there any efficient algorithms for detecting if adding an edge to a directed graph results in a cycle?

The graphs in question are mutable (they can have nodes and edges added or removed). And we're not interested in actually knowing an offending cycle, just knowing there is one is enough (to prevent adding an offending edge).

Of course it'd be possible to use an algorithm for computing strongly connected components (such as Tarjan's) to check if the new graph is acyclic or not, but running it again every time an edge is added seems quite inefficient.

Petr
  • 61,569
  • 11
  • 147
  • 305
  • Do you have memory or time constraints in mind? – Mark Elliot Nov 29 '13 at 17:17
  • @MarkElliot The graphs are used to coordinate locking of many concurrently running threads. I expect the graph to have dozens to hundreds of nodes and the number of edges could be quadratic in the number of nodes. The graph will be frequently updated, and all updates will need to check, if they don't cause a cycle. – Petr Nov 29 '13 at 17:45
  • A naive starting approach would be to carry a data structure at each node of nodes which can reach the current node. This would make cycle detection cheap on addition and update, but add complexity to remove. I'm wondering if there's a data structure better suited to tracking this (an adjacency matrix?), but am still thinking through stuff. – Mark Elliot Nov 29 '13 at 17:53
  • From the description it sounds like edge removal will be about as frequent as adding, which pretty much rules out any data caching since removal invalidates everything (or at least I don't see how you can avoid this). My guess, therefore, is that the simplest solution (when adding edge `A`->`B`, flood fill from `B` and check that `A` is not reachable) is going to be the most efficient. –  Nov 29 '13 at 18:04
  • If the graph is from a "wait-for graph" then nodes can only be removed when something completes. So deletion only occurs at the leaf nodes. Am I correct? – Jacob Dec 06 '13 at 14:53
  • 2
    @PetrPudlák can you evolve what you're missing from Thomas's answer? I consider this question answered. You're looking to detect cycles at time of insertion? – user Dec 06 '13 at 16:00
  • @Jacob That is not necessarily true. A node can be waiting for another node with a timeout, and when it elapses, the node is removed. – Petr Dec 06 '13 at 17:21

4 Answers4

42

If I understood your question correctly, then a new edge (u,v) is only inserted if there was no path from v to u before (i.e., if (u,v) does not create a cycle). Thus, your graph is always a DAG (directed acyclic graph). Using Tarjan's Algorithm to detect strongly connected components (http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm) sounds like an overkill in this case. Before inserting (u,v), all you have to check is whether there is a directed path from v to u, which can be done with a simple BFS/DFS.

So the simplest way of doing it is the following (n = |V|, m = |E|):

  • Inserting (u,v): Check whether there is a path from v to u (BFS/DFS). Time complexity: O(m)
  • Deleting edges: Simply remove them from the graph. Time complexity: O(1)

Although inserting (u,v) takes O(m) time in the worst case, it is probably pretty fast in your situation. When doing the BFS/DFS starting from v to check whether u is reachable, you only visit vertices that are reachable from v. I would guess that in your setting the graph is pretty sparse and that the number of vertices reachable by another is not that high.

However, if you want to improve the theoretical running time, here are some hints (mostly showing that this will not be very easy). Assume we aim for testing in O(1) time whether there exists a directed path from v to u. The keyword in this context is the transitive closure of a DAG (i.e., a graph that contains an edge (u, v) if and only if there is a directed path from u to v in the DAG). Unfortunately, maintaining the transitive closure in a dynamic setting seems to be not that simple. There are several papers considering this problem and all papers I found were STOC or FOCS papers, which indicates that they are very involved. The newest (and fastest) result I found is in the paper Dynamic Transitive Closure via Dynamic Matrix Inverse by Sankowski (http://dl.acm.org/citation.cfm?id=1033207).

Even if you are willing to understand one of those dynamic transitive closure algorithms (or even want to implement it), they will not give you any speed up for the following reason. These algorithms are designed for the situation, where you have a lot of connectivity queries (which then can be performed in O(1) time) and only few changes in the graph. The goal then is to make these changes cheaper than recomputing the transitive closure. However, this update is still slower that a single check for connectivity. Thus, if you need to do an update on every connectivity query, it is better to use the simple approach mentioned above.

So why do I mention this approach of maintaining the transitive closure if it does not fit your needs? Well, it shows that searching an algorithm consuming only O(1) query time does probably not lead you to a solution faster than the simple one using BFS/DFS. What you could try is to get a query time that is faster than O(m) but worse than O(1), while updates are also faster than O(m). This is a very interesting problem, but it sounds to me like a very ambitious goal (so maybe do not spend too much time on trying to achieve it..).

Thomas
  • 760
  • 5
  • 9
5

As Mark suggested it is possible to use data structure that stores connected nodes. It is the best to use boolean matrix |V|x|V|. Values can be initialized with Floyd–Warshall algorithm. That is done in O(|V|^3).

Let T(i) be set of vertices that have path to vertex i, and F(j) set of vertices where exists path from vertex j. First are true's in i'th row and second true's in j'th column.

Adding an edge (i,j) is simple operation. If i and j wasn't connected before, than for each a from T(i) and each b from F(j) set matrix element (a,b) to true. But operation isn't cheap. In worst case it is O(|V|^2). That is in case of directed line, and adding edge from end to start vertex makes all vertices connected to all other vertices.

Removing an edge (i,j) is not so simple, but not more expensive operation in the worst case :-) If there is a path from i to j after removing edge, than nothing changes. That is checked with Dijkstra, less than O(|V|^2). Vertices that are not connected any more are (a,b):

  • a in T(i) - i - T(j),
  • b in F(j) + j

Only T(j) is changed with removing edge (i,j), so it has to be recalculated. That is done by any kind of graph traversing (BFS, DFS), by going in opposite edge direction from vertex j. That is done in less then O(|V|^2). Since setting of matrix element is in worst case is again O(|V|^2), this operation has same worst case complexity as adding edge.

Ante
  • 5,230
  • 6
  • 22
  • 46
0

If all previous jobs are in Topologically sorted order. Then if you add an edge that appears to brake the sort, and can not be fixed, then you have a cycle.

https://stackoverflow.com/a/261621/831850

So if we have a sorted list of nodes:

1, 2, 3, ..., x, ..., z, ...
Such that each node is waiting for nodes to its left.

Say we want to add an edge from x->z. Well that appears to brake the sort. So we can move the node at x to position z+1 which will fix the sort iif none of the nodes (x, z] have an edge to the node at x.

Community
  • 1
  • 1
Jacob
  • 1,313
  • 13
  • 28
0

If the graph is directed, you would only have to check the parent nodes (navigate up until you reach the root) of the node where the new edge should start. If one of the parent nodes is equal to the end of the edge, adding the edge would create a cycle.

  • 3
    The graph need not be a tree. The root may be reachable through different ancestors, and you would have to check all of them. Not that this is particularly costly, but it *is* a potentially O(|E|) solution -- and exactly what @Thomas is proposing. – tucuxi Dec 06 '13 at 16:49