27

Let $L_1,L_2$ be two regular languages given by NFAs $M_1,M_2$ as input.

Assume we would like to check whether $L_1\cap L_2\neq \emptyset$. This can clearly be done by a quadratic algorithm which computes the product automaton of $M_1,M_2$, but I was wondering if anything more efficient is known.

Is there a $o(n^2)$ algorithm for deciding whether $L_1\cap L_2\neq \emptyset$? What is the fastest known algorithm?

Juho
  • 3,170
  • 2
  • 26
  • 36
R B
  • 9,448
  • 5
  • 34
  • 74
  • 5
    If anyone has good ideas please let me know, but currently it's an open problem. If you could solve this problem in say linear time, then basically any problem solvable by a non-deterministic machine using only n bits of memory could be solved deterministically in $2^{\frac{n}{2}}$ time. – Michael Wehar Jan 14 '15 at 17:05
  • 1
    @MichaelWehar - is linear time algorithm an open question or any sub-quadratic algorithm? – R B Jan 14 '15 at 17:10
  • 6
    I think it's still open for any sub-quadratic. Some more info: https://rjlipton.wordpress.com/2009/08/17/on-the-intersection-of-finite-automata/ – Michael Wehar Jan 14 '15 at 17:13
  • 3
    I work on this problem. If you're interested, we could chat sometime. Just send me an email. :) – Michael Wehar Jan 14 '15 at 17:16
  • 4
    So for example (based on Michael's comment), the strong exponential time hypothesis implies that the exponent should be 2. I think this could also proved to follow from the exponential time hypothesis... – Ryan Williams Jan 15 '15 at 05:25
  • 2
    this is a more recent blog by rjlipton citing MWs newer/ tighter results (building on lipton's own). that award-winning paper apparently is the best current known results/ implications & is essentially an answer to the question. essentially a tighter bound proves P≠NL! this also seems to be yet another case of proving lower bounds by proving upper bounds – vzn Jan 15 '15 at 16:51
  • 4
    RB: Assume you can test for emptiness of two DFAs of size $n$ in time $n^{1+\varepsilon}$ with $\varepsilon < 1$. Now, if you have $k$ DFAs of size $n$, you can build the product of the first $k/2$ DFAs and of the remaining $k/2$ DFAs. Then, test for emptiness in time $(n^{k/2})^{1+\varepsilon} = n^{\frac{1}{2}k + \frac{\varepsilon}{2}k}$ which is better than $n^k$.

    vzn: This award-winning paper was written by @MichaelWehar who commented in this thread. Michael, perhaps you could submit an answer, if you have time!

    – Michael Blondin Jan 15 '15 at 19:18
  • Thanks for the kind comments. Ok, I will write an answer. I'm working on it and should finish a little later tonight. – Michael Wehar Jan 17 '15 at 00:15
  • 4
    @RyanWilliams Hi Ryan, what leads you to think that the weaker exponential time hypothesis implies that we can't solve intersection non-emptiness for two DFA's more quickly? Someone else suggested this to me once too. :) – Michael Wehar Jan 17 '15 at 03:31

1 Answers1

26

Simple answer: If there does exist a more efficient algorithm that runs in $O(n^{\delta})$ time for some $\delta < 2$, then the strong exponential time hypothesis would be refuted.


We will prove a stronger theorem and then the simple answer will follow.

Theorem: If we can solve the intersection non-emptiness problem for two DFA's in $O(n^{\delta})$ time, then any problem that's non-deterministically solvable using only n bits of memory is deterministically solvable in $poly(n)\cdot2^{(\delta n/2)}$ time.

Justification: Suppose that we can solve intersection non-emptiness for two DFA's in $O(n^{\delta})$ time. Let a non-deterministic Turing machine M with a read only input tape and a read/write binary work tape be given. Let an input string x of length n be given. Suppose that M doesn't access more than n bits of memory on the binary work tape.

A computation of M on input x can be represented by a finite list of configurations. Each configuration consists of a state, a position on the input tape, a position on the work tape, and up to n bits of memory that represent the work tape.

Now, consider that the work tape was split in half. In other words, we have a left section of $\frac{n}{2}$ cells and a right section of $\frac{n}{2}$ cells. Each configuration can be broken up into a left piece and a right piece. The left piece consists of the state, the position on the input tape, the position on the work tape, and the $\frac{n}{2}$ bits from the left section. The right piece consists of the state, the position on the input tape, the position on the work tape, and the $\frac{n}{2}$ bits from the right section.

Now, we build a DFA $D_1$ whose states are left pieces and a DFA $D_2$ whose states are right pieces. The alphabet characters are instructions that say which state to go to, how the tape heads should move, and how the work tape's active cell should be manipulated.

The idea is that $D_1$ and $D_2$ read in a list of instructions corresponding to a computation of M on input x and together verify that it is valid and accepting. Both $D_1$ and $D_2$ will always agree on where the tape heads are because that information is included in their input characters. Therefore, we can have $D_1$ verify that the instruction is appropriate when the work tape position is in the left piece and $D_2$ verify when in the right piece.

In total, there are at most $poly(n) \cdot 2^{n/2}$ states for each DFA and at most $poly(n)$ distinct alphabet characters.

By the initial assumption, it follows that we can solve intersection non-emptiness for the two DFA's in $poly(n) \cdot 2^{(\delta n /2)}$ time.

You might find this helpful: https://rjlipton.wordpress.com/2009/08/17/on-the-intersection-of-finite-automata/


CNF-SAT is solvable using $k+O(\log(n))$ bits of memory where k is the number of variables. The preceding construction can be used to show that if we can solve intersection non-emptiness for two DFA's in $O(n^{\delta})$ time, then we can solve CNF-SAT in $poly(n) \cdot 2^{(\delta k/2)}$ time. Therefore, the simple answer holds.

Comments, corrections, suggestions, and questions are welcomed. :)

Michael Wehar
  • 5,127
  • 1
  • 24
  • 54
  • 2
    all great but the original question above was for NFAs; does it all still follow? also, some key detail is left out. seems worth a paper. or is it all in your published one? if so plz cite it/ tie in to numbered theorems there etc. also it would be helpful to state this all more in terms of std complexity classes L,P,NP,PSpace,ExpTime etc if possible. also wondering if anything can be said about the direction "strong" 2-way DFA intersection emptiness lower bound ($\Omega(n^2)$?) → SETH...? what needs to be shown for that? – vzn Nov 05 '15 at 05:57
  • 2
    Hi VZN, the intersection problem for NFA's appears slightly harder than the intersection problem for DFA's. However, this is not the case because there is a parameterized reduction from intersection non-emptiness for $k$ NFA's to intersection non-emptiness for $k$ DFA's. This is mentioned in my first paper. – Michael Wehar Nov 05 '15 at 16:34
  • 1
    Maybe you can describe in more detail what you mean when you mention 2-way DFA. The non-emptiness problem for $(2k-2)$-turn two-way DFA's is just as hard as the intersection non-emptiness problem for $k$ one-way DFA's. Link: http://cs.stackexchange.com/questions/13456/what-is-the-complexity-of-the-emptiness-problem-for-2-way-dfas/48560#48560 – Michael Wehar Nov 05 '15 at 16:58
  • 1
    In terms of standard complexity classes, you could state this as something like: $NSpace(2 \cdot \log(n)) \subseteq DTime(n)$ implies SETH is false. When I write $NSpace(2 \cdot \log(n))$, I mean $2\cdot \log(n)$ space for non-deterministic Turing machines with a two-way read only input tape and a two-way read/write binary work tape. – Michael Wehar Nov 05 '15 at 17:01
  • 1
    @MichaelWehar 1. If we can solve intersection of $k$ DFA's in $O(nk)$ time what is the consequence? 2. If we can solve intersection of $k$ DFA's in $O(n^{f(k)})$ time what is the consequence where $f=o(k)$ (might be logarithmic or $\sqrt k$ etc.)? – Turbo Mar 12 '21 at 03:51
  • 1
    @1.. Hi there! Thank you for your questions. I assume that you are considering that $k$ can vary with the input. Regarding (1), this would mean that $PTIME = PSPACE$ because the intersection non-emptiness problem is $PSPACE$-complete (Kozen 1977). Maybe we could prove additional implications too besides just $PTIME = PSPACE$. Regarding (2), this would imply that $PTIME \neq NLOGSPACE$ (see my paper from 2014). This would also imply many additional things so you might want to look at this recent paper too: https://link.springer.com/chapter/10.1007/978-3-030-48516-0_6 – Michael Wehar Mar 12 '21 at 18:35
  • 1
    @1.. I actively work on this problem and would always be excited to chat with you about it. Send me an email anytime. Thank you and I hope that you have a nice day! :) – Michael Wehar Mar 12 '21 at 18:38
  • 1
    @MichaelWehar Thank you $\mathsf{PTIME}\neq\mathsf{NL}$ through $n^{o(k)}$ is surprising and counterintuitive to me. 1.) I think I am looking for an assumption consistent to $\mathsf{CH}\neq\mathsf{PSPACE}$ but $\mathsf{CH}=\mathsf{L}$ or possibly $\mathsf{CH}=\mathsf{TC}^0$. For the working hypothesis which is the strongest result on $k$ DFA one can get? 2.) What if $k$ DFA problem is partially fixed parameter tractable which is $f(k)\mathsf{poly}(n)$ algorithm exists where $f(k)$ is superpolynomial or superexponential? – Turbo Mar 12 '21 at 22:02
  • 1
    @MichaelWehar I posted a complete question in https://cstheory.stackexchange.com/questions/48573/dfsa-and-nfsa-intersection-problem. – Turbo Mar 13 '21 at 02:50
  • @1.. I am still thinking about your question, but for the second part, we still get $PTIME \neq NLOGSPACE$ even if it runs in $f(k) \cdot n^{o(k)}$. Based on the paper, we can actually prove something even stronger regarding the fixed slices. Say that $k$-intersection is the $k$th slice, then the result says the following. If for all $\alpha > 0$, there exists $k$ such that $k$-intersection $\in DTIME(n^{\alpha \cdot k})$, then $PTIME \neq NLOGSPACE$. – Michael Wehar Mar 13 '21 at 23:18