
In a recent related question (Bounds on the size of the smallest NFA for L_k-distinct), Noam Nisan asked for a method that gives a better lower bound on the size of an NFA than what we get from communication complexity bounds. What follows is a special version of that problem.

Suppose $L$ is a language over some $n$-letter alphabet whose words all have length $3$. Denote the size of the smallest NFA accepting $L$ by $NFA(L)$. Define an $n\times n^2$ matrix $M$ by $M(a;bc)=1$ if $abc\in L$ and $0$ otherwise. Denote by $COV(M)$ the minimal number of $1$-submatrices (submatrices containing only $1$'s) needed to cover all the $1$'s of $M$. (So $\log(COV(M))$ is the nondeterministic communication complexity of $M$.) It is easy to see that $NFA(L)\ge COV(M)$. If we similarly define a matrix $N$ by $N(ab;c)=1$ if $abc\in L$ and $0$ otherwise, then we also have $NFA(L)\ge COV(N)$.
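
For concreteness, here is a small Python sketch of these definitions (the helper names `build_matrices`, `nonempty_subsets` and `cov` are mine, chosen only for illustration; the exhaustive cover search is feasible only for very small $n$):

```python
from itertools import chain, combinations, product

def build_matrices(L, n):
    """Build M (rows indexed by letters a, columns by pairs bc) and
    N (rows indexed by pairs ab, columns by letters c) from a set L
    of length-3 words over the alphabet {1, ..., n}; entries are 0/1."""
    A = range(1, n + 1)
    M = {(a, (b, c)): int((a, b, c) in L) for a, b, c in product(A, repeat=3)}
    N = {((a, b), c): int((a, b, c) in L) for a, b, c in product(A, repeat=3)}
    return M, N

def nonempty_subsets(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, k) for k in range(1, len(xs) + 1))

def cov(matrix, rows, cols):
    """Brute-force COV: the minimum number of all-1 submatrices (rectangles
    R x C containing only 1's) that together cover every 1-entry.
    Exponential in the matrix size -- only meant to illustrate the definition."""
    ones = {(r, c) for r in rows for c in cols if matrix[(r, c)] == 1}
    if not ones:
        return 0
    # candidate 1-rectangles, each represented by the set of entries it covers
    rects = [
        {(r, c) for r in R for c in C}
        for R in nonempty_subsets(rows)
        for C in nonempty_subsets(cols)
        if all(matrix[(r, c)] == 1 for r in R for c in C)
    ]
    for k in range(1, len(ones) + 1):
        if any(set().union(*choice) == ones for choice in combinations(rects, k)):
            return k

# Toy example with n = 2 and words written as tuples of letters:
n = 2
A = range(1, n + 1)
L = {(1, 1, 1), (1, 2, 2), (2, 1, 1)}
M, N = build_matrices(L, n)
print(cov(M, list(A), list(product(A, repeat=2))))   # COV(M), here 2
print(cov(N, list(product(A, repeat=2)), list(A)))   # COV(N), here 2
```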

Is there an $L$ for which $NFA(L)>COV(M)+COV(N)?$

By what function of $COV(M)$ and $COV(N)$ can we upper bound $NFA(L)?$

domotorp

1 Answer


The bounds...

In fact we have $NFA(L) \ge Cov(M) + Cov(N)$; see Theorem 4 in (Gruber & Holzer 2006). For an upper bound, we have $2^{Cov(M)+Cov(N)} \ge DFA(L) \ge NFA(L)$; see Theorem 11 in the same paper.

...cannot be substantially improved

There can be a subexponential gap between $Cov(M)+Cov(N)$ and $NFA(L)$. The following example, and the proof of the gap, is adapted from a similar example in (Hromkovič et al. 2009) illustrating the limitations of 2-party protocols for proving lower bounds on nondeterministic state complexity:

We use the alphabet $[n]= \{\,1,2,\ldots,n\,\}$. Let $L=\{\,xyz\in [n]^3 \mid x=y \vee x\neq z \,\}$.

We first take care of $Cov(M)$. Observe that if $w=xyz$ with $y=z$, then $w \in L$: if $x=y$ then $w\in L$ by the first condition, and if $x\neq y$ then $x\neq z$ (since $z=y$), so again $w\in L$. Also, if $w=xyz$ with $y\neq z$, then $w\in L$ iff $x\neq z$ (note that $x=y$ together with $y\neq z$ already forces $x\neq z$). So we can write $L = L'\cup L''$, with $L'=\{\,xyz\in [n]^3 \mid y=z\,\}$ and $L''=\{\,xyz\in [n]^3 \mid y\neq z \wedge x \neq z \,\}$.
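
As a sanity check, this decomposition can be verified by exhaustive enumeration for small alphabets (a minimal sketch; `check_decomposition` is my name for it):

```python
from itertools import product

def check_decomposition(n):
    """Exhaustively verify L = L1 | L2 over the alphabet {1, ..., n}, where
    L  = {xyz : x == y or x != z},
    L1 = {xyz : y == z}            (the set L' above),
    L2 = {xyz : y != z and x != z} (the set L'' above)."""
    A = range(1, n + 1)
    L  = {w for w in product(A, repeat=3) if w[0] == w[1] or w[0] != w[2]}
    L1 = {w for w in product(A, repeat=3) if w[1] == w[2]}
    L2 = {w for w in product(A, repeat=3) if w[1] != w[2] and w[0] != w[2]}
    return L == L1 | L2

assert all(check_decomposition(n) for n in range(1, 7))
```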

Now consider the bipartite graphs $G' = (U',V',E')$ with $U'=[n]$, $V'= \{yz \in [n]^2 \mid y=z\}$, $E' = U' \times V'$, as well as $G'' = (U'',V'',E'')$ with $U''=[n]$, $V''= \{ yz \in [n]^2 \mid y\neq z\}$, $E'' = \{(x,yz) \mid x\neq z\}$, and $G = (U' \cup U'',V' \cup V'', E'\cup E'')$. Then a biclique edge cover for the graph $G$ gives rise to a covering of $M$ with $1$-monochromatic submatrices, and vice versa (Theorem 21 in Gruber & Holzer 2006).
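
For this particular $L$ the correspondence is easy to verify directly, since the edge set of $G$ is literally the set of $1$-entries of $M$; here is a small sketch (the function names `edges_of_G` and `ones_of_M` are mine):

```python
from itertools import product

def edges_of_G(n):
    """E' union E'' from the construction above: left vertices are letters x,
    right vertices are pairs yz."""
    A = range(1, n + 1)
    E1 = {(x, (y, z)) for x in A for y in A for z in A if y == z}
    E2 = {(x, (y, z)) for x in A for y in A for z in A if y != z and x != z}
    return E1 | E2

def ones_of_M(n):
    """The 1-entries of M for L = {xyz : x == y or x != z}."""
    A = range(1, n + 1)
    return {(x, (y, z)) for x, y, z in product(A, repeat=3) if x == y or x != z}

# Biclique covers of G correspond to 1-submatrix covers of M because the two
# objects have exactly the same "ones":
assert all(edges_of_G(n) == ones_of_M(n) for n in range(1, 7))
```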

A simple kernelization trick for computing a minimum biclique edge cover is to merge twin vertices, i.e., vertices with identical neighborhoods: first collapse the twin vertices from $U'$ into equivalence classes, then do the same in the resulting graph for the twin vertices from $V'$ (and likewise for $G''$). This step does not alter the minimum number of bicliques needed to cover all edges of the respective graph.
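
A minimal sketch of this kernelization step, assuming the bipartite graph is given as a set of (left, right) edge pairs (the function name `merge_twins` is mine):

```python
def merge_twins(edges):
    """One kernelization pass on a bipartite graph given as a set of
    (left, right) edge pairs: first collapse left vertices with identical
    neighborhoods onto one representative, then do the same on the right.
    As noted above, the minimum biclique edge cover size is unchanged."""
    def collapse(edges, left_side):
        neigh = {}
        for u, v in edges:
            a, b = (u, v) if left_side else (v, u)
            neigh.setdefault(a, set()).add(b)
        # group vertices of the chosen side into twin classes by neighborhood
        classes = {}
        for a, nb in neigh.items():
            classes.setdefault(frozenset(nb), []).append(a)
        rep = {a: cls[0] for cls in classes.values() for a in cls}
        return {(rep[u], v) if left_side else (u, rep[v]) for u, v in edges}
    return collapse(collapse(edges, True), False)

# Applied to G' this leaves a single edge; applied to G'' it leaves (a copy of)
# the crown graph on 2n vertices: left vertices x, right classes indexed by z,
# with an edge whenever x != z.
```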

The kernelization step collapses $G'$ into a graph with two vertices and a single edge, so the edges of $G'$ can be covered with a single biclique. Applying the kernelization step to $G''$ yields a crown graph on $2n$ vertices, whose bipartite dimension (the minimum biclique edge cover number) is known to be $\sigma(n)$, where $\sigma$ is the inverse function of the middle binomial coefficient, i.e. $\sigma(n)=\min\{\,k : \binom{k}{\lfloor k/2\rfloor}\ge n\,\}$ (De Caen et al. 1981). Notice that $\sigma(n)=O(\log n)$. Thus the bipartite dimension of $G$, which equals $Cov(M)$, is at most $1+\sigma(n)=O(\log n)$.
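
Here is a small sketch of $\sigma$, reading "middle binomial coefficient" as $\binom{k}{\lfloor k/2\rfloor}$, which is the reading under which the crown-graph statement holds (the function name `sigma` is mine):

```python
from math import comb

def sigma(n):
    """Smallest k with C(k, floor(k/2)) >= n, i.e. the inverse of the middle
    binomial coefficient; by the cited result of De Caen et al. (1981) this
    equals the bipartite dimension of the crown graph on 2n vertices."""
    k = 0
    while comb(k, k // 2) < n:
        k += 1
    return k

# Since C(k, k//2) >= 2^k / (k + 1), sigma(n) grows only logarithmically in n:
print([(n, sigma(n)) for n in (2, 3, 10, 100, 10**6)])   # e.g. sigma(10**6) == 23
```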

Now consider $Cov(N)$. Observe that if $w=xyz$ with $x=y$, then $w \in L$. If $x\neq y$, then $w\in L$ iff $x\neq z$. So we can write $L = L''' \cup L''''$ with $L'''=\{\,xyz\in [n]^3 \mid x=y\,\}$ and $L''''= \{\,xyz\in [n]^3 \mid x\neq y \wedge x\neq z\,\}$. Almost the same argument as above yields $Cov(N)\le 1+\sigma(n)$.

It remains to give a lower bound on the nondeterministic state complexity of $L$. Observe that $L$ contains all words of the form $xxx$ with $x\in[n]$. For each such word $xxx$, fix an accepting computation of a minimal NFA accepting $L$; let $p_x$ denote the state reached after reading the prefix $x$, and let $q_x$ denote the state reached after reading the prefix $xx$ of the input word $xxx$. Then all pairs $(p_x,q_x)$ must be pairwise distinct. For the sake of contradiction, assume $(p_x,q_x)=(p_y,q_y)$ for some $x\neq y$. Then we can stitch together an accepting computation on input $xyx$: the NFA reaches state $p_x=p_y$ after reading the prefix $x$, then state $q_y=q_x$ after reading the prefix $xy$, and from $q_x$ it accepts on reading the final letter $x$. But the string $xyx$ is not in $L$ (since $x\neq y$ and its first and last letters coincide), a contradiction. Since there are $n$ such pairs, the state set $Q$ of the NFA satisfies $|Q|^2 \ge n$, i.e. $|Q|\ge\sqrt{n}$. Thus, for large $n$, we obtain a subexponential separation between $Cov(M)+Cov(N)=O(\log n)$ and the nondeterministic state complexity $|Q|\ge\sqrt{n}$ of $L$.
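
The combinatorial core of this fooling argument can be checked mechanically for small $n$ (a sketch; `in_L` is my helper for membership in $L$):

```python
from itertools import product
from math import ceil, sqrt

def in_L(w):
    """Membership in L = {xyz : x == y or x != z}."""
    x, y, z = w
    return x == y or x != z

n = 25
# For x != y: xxx and yyy are accepted, but the "mixed" word xyx is not,
# so the state pairs (p_x, q_x) above must be pairwise distinct.
for x, y in product(range(1, n + 1), repeat=2):
    if x != y:
        assert in_L((x, x, x)) and in_L((y, y, y)) and not in_L((x, y, x))

# Hence |Q|^2 >= n: any NFA for L needs at least ceil(sqrt(n)) states.
print(ceil(sqrt(n)))   # here 5
```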


Hermann Gruber