
In a recent related question (Bounds on the size of the smallest NFA for L_k-distinct), Noam Nisan asked for a method that gives a better lower bound on the size of an NFA than what we get from communication complexity bounds. What follows is a special version of that problem.

Suppose $L$ is a language over some $n$-letter alphabet whose words all have length $3$. Denote the size of the smallest NFA accepting $L$ by $NFA(L)$. Define an $n\times n^2$ matrix $M$ by $M(a;bc)=1$ if $abc\in L$ and $0$ otherwise. Denote by $COV(M)$ the minimal number of $1$-submatrices (submatrices containing only $1$'s) needed to cover all the $1$'s of $M$. (So $\log(COV(M))$ is the nondeterministic communication complexity of $M$.) It is easy to see that $NFA(L)\ge COV(M)$. If we similarly define a matrix $N$ by $N(ab;c)=1$ if $abc\in L$ and $0$ otherwise, then we also have $NFA(L)\ge COV(N)$.
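
For concreteness, here is a small Python sketch of these definitions (the helper names `build_matrices`, `nonempty_subsets` and `cov` are mine, chosen only for illustration; the exhaustive cover search is feasible only for very small $n$):

```python
from itertools import chain, combinations, product

def build_matrices(L, n):
    """Build M (rows indexed by letters a, columns by pairs bc) and
    N (rows indexed by pairs ab, columns by letters c) from a set L
    of length-3 words over the alphabet {1, ..., n}; entries are 0/1."""
    A = range(1, n + 1)
    M = {(a, (b, c)): int((a, b, c) in L) for a, b, c in product(A, repeat=3)}
    N = {((a, b), c): int((a, b, c) in L) for a, b, c in product(A, repeat=3)}
    return M, N

def nonempty_subsets(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, k) for k in range(1, len(xs) + 1))

def cov(matrix, rows, cols):
    """Brute-force COV: the minimum number of all-1 submatrices (rectangles
    R x C containing only 1's) that together cover every 1-entry.
    Exponential in the matrix size -- only meant to illustrate the definition."""
    ones = {(r, c) for r in rows for c in cols if matrix[(r, c)] == 1}
    if not ones:
        return 0
    # candidate 1-rectangles, each represented by the set of entries it covers
    rects = [
        {(r, c) for r in R for c in C}
        for R in nonempty_subsets(rows)
        for C in nonempty_subsets(cols)
        if all(matrix[(r, c)] == 1 for r in R for c in C)
    ]
    for k in range(1, len(ones) + 1):
        if any(set().union(*choice) == ones for choice in combinations(rects, k)):
            return k

# Toy example with n = 2 and words written as tuples of letters:
n = 2
A = range(1, n + 1)
L = {(1, 1, 1), (1, 2, 2), (2, 1, 1)}
M, N = build_matrices(L, n)
print(cov(M, list(A), list(product(A, repeat=2))))   # COV(M), here 2
print(cov(N, list(product(A, repeat=2)), list(A)))   # COV(N), here 2
```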

Is there an $L$ for which $NFA(L)>COV(M)+COV(N)?$

By what function of $COV(M)$ and $COV(N)$ can we upper bound $NFA(L)?$

domotorp

1 Answer


The bounds...

In fact we have $NFA(L) \ge Cov(M) + Cov(N)$; see Theorem 4 in (Gruber & Holzer 2006). For an upper bound, we have $2^{Cov(M)+Cov(N)} \ge DFA(L) \ge NFA(L)$; see Theorem 11 in the same paper.

...cannot be substantially improved

There can be a subexponential gap between $Cov(M)+Cov(N)$ and $NFA(L)$. The following example, and the proof of the gap, is adapted from a similar example in (Hromkovič et al. 2009) illustrating the limitations of 2-party protocols for proving lower bounds on nondeterministic state complexity:

We use the alphabet $[n]= \{\,1,2,\ldots,n\,\}$. Let $L=\{\,xyz\in [n]^3 \mid x=y \vee x\neq z \,\}$.

We first take care of $Cov(M)$. Observe that if $w=xyz$ with $y=z$, then $w \in L$: if $x=y$ then $w\in L$ by the first condition, and if $x\neq y$ then $x\neq z$ (since $z=y$), so again $w\in L$. Also, if $w=xyz$ with $y\neq z$, then $w\in L$ iff $x\neq z$ (note that $x=y$ together with $y\neq z$ already forces $x\neq z$). So we can write $L = L'\cup L''$, with $L'=\{\,xyz\in [n]^3 \mid y=z\,\}$ and $L''=\{\,xyz\in [n]^3 \mid y\neq z \wedge x \neq z \,\}$.
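
As a sanity check, this decomposition can be verified by exhaustive enumeration for small alphabets (a minimal sketch; `check_decomposition` is my name for it):

```python
from itertools import product

def check_decomposition(n):
    """Exhaustively verify L = L1 | L2 over the alphabet {1, ..., n}, where
    L  = {xyz : x == y or x != z},
    L1 = {xyz : y == z}            (the set L' above),
    L2 = {xyz : y != z and x != z} (the set L'' above)."""
    A = range(1, n + 1)
    L  = {w for w in product(A, repeat=3) if w[0] == w[1] or w[0] != w[2]}
    L1 = {w for w in product(A, repeat=3) if w[1] == w[2]}
    L2 = {w for w in product(A, repeat=3) if w[1] != w[2] and w[0] != w[2]}
    return L == L1 | L2

assert all(check_decomposition(n) for n in range(1, 7))
```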

Now consider the bipartite graphs $G' = (U',V',E')$ with $U'=[n]$, $V'= \{yz \in [n]^2 \mid y=z\}$, $E' = U' \times V'$, as well as $G'' = (U'',V'',E'')$ with $U''=[n]$, $V''= \{ yz \in [n]^2 \mid y\neq z\}$, $E'' = \{(x,yz) \mid x\neq z\}$, and $G = (U' \cup U'',V' \cup V'', E'\cup E'')$. Then a biclique edge cover for the graph $G$ gives rise to a covering of $M$ with $1$-monochromatic submatrices, and vice versa (Theorem 21 in Gruber & Holzer 2006).
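
For this particular $L$ the correspondence is easy to verify directly, since the edge set of $G$ is literally the set of $1$-entries of $M$; here is a small sketch (the function names `edges_of_G` and `ones_of_M` are mine):

```python
from itertools import product

def edges_of_G(n):
    """E' union E'' from the construction above: left vertices are letters x,
    right vertices are pairs yz."""
    A = range(1, n + 1)
    E1 = {(x, (y, z)) for x in A for y in A for z in A if y == z}
    E2 = {(x, (y, z)) for x in A for y in A for z in A if y != z and x != z}
    return E1 | E2

def ones_of_M(n):
    """The 1-entries of M for L = {xyz : x == y or x != z}."""
    A = range(1, n + 1)
    return {(x, (y, z)) for x, y, z in product(A, repeat=3) if x == y or x != z}

# Biclique covers of G correspond to 1-submatrix covers of M because the two
# objects have exactly the same "ones":
assert all(edges_of_G(n) == ones_of_M(n) for n in range(1, 7))
```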

A simple kernelization trick for computing a minimum biclique edge cover is to merge twin vertices, i.e., vertices with identical neighborhoods: first collapse the twin vertices from $U'$ into equivalence classes, then do the same in the resulting graph for the twin vertices from $V'$ (and likewise for $G''$). This step does not alter the minimum number of bicliques needed to cover all edges of the respective graph.
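
A minimal sketch of this kernelization step, assuming the bipartite graph is given as a set of (left, right) edge pairs (the function name `merge_twins` is mine):

```python
def merge_twins(edges):
    """One kernelization pass on a bipartite graph given as a set of
    (left, right) edge pairs: first collapse left vertices with identical
    neighborhoods onto one representative, then do the same on the right.
    As noted above, the minimum biclique edge cover size is unchanged."""
    def collapse(edges, left_side):
        neigh = {}
        for u, v in edges:
            a, b = (u, v) if left_side else (v, u)
            neigh.setdefault(a, set()).add(b)
        # group vertices of the chosen side into twin classes by neighborhood
        classes = {}
        for a, nb in neigh.items():
            classes.setdefault(frozenset(nb), []).append(a)
        rep = {a: cls[0] for cls in classes.values() for a in cls}
        return {(rep[u], v) if left_side else (u, rep[v]) for u, v in edges}
    return collapse(collapse(edges, True), False)

# Applied to G' this leaves a single edge; applied to G'' it leaves (a copy of)
# the crown graph on 2n vertices: left vertices x, right classes indexed by z,
# with an edge whenever x != z.
```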

The kernelization step collapses $G'$ into a graph with two vertices and a single edge, so the edges of $G'$ can be covered with a single biclique. Applying the kernelization step to $G''$ yields a crown graph on $2n$ vertices, whose bipartite dimension (the minimum biclique edge cover number) is known to be $\sigma(n)$, where $\sigma$ is the inverse function of the middle binomial coefficient, i.e. $\sigma(n)=\min\{\,k : \binom{k}{\lfloor k/2\rfloor}\ge n\,\}$ (De Caen et al. 1981). Notice that $\sigma(n)=O(\log n)$. Thus the bipartite dimension of $G$, which equals $Cov(M)$, is at most $1+\sigma(n)=O(\log n)$.
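
Here is a small sketch of $\sigma$, reading "middle binomial coefficient" as $\binom{k}{\lfloor k/2\rfloor}$, which is the reading under which the crown-graph statement holds (the function name `sigma` is mine):

```python
from math import comb

def sigma(n):
    """Smallest k with C(k, floor(k/2)) >= n, i.e. the inverse of the middle
    binomial coefficient; by the cited result of De Caen et al. (1981) this
    equals the bipartite dimension of the crown graph on 2n vertices."""
    k = 0
    while comb(k, k // 2) < n:
        k += 1
    return k

# Since C(k, k//2) >= 2^k / (k + 1), sigma(n) grows only logarithmically in n:
print([(n, sigma(n)) for n in (2, 3, 10, 100, 10**6)])   # e.g. sigma(10**6) == 23
```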

Now consider $Cov(N)$. Observe that if $w=xyz$ with $x=y$, then $w \in L$. If $x\neq y$, then $w\in L$ iff $x\neq z$. So we can write $L = L''' \cup L''''$ with $L'''=\{\,xyz\in [n]^3 \mid x=y\,\}$ and $L''''= \{\,xyz\in [n]^3 \mid x\neq y \wedge x\neq z\,\}$. Almost the same argument as above yields $Cov(N)\le 1+\sigma(n)$.

It remains to give a lower bound on the nondeterministic state complexity of $L$. Observe that $L$ contains all words of the form $xxx$ with $x\in[n]$. For each such word $xxx$, fix an accepting computation of a minimal NFA accepting $L$; let $p_x$ denote the state reached after reading the prefix $x$, and let $q_x$ denote the state reached after reading the prefix $xx$ of the input word $xxx$. Then all pairs $(p_x,q_x)$ must be pairwise distinct. For the sake of contradiction, assume $(p_x,q_x)=(p_y,q_y)$ for some $x\neq y$. Then we can stitch together an accepting computation on input $xyx$: the NFA reaches state $p_x=p_y$ after reading the prefix $x$, then state $q_y=q_x$ after reading the prefix $xy$, and from $q_x$ it accepts on reading the final letter $x$. But the string $xyx$ is not in $L$ (since $x\neq y$ and its first and last letters coincide), a contradiction. Since there are $n$ such pairs, the state set $Q$ of the NFA satisfies $|Q|^2 \ge n$, i.e. $|Q|\ge\sqrt{n}$. Thus, for large $n$, we obtain a subexponential separation between $Cov(M)+Cov(N)=O(\log n)$ and the nondeterministic state complexity $|Q|\ge\sqrt{n}$ of $L$.
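
The combinatorial core of this fooling argument can be checked mechanically for small $n$ (a sketch; `in_L` is my helper for membership in $L$):

```python
from itertools import product
from math import ceil, sqrt

def in_L(w):
    """Membership in L = {xyz : x == y or x != z}."""
    x, y, z = w
    return x == y or x != z

n = 25
# For x != y: xxx and yyy are accepted, but the "mixed" word xyx is not,
# so the state pairs (p_x, q_x) above must be pairwise distinct.
for x, y in product(range(1, n + 1), repeat=2):
    if x != y:
        assert in_L((x, x, x)) and in_L((y, y, y)) and not in_L((x, y, x))

# Hence |Q|^2 >= n: any NFA for L needs at least ceil(sqrt(n)) states.
print(ceil(sqrt(n)))   # here 5
```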


Hermann Gruber