28

Let $f(x_1,\dots,x_n)$ be a degree $d$ polynomial in $n$ variables over $\mathbb{F}_2$, where $d$ is constant (say 2 or 3). I would like to find the smallest formula for $f$, where "formula" and "formula size" are defined in the obvious way (eg. the smallest formula for the polynomial $x_1 x_2 + x_1 x_3$ is $x_1(x_2+x_3)$).

What is the complexity of this problem - is it NP-hard? Does the complexity depend on $d$?

[ More formally, a formula (aka "arithmetic formula") is a rooted binary tree, each of whose leaves is labelled with either an input variable or the constant 1. All the other vertices of the tree are labelled with $+$ or $\times$. The size of the formula is the number of leaves used. The formula computes a polynomial recursively: $+$ vertices compute the sum of their children over $\mathbb{F}_2$, $\times$ vertices compute the product. ]

Ashley Montanaro
  • 2,013
  • 18
  • 21
  • 1
    can't we reduce polynomial identity testing to this problem? – Kaveh Sep 13 '11 at 12:48
  • 4
    I guess there may be a connection, but I don't immediately see it - in particular because of the constraint on the degree. Besides, if the problem is more difficult than polynomial identity testing, it would be interesting to know how much more difficult. – Ashley Montanaro Sep 13 '11 at 13:26
  • In your case, how is the number of gates ($+$s, and $\times$s) in the formula related to the actual formula size? For $d=2$, the construction in Ehrenfeucht and Karpinski 90 seems to be relevant (see 2XOR paragraph) for the "gate"-formula size, but I have to think about it longer. – Alessandro Cosentino Sep 14 '11 at 12:49
  • As the formula is a binary tree, the definition of formula size I've used here (number of leaves) is equal to the number of gates (internal vertices) plus one. But I'd be interested in any results for any other sensible definition of formula size too. I'm not sure I see a connection to the results of Ehrenfeucht and Karpinski, as these are about the complexity of counting solutions, rather than minimising formula size... – Ashley Montanaro Sep 14 '11 at 15:10
  • In order to count the number of zeros, they first transform the formula to an equivalent one, which I recall being minimum in terms of multiplications and additions. I don't have a proof of this minimality, though. Again, this would answer only the case $d=2$. – Alessandro Cosentino Sep 14 '11 at 16:02
  • I have asked a related question on MathOverflow expressed in terms of graphs. For the case $d = 2$, the two problems have related answers. – Niel de Beaudrap Oct 28 '11 at 14:48

4 Answers4

7

You can reduce the co-NP-Complete TAUTOLOGY problem (given a Boolean formula, is it a tautology?) to the problem of minimizing formula size (since a formula is a tautology iff it's equivalent to TRUE). Moreover, TAUTOLOGY for 3DNFs (analogously to SAT for 3CNFs) is co-NP-Complete.

Dana Moshkovitz
  • 10,979
  • 1
  • 50
  • 77
  • I thought about the same approach, but I could not make the degree of the resulting polynomial bounded by a constant. Is that possible? – Tsuyoshi Ito Sep 14 '11 at 15:56
  • 1
    As I understand the question, $f$ should be computed as a polynomial not as a function. Maybe some clarification is needed. – Markus Bläser Sep 14 '11 at 15:57
  • 3
    There is a probabilistic reduction from 3SAT to checking, given a deg-3 polynomial over GF(2), whether it has a zero [by looking at random linear combinations of the clauses], and then from this to checking, given a deg-3 poly over GF(2), whether it is all-zero [by subtracting the poly from 1]. – Dana Moshkovitz Sep 14 '11 at 17:31
  • 1
    Thanks! Do you have any idea what the situation is for degree 2 polynomials? Also (though this is probably very dense) I am struggling to see how a degree 3 polynomial over GF(2), written in standard form, can be all-zero without being the zero polynomial. To be clear, I am imagining that the input to my problem is a description of the polynomial itself, rather than a description of a circuit computing the polynomial. – Ashley Montanaro Sep 14 '11 at 18:50
  • I had been overlooking the random-subset technique. Thanks for the explanation! – Tsuyoshi Ito Sep 14 '11 at 21:33
  • I'm not sure about the deg-2 case. I would guess that if you think about it for a little and you can't argue it is equivalent to 2SAT, it's probably hard. (I don't have time to think about it now - I have a class to prepare for tomorrow :-)). 2. The polynomial x^2 - x is not identically zero, but it is all-zero over GF(2);
  • – Dana Moshkovitz Sep 15 '11 at 01:31
  • 2
    Thanks again for your reply. I'm still not convinced about the all-zero thing, though; it seems to me that any n-variate polynomial over GF(2) with poly(n) terms can easily be transformed into a standard form where it is obvious whether the polynomial is zero or not, just by making the substitution $x^k \rightarrow x$ and collecting terms. – Ashley Montanaro Sep 15 '11 at 07:01
  • 4
    Indeed if you make it multilinear as you describe, a polynomial evaluates to zero on every input iff it is the zero polynomial. One proof: Select a non-zero monomial M of minimal degree. Set to zero all other variables. The only surviving monomial is M. By setting the vars in M to 1 you get a non-zero output. – Manu Sep 15 '11 at 12:17
  • 1
    The mistake in my argument seems to be that the reduction from Boolean formulas to polynomials inherently produces an ensemble of polynomials, rather than a single polynomial. I'll try to think about the question when I have some more time. – Dana Moshkovitz Sep 17 '11 at 14:52