Complexity of minimising polynomial formula size

Question

Let $f(x_1,\dots,x_n)$ be a degree-$d$ polynomial in $n$ variables over $\mathbb{F}_2$, where $d$ is constant (say 2 or 3). I would like to find the smallest formula for $f$, where "formula" and "formula size" are defined in the obvious way (e.g. the smallest formula for the polynomial $x_1 x_2 + x_1 x_3$ is $x_1(x_2+x_3)$).

What is the complexity of this problem - is it NP-hard? Does the complexity depend on $d$?

[ More formally, a formula (aka "arithmetic formula") is a rooted binary tree, each of whose leaves is labelled with either an input variable or the constant 1. All the other vertices of the tree are labelled with $+$ or $\times$. The size of the formula is the number of leaves used. The formula computes a polynomial recursively: $+$ vertices compute the sum of their children over $\mathbb{F}_2$, $\times$ vertices compute the product. ]
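
The definition above can be sketched in a few lines of Python. This is only an illustration, not part of the original question: the names `Leaf`, `Gate`, `size` and `evaluate` are hypothetical, and the brute-force comparison over all inputs is just a sanity check on the example $x_1 x_2 + x_1 x_3 = x_1(x_2+x_3)$, not a minimisation algorithm.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Leaf:
    label: Union[int, str]  # the constant 1, or a variable name such as "x1"


@dataclass(frozen=True)
class Gate:
    op: str  # "+" or "*"
    left: "Formula"
    right: "Formula"


Formula = Union[Leaf, Gate]


def size(f):
    """Formula size = number of leaves of the binary tree."""
    if isinstance(f, Leaf):
        return 1
    return size(f.left) + size(f.right)


def evaluate(f, assignment):
    """Evaluate the formula over F_2 on a dict {variable name: 0 or 1}."""
    if isinstance(f, Leaf):
        return 1 if f.label == 1 else assignment[f.label]
    a = evaluate(f.left, assignment)
    b = evaluate(f.right, assignment)
    return (a + b) % 2 if f.op == "+" else (a * b) % 2


x1, x2, x3 = Leaf("x1"), Leaf("x2"), Leaf("x3")
expanded = Gate("+", Gate("*", x1, x2), Gate("*", x1, x3))  # x1*x2 + x1*x3
factored = Gate("*", x1, Gate("+", x2, x3))                 # x1*(x2 + x3)

# Both formulas agree on every input in F_2^3.
agree = all(
    evaluate(expanded, dict(zip(("x1", "x2", "x3"), bits)))
    == evaluate(factored, dict(zip(("x1", "x2", "x3"), bits)))
    for bits in product((0, 1), repeat=3)
)
```

Here `size(expanded)` counts four leaves and `size(factored)` counts three, matching the example in the question.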

Can't we reduce polynomial identity testing to this problem? — Kaveh, Sep 13 '11 at 12:48
I guess there may be a connection, but I don't immediately see it - in particular because of the constraint on the degree. Besides, if the problem is more difficult than polynomial identity testing, it would be interesting to know how much more difficult. — Ashley Montanaro, Sep 13 '11 at 13:26
In your case, how is the number of gates ($+$s, and $\times$s) in the formula related to the actual formula size? For $d=2$, the construction in Ehrenfeucht and Karpinski 90 seems to be relevant (see 2XOR paragraph) for the "gate"-formula size, but I have to think about it longer. — Alessandro Cosentino, Sep 14 '11 at 12:49
As the formula is a binary tree, the definition of formula size I've used here (number of leaves) is equal to the number of gates (internal vertices) plus one. But I'd be interested in any results for any other sensible definition of formula size too. I'm not sure I see a connection to the results of Ehrenfeucht and Karpinski, as these are about the complexity of counting solutions, rather than minimising formula size... — Ashley Montanaro, Sep 14 '11 at 15:10
In order to count the number of zeros, they first transform the formula to an equivalent one, which I recall being minimum in terms of multiplications and additions. I don't have a proof of this minimality, though. Again, this would answer only the case $d=2$. — Alessandro Cosentino, Sep 14 '11 at 16:02
I have asked a related question on MathOverflow expressed in terms of graphs. For the case $d = 2$, the two problems have related answers. — Niel de Beaudrap, Oct 28 '11 at 14:48

score 7 · Answer 1 · answered Sep 14 '11 at 15:48

You can reduce the co-NP-Complete TAUTOLOGY problem (given a Boolean formula, is it a tautology?) to the problem of minimizing formula size (since a formula is a tautology iff it's equivalent to TRUE). Moreover, TAUTOLOGY for 3DNFs (analogously to SAT for 3CNFs) is co-NP-Complete.

Dana Moshkovitz

I thought about the same approach, but I could not make the degree of the resulting polynomial bounded by a constant. Is that possible? – Tsuyoshi Ito Sep 14 '11 at 15:56
As I understand the question, $f$ should be computed as a polynomial not as a function. Maybe some clarification is needed. – Markus Bläser Sep 14 '11 at 15:57
There is a probabilistic reduction from 3SAT to checking, given a deg-3 polynomial over GF(2), whether it has a zero [by looking at random linear combinations of the clauses], and then from this to checking, given a deg-3 poly over GF(2), whether it is all-zero [by subtracting the poly from 1]. – Dana Moshkovitz Sep 14 '11 at 17:31
Thanks! Do you have any idea what the situation is for degree 2 polynomials? Also (though this is probably very dense) I am struggling to see how a degree 3 polynomial over GF(2), written in standard form, can be all-zero without being the zero polynomial. To be clear, I am imagining that the input to my problem is a description of the polynomial itself, rather than a description of a circuit computing the polynomial. – Ashley Montanaro Sep 14 '11 at 18:50
I had been overlooking the random-subset technique. Thanks for the explanation! – Tsuyoshi Ito Sep 14 '11 at 21:33
1. I'm not sure about the deg-2 case. I would guess that if you think about it for a little and you can't argue it is equivalent to 2SAT, it's probably hard. (I don't have time to think about it now - I have a class to prepare for tomorrow :-)). 2. The polynomial $x^2 - x$ is not identically zero, but it is all-zero over GF(2). – Dana Moshkovitz Sep 15 '11 at 01:31
Thanks again for your reply. I'm still not convinced about the all-zero thing, though; it seems to me that any n-variate polynomial over GF(2) with poly(n) terms can easily be transformed into a standard form where it is obvious whether the polynomial is zero or not, just by making the substitution $x^k \rightarrow x$ and collecting terms. – Ashley Montanaro Sep 15 '11 at 07:01

Indeed if you make it multilinear as you describe, a polynomial evaluates to zero on every input iff it is the zero polynomial. One proof: Select a non-zero monomial M of minimal degree. Set to zero all other variables. The only surviving monomial is M. By setting the vars in M to 1 you get a non-zero output. – Manu Sep 15 '11 at 12:17
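
The substitution $x^k \rightarrow x$ and the minimal-degree argument in the two comments above can be sketched as follows. This is a minimal illustration under stated assumptions: a polynomial is given as a list of monomials, each monomial as a tuple of variable names with repetition for powers, and the function names `multilinearize`, `nonzero_witness` and `evaluate_poly` are hypothetical.

```python
from collections import Counter


def multilinearize(monomials):
    """Multilinear normal form over GF(2).

    Replacing x^k by x turns each monomial into a set of variables;
    monomials that occur an even number of times cancel.
    """
    counts = Counter(frozenset(m) for m in monomials)
    return {m for m, c in counts.items() if c % 2 == 1}


def nonzero_witness(poly, variables):
    """Return an input on which a non-zero multilinear polynomial is 1.

    Pick a monomial M of minimal degree, set its variables to 1 and all
    other variables to 0: every other monomial contains a variable
    outside M, so only M survives. Returns None for the zero polynomial.
    """
    if not poly:
        return None
    m = min(poly, key=len)
    return {v: (1 if v in m else 0) for v in variables}


def evaluate_poly(poly, assignment):
    """Evaluate a multilinear polynomial (set of frozensets) over GF(2)."""
    return sum(all(assignment[v] for v in m) for m in poly) % 2


# x^2 - x (which equals x^2 + x over GF(2)) collapses to the zero polynomial.
p = multilinearize([("x", "x"), ("x",)])

# x1*x2 + x1*x3 + x2^2*x3 becomes x1*x2 + x1*x3 + x2*x3.
q = multilinearize([("x1", "x2"), ("x1", "x3"), ("x2", "x2", "x3")])
w = nonzero_witness(q, ["x1", "x2", "x3"])
```

On these inputs `p` is the empty set, while `w` is an assignment on which `q` evaluates to 1.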

The mistake in my argument seems to be that the reduction from Boolean formulas to polynomials inherently produces an ensemble of polynomials, rather than a single polynomial. I'll try to think about the question when I have some more time. – Dana Moshkovitz Sep 17 '11 at 14:52

Klim · Answer 2 · 2011-09-21T19:35:05.903

4

Not exactly the answer but hopefully helps:

This question should be NP-hard already for $d=2$ if you want the minimal formula for $n$ polynomials rather than just one. The proof is as follows. There is a one-to-one correspondence between sets of $n$ bilinear forms (polynomials of the type $\sum a_{ij}x_iy_j$) and order-3 tensors, i.e. elements of $F_2^n\otimes F_2^n\otimes F_2^n$, such that the rank of the tensor is exactly the multiplicative complexity of the $n$ bilinear forms.

It is known that computing the rank of an order-3 tensor is an NP-hard problem (approximating tensor rank is probably NP-hard as well). Thus computing the multiplicative complexity of $n$ bilinear forms is NP-hard.
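
One direction of this correspondence can be sketched in Python. The sketch assumes the convention $f_k(x,y) = \sum_{i,j} T_{ijk} x_i y_j$ and uses hypothetical function names; it only illustrates that a decomposition of $T$ into $R$ rank-one terms $u \otimes v \otimes w$ lets all $n$ forms be computed with $R$ products of linear forms, and says nothing about the hardness claim itself.

```python
from itertools import product


def tensor_from_terms(terms, n):
    """Build T[i][j][k] = sum over terms of u[i]*v[j]*w[k], over GF(2)."""
    T = [[[0] * n for _ in range(n)] for _ in range(n)]
    for u, v, w in terms:
        for i, j, k in product(range(n), repeat=3):
            T[i][j][k] ^= u[i] & v[j] & w[k]
    return T


def forms_direct(T, x, y):
    """Evaluate f_k(x, y) = sum_{i,j} T[i][j][k] x_i y_j for every k."""
    n = len(T)
    return [
        sum(T[i][j][k] & x[i] & y[j] for i in range(n) for j in range(n)) % 2
        for k in range(n)
    ]


def forms_via_terms(terms, x, y, n):
    """Evaluate the same forms with one product of linear forms per term."""
    out = [0] * n
    for u, v, w in terms:
        lx = sum(a & b for a, b in zip(u, x)) % 2  # linear form in x
        ly = sum(a & b for a, b in zip(v, y)) % 2  # linear form in y
        prod = lx & ly                             # the one multiplication
        for k in range(n):
            out[k] ^= w[k] & prod                  # GF(2) linear combination
    return out


# Example with n = 2 and two rank-one terms:
#   f_0 = x0*(y0 + y1),   f_1 = x0*(y0 + y1) + x1*y1
n = 2
terms = [((1, 0), (1, 1), (1, 1)), ((0, 1), (0, 1), (0, 1))]
T = tensor_from_terms(terms, n)
consistent = all(
    forms_direct(T, x, y) == forms_via_terms(terms, x, y, n)
    for x in product((0, 1), repeat=n)
    for y in product((0, 1), repeat=n)
)
```

In this example both forms share the product $x_0(y_0+y_1)$, so two multiplications suffice for the pair.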

edited Sep 21 '11 at 19:35

answered Sep 21 '11 at 17:10


Thanks! This is an interesting perspective on the problem. – Ashley Montanaro Sep 21 '11 at 21:33
The following theorem helps to pass from many polynomials to one polynomial: let $S(f)$ be the complexity of a polynomial $f$; then the complexity of computing all its derivatives is at most $5S(f)$. Thus the complexity of the polynomials $f_1, f_2,\ldots,f_n$ is almost equal to the complexity of $z_1f_1+z_2f_2+\dots+z_n f_n$. – Klim Sep 25 '11 at 03:35
If you talk about tensor rank, then you are only counting multiplications but not additions. The case $d = 2$ and only one bilinear form is easy then, since one can compute the rank of one bilinear form, by using the structure theorems mentioned in Ramprasad's answer. (The proofs of these theorems are algorithmic, see the book by Lidl & Niederreiter.) – Markus Bläser Sep 26 '11 at 13:24

Jacques Carette · Answer 3 · 2011-09-20T12:11:30.967

Any answer to this depends hugely on the vocabulary you allow in the answer. If you want your answer in the same language as the input (i.e. as a polynomial), that leads to one set of answers, which is what other posters have been struggling with.

But if you allow your answer vocabulary to be enlarged, wonderful things can happen. You can see an example in symbolic vs automatic differentiation: in symbolic differentiation one only allows 'expressions', which tend to blow up pretty badly; in automatic differentiation, one allows straight-line programs in the answer (even if the input was an expression), which greatly helps to control the expression swell. For univariate polynomials, James Davenport and I have mused that you need to throw in cyclotomic polynomials as part of your basic vocabulary as well (see the references as to why these polynomials seem to be the only real source of blow-up, as well as the papers that show various reducibility results between polynomial problems and 3SAT).

In other words, if you allow yourself to vary what you consider an answer a little bit from the classical one, you may just be able to get a rather different answer, i.e. one with a much better complexity. It depends on your original motivation for asking the question, whether purely theoretical or with an application in mind, to decide whether this variation in vocabulary is acceptable to you. In the setting where James and I have been thinking about this (symbolic computation), adjusting the vocabulary to make the complexity drop is perfectly acceptable (though seldom done).

The question asks for the smallest arithmetic formula, which it then defines clearly. I am therefore not sure this reply is directly relevant. Also, the above answer by Dana Moshkovitz and associated comments don't correctly answer the question as already acknowledged in the comments. — Simd, Sep 18 '11 at 17:06
The point of my answer is that the OP might not realize that they are not necessarily asking the best question. The OP's question is asked in very classical terms, but if you allow a small deviation from that, you get quite different answers, which might have been quite unexpected. I understand your comment, but feel the downvote is a little harsh. — Jacques Carette, Sep 20 '11 at 00:46
Could you correct the first paragraph of your answer to make it clear the question has not been answered correctly yet? I was worried people might be misled. — Simd, Sep 20 '11 at 04:14

Ramprasad · Answer 4 · 2011-09-20T08:30:59.290

The general circuit/formula minimization is certainly harder than identity testing, since the minimum formula size of any identity is simply zero. As for how much harder, I don't have a definitive answer but perhaps the "reconstruction algorithms" studied in arithmetic circuits/formulae might be something along these lines.

In these cases, you are given a blackbox and told that it is a formula in some class $\mathcal{C}$ (say a depth $3$ circuit). The goal is to construct a representation of the blackbox in (something close to) $\mathcal{C}$. Typically, most reconstruction results assume blackbox identity tests for the class, randomness, and sometimes other kinds of queries. Such reconstruction algorithms are available for certain restricted classes of circuits but not all classes for which we know blackbox PITs. Shpilka and Yehudayoff have a fantastic survey on arithmetic circuits, and one of the chapters is entirely on reconstruction algorithms.

But in your case, you say $d$ is a constant and hence even if the input was given as a blackbox, there are reconstruction algorithms for sparse polynomials. So maybe the above comments are not too interesting in this case.

Also, in the case of $d=2$, there are structure theorems for quadratics. Under a linear transformation on the variables, any quadratic can be rewritten in the form $x_1x_2 + x_3x_4 + \dots + x_{2k-1}x_{2k} + \ell$. This property was used by Bogdanov and Viola for constructing PRGs for low degree polynomials (Lemma 17 of their paper).
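
As a small worked instance of this normal form (an illustration only, using the example polynomial from the question and new variables $y_i$ that are not in the original answer), the invertible change of variables $y_1 = x_1$, $y_2 = x_2 + x_3$, $y_3 = x_3$ over $\mathbb{F}_2$ gives

$$x_1x_2 + x_1x_3 = x_1(x_2 + x_3) = y_1y_2,$$

which is the form above with $k = 1$ and $\ell = 0$. The map is invertible because $x_2 = y_2 + y_3$.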

Thanks for your comments. Sadly, I don't see how to use these ideas to solve the original problem. — Ashley Montanaro, Sep 20 '11 at 12:33
