
I have an algebraic problem about vectors over the field GF(2). Let $v_1, v_2, \ldots, v_m$ be (0,1)-vectors of dimension $n$, with $m=n^{O(1)}$. Find a polynomial-time algorithm that outputs a (0,1)-vector $u$ of the same dimension such that $u$ is not the sum of any $(\log n)^{O(1)}$ vectors among $v_1,v_2, \ldots, v_m$. Addition of vectors is over the field GF(2), which has the two elements 0 and 1 ($0+1=1+0=1$, and $0+0=1+1=0$).

It is easy to see the existence of such a vector $u$ by a simple counting argument: there are at most $\sum_{i \le k} \binom{m}{i} = 2^{(\log n)^{O(1)}}$ sums of $k = (\log n)^{O(1)}$ or fewer of the vectors, far fewer than the $2^n$ candidates for $u$. Can we find $u$ in polynomial time? It is trivial to find $u$ in exponential time. I will send a $200 check as an award for the first correct solution.
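Concretely, GF(2) vector addition is bitwise XOR once vectors are encoded as integers. A minimal Python sketch, with the counting bound as a helper (all values illustrative):

```python
from math import comb

# GF(2) addition of (0,1)-vectors is coordinatewise XOR, so a vector of
# dimension n can be stored as an n-bit integer and added with ^.
v1 = 0b1011  # the vector (1,0,1,1), n = 4
v2 = 0b1101  # the vector (1,1,0,1)
assert v1 ^ v2 == 0b0110  # (1,0,1,1) + (1,1,0,1) = (0,1,1,0) over GF(2)

# Counting bound: there are at most sum_{i<=k} C(m, i) distinct sums of
# at most k of the m vectors -- 2^polylog(n) when m = n^O(1) and
# k = polylog(n), far fewer than the 2^n candidate vectors u.
def sums_upper_bound(m: int, k: int) -> int:
    return sum(comb(m, i) for i in range(k + 1))
```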

Bin Fu
  • It seems vaguely related to the subset sum problem, which is NP-complete; however, that uses full integer sums instead of XOR. – vzn Feb 21 '12 at 22:12
  • Strangely, I have recently been trying to formulate and look at a similar problem. Try Section 13.5 of Stasys Jukna's book on Boolean function complexity; it looks like your question can be formulated in terms of the linear circuits in that chapter. – vzn Feb 21 '12 at 22:28
  • How about super-polynomial algorithms, i.e., $m^{\log n}$? – Dimitris Feb 21 '12 at 23:31
  • @Niel de Beaudrap: but the number of XORs you have to check is super-polynomial (roughly ${{m}\choose{\log(n)}}$), not polynomial. Isn't that a problem? – Dimitris Feb 21 '12 at 23:50
  • To extend vzn's remark: it would seem that nearly any vector satisfies your requirements, by the same counting argument. I imagine that you would also like a proof that a (perhaps randomly generated) vector is not contained in any subspace spanned by $\mathrm{polylog}(n)$ of the vectors: so your question is tantamount to showing that the problem of determining that a candidate vector $u$ does not belong to a subspace generated by some $f(n) \in \mathrm{polylog}(n)$ of the vectors $\mathbf v_j$ is in NP. – Niel de Beaudrap Feb 22 '12 at 00:29
  • @NieldeBeaudrap No, the problem you state is different (although it's still hard). I don't know why you think that your problem being in NP would imply an algorithm to find $u$. – david Feb 22 '12 at 15:39
  • Even an $O(n^{\log n})$-time deterministic algorithm for this problem would be very interesting. A random vector $u$ satisfies the requirement with high probability, but it is not known how to verify this in polynomial time. – Bin Fu Feb 22 '12 at 16:16
  • @david: I'm not saying that this problem is the same, exactly: but because suitable vectors u should be extremely common (by the same counting argument hinted at by the OP), randomly guessing a vector u followed by checking whether it works should be sufficient, with very few repetitions. Do you disagree? – Niel de Beaudrap Feb 22 '12 at 17:40
  • @NieldeBeaudrap I agree that what you describe would give a randomized algorithm (although not a deterministic one), but I don't see what that has to do with being in NP. Also, the "checking" you need is probably a much harder problem. – david Feb 22 '12 at 19:43
  • OK, here is a very simple quasi-polynomial algorithm (see the sketch after these comments). Let $m=n^r$ and $\log^{O(1)}(n)=\log^t(n)$, for simplicity. Then, for each of the $S={{m}\choose{\log^t(n)}}=O\left(m^{\log^t(n)}\right)$ subsets of $\log^t(n)$ vectors, generate their XOR. Now you have a list of $S$ binary vectors. Find a vector in their nullspace (polynomial in $S$), i.e., total complexity $\mathrm{poly}(S)$, quasi-polynomial in $m$. If you want to avoid the nullspace calculation, you could transform all $S$ vectors into integers, then sort them, then pick one that is not in that list, and then transform this number back to a vector. Again quasi-polynomial time. – Dimitris Feb 23 '12 at 03:34
  • The random approach is to verify the randomly drawn vector by checking it against the $S$ XOR vectors, again in quasi-polynomial time. – Dimitris Feb 23 '12 at 03:38
  • A reformulation, I do not know if this can help. If $v=(v_1,\dots,v_n)$, let $\bar x^v=x_1^{v_1}x_2^{v_2}\dots x_n^{v_n}$. Then, $v+v'$ corresponds to $\bar x^v\bar x^{v'}\mod\langle x_1^2-1,\dots,x_n^2-1\rangle$. Consider now the $m$ vectors $v_1$, ..., $v_m$, and the corresponding monomials $\bar x^{v_1},\dots,\bar x^{v_m}$. Let $p(\bar x)=\sum_i\bar x^{v_i}$. Then, you are looking for a monomial which does not appear in $p(\bar x)^{\log^{\mathcal O(1)} n}\mod\langle x_i^2-1\rangle$. – Bruno Feb 23 '12 at 09:19
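A minimal Python sketch of the sort-and-avoid variant from the comments above, with vectors encoded as $n$-bit integers (parameter values illustrative):

```python
from itertools import combinations

def find_u(vectors: list[int], k: int) -> int:
    """Return an integer encoding an n-bit vector that is not the XOR
    of any k or fewer of `vectors` (given as n-bit integers)."""
    sums = {0}  # XOR of the empty subset
    for size in range(1, k + 1):
        for subset in combinations(vectors, size):
            acc = 0
            for v in subset:
                acc ^= v
            sums.add(acc)
    # Sort the S subset XORs and return the smallest missing integer;
    # since S is quasi-polynomial for k = polylog(n), this avoids
    # scanning all 2^n candidates.
    candidate = 0
    for s in sorted(sums):
        if s == candidate:
            candidate += 1
        elif s > candidate:
            break
    return candidate

# Example: returns 0b0001, which is not the XOR of at most 2 of the inputs.
print(find_u([0b1011, 0b1101, 0b0110], k=2))
```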

1 Answer


There seems to be a typo; I assume you mean to find $u \in \{0,1\}^n$ which is not the sum of $(\log n)^{O(1)}$ vectors among $v_1,\dots, v_m$ (not $v_n$).

It's not clear to me whether any constant in the exponent of $(\log n)^{O(1)}$ works for you. If you can settle for sums of fewer than $\log m$ vectors, maybe there's something to be done. But if you want this quantity to be $(\log m)^{1+\delta}$, then I think it is quite hard (I have been working on this problem for a long time).

Still, you may be interested to know that this is an instance of the Remote Point Problem of Alon, Panigrahy and Yekhanin ("Deterministic Approximation Algorithms for the Nearest Codeword Problem") for certain parameters. Let $m > n$ and let $v_1,\dots,v_m$ be the columns of the parity check matrix $H$ of a linear code $C \subseteq \{0,1\}^m$ of dimension $d = m - n$ (if this matrix didn't have full rank, the problem would be trivial). Then your problem is equivalent to finding $w \in \{0,1\}^m$ that is $(\log n)^{O(1)}$-far from the code and outputting $u = Hw$: any $x$ with $Hx = u$ differs from $w$ by a codeword, so $u$ is a sum of at most $t$ of the $v_i$ exactly when $w$ is within distance $t$ of $C$. This setting of parameters, where the dimension is very close to $m$, is not studied in the paper; they can only achieve remoteness $\log m$ up to dimension $d = cm$ for some constant $c$. In fact, I don't think we know of any polynomial-sized certificate that allows us to prove that some vector is $\omega(\log m)$-far from a space of dimension $\Omega(m)$, let alone an algorithm to find such a vector.
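A brute-force sanity check of this equivalence on a tiny instance, with $H$ given by its columns as bit-masks (the instance is illustrative, and the enumeration is only feasible for very small $m$):

```python
from itertools import product

def syndrome(H_cols: list[int], x: tuple[int, ...]) -> int:
    """Compute Hx over GF(2), with H given by its m columns (n-bit ints)."""
    acc = 0
    for col, bit in zip(H_cols, x):
        if bit:
            acc ^= col
    return acc

def min_weight_preimage(H_cols: list[int], u: int) -> int:
    """min{ wt(x) : Hx = u }: the least number of columns summing to u.
    This equals dist(w, C) for any w with Hw = u, where C = ker H.
    Brute force over all 2^m choices of x."""
    m = len(H_cols)
    return min(sum(x) for x in product((0, 1), repeat=m)
               if syndrome(H_cols, x) == u)

# Tiny instance: u is a sum of at most t of the columns exactly when
# min_weight_preimage(H_cols, u) <= t.
H_cols = [0b001, 0b010, 0b100, 0b111]  # n = 3, m = 4, full rank
assert min_weight_preimage(H_cols, 0b111) == 1  # u = v_4 itself
assert min_weight_preimage(H_cols, 0b011) == 2  # u = v_1 + v_2, no single column works
```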

Another connection is with learning parities in the mistake-bound model. If one can efficiently learn $(\log n)^{O(1)}$-sparse parities (defined on $\{0,1\}^m$) with mistake bound strictly less than $n$, then one can set arbitrary values to the first $n - 1$ bits of $u$ and "force a mistake" on the last bit by setting it to the opposite of the value predicted by the learner. This seems much stronger, though.
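A sketch of the forcing idea, assuming a hypothetical online learner object with `predict`/`update` methods (the interface and the bookkeeping below are assumptions, not from the post); forcing a mistake on every bit suffices under a mistake bound strictly below $n$:

```python
def adversarial_u(rows: list[tuple[int, ...]], learner) -> list[int]:
    """Build u bit by bit, always labelling opposite to the learner.

    rows[j] is the j-th row of the n x m matrix whose columns are the
    v_i, so u is a sum of at most k of the v_i exactly when the labeled
    examples (rows[j], u[j]) are consistent with a k-sparse parity.
    `learner` is a hypothetical online learner of k-sparse parities
    exposing predict(x) -> {0, 1} and update(x, label).  If u were such
    a sum, the learner would have made n mistakes on a realizable
    sequence, contradicting a mistake bound strictly below n.
    """
    u = []
    for x in rows:
        prediction = learner.predict(x)
        u.append(1 - prediction)          # force a mistake on this bit
        learner.update(x, 1 - prediction)
    return u
```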

The problem is also related to separating EXP from certain reductions to sparse sets.

david
  • Thanks for pointing out the typo. The last “v_n” should be “v_m”. Hope someone will fix it. Your answer contains helpful information. +1 – Bin Fu Feb 22 '12 at 16:19