An NP-complete variant of factoring.

Question

Arora and Barak's book presents factoring as the following problem:

$\text{FACTORING} = \{\langle L, U, N \rangle \;|\; (\exists \text{ a prime } p \in \{L, \ldots, U\})[p | N]\}$

They add, further in Chapter 2, that removing the fact that $p$ is prime makes this problem NP-complete, although this is not linked to the difficulty of factoring numbers. It looks there can be a reduction from SUBSETSUM, but I got stuck finding it. Any better luck around here?

EDIT March 1st: The bounty is for $NP$-completeness proof using deterministic Karp (or Cook) reduction.

@turkistany: FWIW, I consider as bad style to put NP in italic, and as both bad style and bad LaTeX to put it in math mode (as the spacing between letters differ). — Michaël Cadilhac, Feb 07 '11 at 13:49
@Michaël, Sorry, reverted back to the original style. I got excited by your question :) — Mohammad Al-Turkistany, Feb 07 '11 at 13:52
A somewhat more complete description: On page 63 of the book, they write: Alon and Kilian (in personal communication) showed that in the definition of the language Factoring in Example 2.3, the condition that the factor p is prime is necessary to capture the factoring problem, since without this condition this language is NP-complete (for reasons having nothing to do with the hardness of factoring integers). — Sadeq Dousti, Feb 07 '11 at 14:35
Naturally, I searched for a paper by Alon and Kilian containing “factoring” and “NP-complete.” I found none (I guess that this is also natural in some sense). :( — Tsuyoshi Ito, Feb 07 '11 at 14:57
@Michael I actually like rendering classes as $\mathsf{NP}$ rather than NP. No ? — Suresh Venkat, Feb 07 '11 at 21:06
@Suresh: Yes, this is something I can root for, even though my typography professor always sheds a tear when multiple fonts (and serifs) appear in a single sentence :-) But sans-serif for classes is neat. — Michaël Cadilhac, Feb 07 '11 at 22:35
I'm thinking of a reduction from the Subset Product problem which is $NP$-complete. Given a set $S$ of integer numbers and integer $k$, Is there a subset $T \subset S$ such that product of all elements is $T$ equals $k$. ($k= \prod_{a_i \in T} a_i$).
Let $N= \prod_{a_i \in S} a_i$. Now, $S$ has a subset $T$ such that $k= \prod_{a_i \in T} a_i$ if and only if $[k|N]$. What is the flaw? — Mohammad Al-Turkistany, Feb 08 '11 at 09:29
So the reduction maps $(S, k) \rightarrow (k,k, \prod_{a_i \in S} a_i)$ — Mohammad Al-Turkistany, Feb 08 '11 at 11:03
Aren't you asuming that the $a_i$'s are prime numbers? But this is the angle on which I attacked the problem at first. — Michaël Cadilhac, Feb 08 '11 at 12:45
@Michaël, No. The definition of Subset Product problem does not require $a_i$'s to be prime numbers. — Mohammad Al-Turkistany, Feb 08 '11 at 12:50
What I meant is that if $S = {4}$ and $k = 2$, you'd output True, isn't it? (Or $S = {2,3,5,7,11,26}$ and $k = 13$ if you want to cipher the example.) This is why we tried to translate this problem using exponents (Subset Exp?), with no real luck. — Michaël Cadilhac, Feb 08 '11 at 13:12
In general, $k$ may not be prime. So, $a_i$'s and $k$ are not necessarily prime numbers. — Mohammad Al-Turkistany, Feb 08 '11 at 13:20
Sorry, I probably didn't make myself clear. I'm saying that using your reduction, $S={4}$ and $k = 2$ is a counter-example (as Subset Product answers "false" and the FactoringNonPrime would answer "true"). Am I misunderstanding something? — Michaël Cadilhac, Feb 08 '11 at 13:24
$S={ 4 }$ and $k=2$ is Yes example for Subset Product problem based on its definition. The certificate for Factoring problem does not need to be non-prime. The only requirement in both problems is that $a_i$'s, $k$, and $p$ are integers. — Mohammad Al-Turkistany, Feb 08 '11 at 13:35
Sorry to insist there, but the only subsets of $S = {4}$ being $S$ and $\emptyset$, the only products of all elements in them are $4$ and $0$, none of which is $2$. — Michaël Cadilhac, Feb 08 '11 at 13:39
Thanks. I got it. The reduction does not work. I'm gonna try to salvage it. :) — Mohammad Al-Turkistany, Feb 08 '11 at 13:50
Is it clear that Alon and Kilian proved this problem was NP-hard, and not just NP-hard under randomized reductions? — Peter Shor, Mar 01 '11 at 21:41
@Peter, It is not clear. the referenced textbook states that it is NP-complete (most probably using deterministic reduction). — Mohammad Al-Turkistany, Mar 02 '11 at 14:07
I would say so too, as (at least at this point in the book) only deterministic reduction is presented. But I very much like the last formulation of your solution, Peter; so I'll just wait a little bit more to see if someone can come up with a deterministic version, and otherwise, I'll gladly accept your answer. — Michaël Cadilhac, Mar 03 '11 at 17:30
@MohammadAl-Turkistany: I think that SUBSET PRODUCT is not strongly NP-complete, see the final EDIT of this question on cstheory — Marzio De Biasi, May 24 '13 at 09:18
@MohammadAl-Turkistany A new question was posted here http://mathoverflow.net/questions/212816/are-there-effective-small-intervals-in-which-primes-are-dense. Does this improve Shor's answer? — Turbo, Aug 02 '15 at 00:58
@MohammadAl-Turkistany Is there an NP complete variant of Discrete logarithm problem? — Turbo, May 01 '17 at 11:09
http://theory.cs.princeton.edu/complexity/book.pdf doesn't seem to be consistent with this question any more. The problem is defined without the restriction that the factor is prime and I can't see a claim that it is NP-complete. Has it been updated despite saying it is from 2007 on the front page? — , Jun 05 '19 at 11:07

Peter Shor · Accepted Answer · 2011-03-01T21:36:52.267

38

This is not quite an answer, but it's close. The following is a proof that the problem is NP-hard under randomized reductions.

There's an obvious relation to subset sum which is: suppose you know the factors of $N$: $p_1$, $p_2$, $\ldots$, $p_k$. Now, you want to find a subset $S$ of $p_1$ $\ldots$ $p_k$ such that

$$\displaystyle \log L \leq \sum_{p_i \in S} \log p_i \leq \log U.$$

The problem with trying to use this idea to show the problem is NP-hard is that if you have a subset-sum problem with numbers $t_1$, $t_2$, $\ldots$, $t_k$, you can't necessarily find primes in polynomial time such that $\log p_i \propto t_i$ (where by $\propto$, I mean approximately proportional to). This is a real problem because, since subset-sum is not strongly NP-complete, you need to find these $\log p_i$ for large integers $t_i$.

Now, suppose we require that all the integers $t_1$ $\ldots$ $t_k$ in a subset sum problem are between $x$ and $x(1+1/k)$, and that the sum is approximately $\frac{1}{2}\sum_i t_i$. The subset sum problem will still be NP-complete, and any solution will be the sum of $k/2$ integers. We can change the problem from integers to reals if we let $t'_i$ be between $t_i$ and $t_i+\frac{1}{10k}$, and instead of requiring the sum to be exactly $s$, we require it to be between $s$ and $s + \frac{1}{10}$. We only need to specify our numbers to around $4 \log k$ more bits of precision to do this. Thus, if we start with numbers with $B$ bits, and we can specify real numbers $\log p_i$ to approximately $B + 4 \log k$ bits of precision, we can carry out our reduction.

Now, from wikipedia (via Hsien-Chih's comment below), the number of primes between $T$ and $T+ T^{5/8}$ is $\theta(T^{5/8}/\log T)$, so if you just choose numbers randomly in that range, and test them for primality, with high probability get a prime in polynomial time.

Now, let's try the reduction. Let's say our $t_i$ are all $B$ bits long. If we take $T_i$ of length $3B$ bits, then we can find a prime $p_i$ near $T_i$ with $9/8B$ bits of precision. Thus, we can choose $T_i$ so that $\log T_i \propto t_i$ with precision $9/8\, B$ bits. This lets us find $p_i \approx T_i$ so that $\log p_i \propto t_i$ with precision $9/8\,B$ bits. If a subset of these primes multiplies to something close to the target value, a solution exists to the original subset sum problems. So we let $N=\Pi_i p_i$, choose $L$ and $U$ appropriately, and we have a randomized reduction from subset sum.

edited Mar 01 '11 at 21:36

answered Feb 07 '11 at 23:51

Peter Shor

24,763
4
93
133

3

I do not understand the reduction. For the subset sum problem to be NP-complete, the number must be given in binary. If we want integers whose logarithms are close to the numbers in an instance of the subset sum problem, we need exponentially many digits. How do you overcome this? – Tsuyoshi Ito Feb 08 '11 at 00:25
We don't need integers whose logarithms are close to the numbers in an instance of the subset sum problem, we only need integers whose logarithms are approximately proportional to the integers; the constant of proportionality can be chosen so the logarithms are actually much smaller than the integers givn in the instance.. I've edited my answer to try to explain this more clearly. – Peter Shor Feb 08 '11 at 02:50
2

@Peter: The assumption in number theory is called the Cramér's conjecture, which states that $p_{n+1} - p_n = O(\log^2 n)$, where $p_n$ is the n-th prime number. See the article prime gap also for reference. – Hsien-Chih Chang 張顯之 Feb 08 '11 at 03:24
1

(cont.) But prime number theorem already gave us an average $O(\log n)$ prime gap. Is that enough under the high probability requirement? No? – Hsien-Chih Chang 張顯之 Feb 08 '11 at 03:32
1

Cramér's conjecture is more powerful than I need, but the prime number theorem isn't quite enough. For this result, you would need that if you pick a number at random, with high probability you are in a gap of size $O(\log^c n)$ for some constant $c$. The prime number theorem doesn't imply that. – Peter Shor Feb 08 '11 at 04:31
@Peter: Ah, I see where the problem is. Thanks for the comment! – Hsien-Chih Chang 張顯之 Feb 08 '11 at 05:33
I'm thinking of a reduction from the Subset Product problem which is $NP$-complete. Given a set $S$ of integer numbers and integer $k$, Is there a subset $T \subset S$ such that product of all elements is $T$ equals $k$. ($k= \prod_{a_i \in T} a_i$).
Let $N= \prod_{a_i \in S} a_i$. Now, $S$ has a subset $T$ such that $k= \prod_{a_i \in T} a_i$ if and only if $[k|N]$. What is the flaw?
– Mohammad Al-Turkistany Feb 08 '11 at 09:18
@turkistany: This would work if the elements of $T$ were prime. However, if the elements of $T$ are prime, the problem is reducible to factoring. – Peter Shor Feb 08 '11 at 13:37
Actually, I think that by using more precision in the $p_i$, you can get by with the following number theory assertion: There is a constant $\alpha < 1$ such that with high probability ($1-1/\log^{c_1} T$), for a random integer $T$, there are $T^\alpha/\log^{c_2} T$ primes between $T$ and $T + T^\alpha$, for some constants $c_1$ and $c_2$. This has a chance of being proven already. – Peter Shor Feb 08 '11 at 16:28
2

@Peter: Yes, this version of assumption has been proved for $T$ large enough. The first result of this kind is shown by Hoheisel, and the best result due to wikipedia is the work by Baker, Harman and Pintz, with $\alpha = 0.525$, $c_1 = \infty$ (since it holds for probability 1) and $c_2 = 1$. – Hsien-Chih Chang 張顯之 Feb 08 '11 at 17:56
I had overlooked the “proportional to” part. Thanks! – Tsuyoshi Ito Feb 08 '11 at 18:03
1

Hsien-Chih: Thanks. I've now edited my answer to use your comment and eliminate any unproven number theory assumptions. – Peter Shor Mar 01 '11 at 21:43
1

@Peter Shor: According to this article http://rjlipton.wordpress.com/2009/06/09/computing-very-large-sums/ your answer gives the correct reduction (but there isn't any hint if they use also randomized reductions). – Marc Bury Mar 05 '11 at 01:33
3

Just came across this. I should note that I don't know what the original Kilian-Alon proof was. My only knowledge of the proof is from a communication with Noga who didn't remember the details of the original proof, and the proof he reconstructed was exactly this one. Note that it can also be described as a deterministic reduction under some strong number theoretic assumptions (e.g., that there is a prime in any interval of the form [x,x+polylog(x)] ). – Boaz Barak Sep 27 '12 at 04:18
BTW this is probably just nonsense due to the late hour, but perhaps a way to make the reduction deterministic is if (1) for some magical reason it is actually known that there is an almost prime in the interval [x,x+polylog(x)], (2) the subset sum problem is still NP hard if you consider solutions that rather than either taking or not taking each element, can take, say, an integer multiple of 1/10th of the element. – Boaz Barak Sep 27 '12 at 04:47
One further observation: $:$ the problem is NP-hard under zero-error randomized reductions. – Jul 01 '13 at 02:28
5

I just talked to Joe Kilian. He said that the proof that he and Alon came up with involved zero-error randomized reductions. As far as he is aware, deterministic reduction is still open unless you make some number-theoretic assumption, as Boaz Barak has already said. – Timothy Chow Dec 31 '14 at 03:04
@TimothyChow Boaz says two conditions have to be satisfied. Only one of which is number theoretic and the other condition is vaguely written as "the subset sum problem is still NP hard if you consider solutions that rather than either taking or not taking each element, can take, say, an integer multiple of 1/10th of the element". – Turbo Jan 01 '15 at 05:00
2

@Turbo : Boaz gives two conditions in his second comment, but his first comment states only one condition, which is number-theoretic. As for your other question, the point of my comment was that Joe Kilian does not know anything beyond what has already been posted here. – Timothy Chow Jan 02 '15 at 15:49
@RickyDemer is counting version of this problem #P complete assuming Cramer's conjecture? – Turbo May 20 '17 at 09:26
1

@Turbo : I'll start by being terse, so feel free to ask for elaboration. Reduce from 3-SAT. There are three (I only need there to be at least one) ways to [[choose w and k from {1,2}] and [choose t from {7,8}] and [choose three distinct elements a,b,c of {1,2,3,4,5,6,7}]] such that [[t is the only number congruent to t mod 9 which is a sum of a submultiset of {a,b,c,w,w,w}] and [t is not a sum of a subset of {a,b,c}] and [t-w,t-(2w),t-(3w) are each sums of subsets of {a,b,c} in a unique way] and [those ways each use exactly k elements of {a,b,c}]]. – May 21 '17 at 09:55
1

(continued ...) Such choices, together with one-coordinate-per-variable in which the literals are 1 and everything else is 0 and the target is 1, yield a strongly parsimonious reduction from 3-SAT to vector subset-multi-set sum such that non-solutions aren't even congruent mod 9 to solutions. To change that to vector subset sum, append another coordinate-per-propositional-variable in which the positive literals are 1 and everything else is 0 and the target is 1. From that, interpret the tuples as non-negative integers in base 9. (... continued) – May 21 '17 at 10:09
1

(continued ...) Summing subsets induced by solutions to the vector version will not result in carries so they will still be solutions to the integer version, and summing subsets for which that summing involves a carry will result in the sum differing from the target in the least-significant position from which a carry was propagated. – May 21 '17 at 10:12
@Ricky so the difficulty is because of carries? I do not follow your reduction that this problem is #P complete under Cramer's conjecture (also there are no primes involved in your reduction which maybe strange). – Turbo May 21 '17 at 22:53
@Turbo : Well, the difficulty in what I was describing is the carries. The primes are for the subset-sum to factorization-variant reduction, as described in Peter Shor's answer. – May 22 '17 at 01:29
@RickyDemer 'The primes are for the subset-sum to factorization-variant reduction, as described in Peter Shor's answer'. Do you provide a different reduction from #3SAT to #FACTORIZATION-VARIANT? – Turbo May 22 '17 at 05:42
1

@Turbo : No, I provide a strongly-parsimonious reduction from #3SAT to #-subset-sum such that all solutions of the latter have a known size, and then rely on Peter Shor's answer. – May 22 '17 at 05:47
@RickyDemer ok that makes sense. BTW what is strongly parsimonious? Is it in https://cs.stackexchange.com/questions/9744/definition-of-strongly-parsimonious-reduction? – Turbo May 22 '17 at 07:02
1

@Turbo : That's likely the more common definition, but everywhere I use the term, I mean the bijection's inverse is also efficiently computable. – May 22 '17 at 07:06
@RickyDemer Does the problem stay $NP$-hard under zero-error randomized reductions even if $N$ has all primes factors distinct with multiplicity $1$ and equal bit length? – Turbo Aug 21 '17 at 21:33
@RickyDemer would you know the answer? – Turbo Aug 24 '17 at 07:26
@PeterShor It seems that if this problem is in $P$ or in $BPP$ then $NP$ is in $P/poly$. Is this true? – VS. Aug 18 '20 at 11:19
@VS. Since BPP ⊆ P/poly, this is true. – Peter Shor Aug 18 '20 at 12:46
@PeterShor Sorry are you suggesting if this problem is in bpp then np is in bpp? I am aware of Adleman's result. I just noticed this problem does not have deterministic reduction. – VS. Aug 18 '20 at 13:38
1

@VS. BPP is a randomized class. You don't need a deterministic reduction. If an NP-complete problem can be reduced to a BPP problem via a randomized reduction, then NP ⊆ BPP, and since BPP ⊆ P/poly, we also get that NP ⊆ P/poly. – Peter Shor Aug 18 '20 at 14:31
@PeterShor Sorry I think (from user6973's comment and) your comment that this also proves $NP\subseteq ZPP$ if this problem is in $P$ since this problem in $P$ implies we have $ZPP$ reduced an $NP$ complete problem to a $P$ problem and that $NP=coNP$ since $ZPP$ is closed under complement. Correct? – VS. Aug 24 '20 at 09:44
@VS. That's correct. – Peter Shor Aug 24 '20 at 11:02

score 7 · Answer 2 · answered Mar 01 '11 at 21:09

I think that it is linked to the PCP theorem (in particular $NP = PCP[O(\log{n}), O(1)]$).

An excerpt from a Madhu's paper:

... The notion that a verifier can perform any polynomial time computation enriches the class of theorems and proofs considerably and starts to offer highly non-trivial methods of proving theorems. (One immediate consequence is that we can assume theorems/proofs/assertions/arguments are binary sequences and we will do so henceforth.) For instance, suppose we have an assertion $A$ (say the Riemann Hypothesis), and say we believe that it has proof which would fit within a 10,000 page article. The computational perspective says that given $A$ and this bound (10,000 pages), one can efficiently compute three positive integers $N, L, U$ with $L \leq U \leq N$ such that $A$ is true if and only if $N$ has a divisor between $L$ and $U$. The integers $N$, $L$, and $U$ will be quite long (maybe writing them would take a million pages), yet they can be produced extremely efficiently (in less than the amount of time it would take a printer to print out all these integers, which is certainly at most a day or two). (This specific example is based on a result due to Joe Kilian, personal communication) ...

... far beyond my complexity theory skills :-)

This is just another formulation that this problem is NP-complete. — Marc Bury, Mar 02 '11 at 15:53
@Marc ... mmm ... I think that it means: ${\langle L, U, N \rangle ;|; (\exists p \in {L, \ldots, U})[p | N]}$ is NP-complete because a polynomial reduction from the NP-complete problem SHORT PROOF exists ... — Marzio De Biasi, Mar 02 '11 at 17:47
The SHORT PROOFS problem in the paper is almost the same as the bounded halting problem. A reduction from the SHORT PROOFS problem would be most likely as messy as the typical proof of the NP-completeness of SAT, and therefore it is unlikely that the proof of the NP-completeness of this factor finding problem by Kilian constructs a reduction from the SHORT PROOFS problem directly. — Tsuyoshi Ito, Mar 03 '11 at 23:33

Mohammad Al-Turkistany · Answer 3 · 2017-04-08T18:39:51.120

This is an informal efficient deterministic reduction idea (and may be incomplete):

Fractran is a Turing-complete programming language. A suitably defined bounded-version of Fractran programs should be reducible to the language $ \{\langle L, U, M \rangle \;|\; (\exists \text{ a positive integer } p \in \{L, \ldots, U\})[p | M]\}$

For instance, a bounded-version could ask whether integer $M$ is produced in the output sequence of a Fractran program within certain number of steps (divisions) (i.e. $M=N_j * F_i$ ).

An NP-complete variant of factoring.

3 Answers3

Linked