
For many problems, more than one output is acceptable: consider, for instance, the problem of finding an assignment that satisfies a Boolean formula. If randomness buys us something, then it could be that computing some solution to such a problem is fundamentally easier than computing a canonical solution, so that a non-deterministic algorithm would outperform every deterministic algorithm. However, it is likely that randomness does not buy much.

Parallelism, on the other hand, certainly does improve the running time of algorithms. In general, if we have lots of processors but communication is expensive and synchronization is infeasible, we might expect non-deterministic algorithms to outperform deterministic ones. However, in many cases it seems possible to convert non-deterministic algorithms into deterministic ones under reasonable assumptions. For instance, one way to look for a satisfying assignment is to divide the search space among multiple processors and return the first solution that comes back. That is non-deterministic, but it can easily be made deterministic by putting an appropriate order on the solution space and, after receiving the first solution, waiting until one can be sure that no smaller solution exists.

Are there natural examples of problems where such a conversion does not work, i.e., problems that we can expect to solve efficiently using parallelism but where we cannot expect to find canonical solutions using parallelism?
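For concreteness, here is a minimal sketch of the conversion just described (Python, with a toy CNF encoding; the helpers `satisfies` and `search_block` and the block-splitting scheme are purely illustrative, not anything standard):

```python
# Minimal sketch: split the assignment space into blocks, search them
# concurrently, and compare the "first answer wins" output with a canonical
# variant that always returns the smallest satisfying assignment.
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product

def satisfies(formula, assignment):
    # Toy CNF evaluator: formula is a list of clauses, each clause a list of
    # signed 1-based variable indices (e.g. -2 is the literal "not x2").
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in formula)

def search_block(formula, n_vars, block, n_blocks):
    # Scan this block's share of the assignments in lexicographic order and
    # return the first satisfying one, or None.
    for i, assignment in enumerate(product([False, True], repeat=n_vars)):
        if i % n_blocks == block and satisfies(formula, assignment):
            return assignment
    return None

def first_solution(formula, n_vars, n_blocks=4):
    # Non-deterministic output: whichever block finishes first wins, so the
    # answer can vary from run to run with timing.
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        futures = [pool.submit(search_block, formula, n_vars, b, n_blocks)
                   for b in range(n_blocks)]
        for done in as_completed(futures):
            if done.result() is not None:
                return done.result()
    return None

def canonical_solution(formula, n_vars, n_blocks=4):
    # Deterministic output: wait for every block and return the smallest
    # satisfying assignment, no matter which block answered first.
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        results = list(pool.map(search_block, [formula] * n_blocks,
                                [n_vars] * n_blocks, range(n_blocks),
                                [n_blocks] * n_blocks))
    found = [r for r in results if r is not None]
    return min(found) if found else None

# (x1 or x2) and (not x1 or x3): the canonical version always prints the
# same assignment; first_solution may not.
print(canonical_solution([[1, 2], [-1, 3]], 3))
```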

gmr

1 Answer


I think you're mixing up parallelism with non-determinism. The key fact about parallelism is that any problem that can be solved in polynomial time using a polynomial number of processors can also be solved in polynomial time using a single processor, simply by simulating each processor's work one step at a time.
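As a toy illustration of that simulation (purely hypothetical: the generator-based "processors" and their summing task are just stand-ins for real parallel workers), a single sequential loop can advance each processor one step at a time, so p processors running for t steps cost on the order of p*t sequential steps:

```python
# Toy round-robin simulation: each "processor" is a generator that yields
# once per step of work and finally returns its result; a single sequential
# loop advances them in turn, so no real parallelism is needed.
def worker(block):
    # Hypothetical worker that sums its block, one element per step.
    total = 0
    for x in block:
        total += x
        yield  # one step of work
    return total

def simulate(processors):
    processors = list(processors)
    results = [None] * len(processors)
    while any(p is not None for p in processors):
        for i, p in enumerate(processors):
            if p is None:
                continue
            try:
                next(p)  # advance processor i by a single step
            except StopIteration as finished:
                results[i] = finished.value  # processor i is done
                processors[i] = None
    return results

print(simulate(worker(b) for b in ([1, 2, 3], [4, 5], [6])))  # [6, 9, 6]
```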

NC is the class of problems that can be solved in polylogarithmic time using a polynomial number of processors, i.e., problems that can be dramatically sped up through parallelism. Whether or not $NC=P$ is an open problem (that is, whether every problem that can be solved in polynomial time can be sped up this way). We know that NC is contained in P because of the previous paragraph. However, if you're looking for problems that probably can't be sped up through parallelism (I think this is what you're asking for?), see P-Complete problems: these can't be sped up through parallelism unless $NC=P$.

Non-determinism, on the other hand, is about what happens more generally if we run all computation paths in parallel and return "yes" if at least one path finds an answer that works. For NP-Complete problems, we're pretty sure that this requires an exponential number of processors to solve them in polynomial time, and an exponential amount of time to solve sequentially ("canonically"?) as well. In the example of subset sum (a known NP-Complete problem), you'd run one processor for each subset; if any processor finds a subset summing to 0, it returns it and you answer "yes", and if none does, you answer "no".
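A toy sequential stand-in for those parallel branches, using the zero-sum variant of subset sum (the helper name is made up for the example):

```python
# Illustrative only: try every non-deterministic branch (i.e. every subset)
# one after another, for the zero-sum variant of subset sum.
from itertools import combinations

def zero_subset(nums):
    for r in range(1, len(nums) + 1):
        for subset in combinations(nums, r):  # one branch per subset
            if sum(subset) == 0:
                return subset                 # an accepting branch: "yes"
    return None                               # no branch accepts: "no"

print(zero_subset([3, -7, 1, 6]))  # (-7, 1, 6)
```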

Whether or not non-determinism helps for problems whose solutions can be verified in polynomial time is the question of whether $P=NP$, and we're almost certain that $P\neq NP$; it just hasn't been proved, so we're not 100% sure. Still, there are many weaker assumptions that we're also pretty sure are true and that would have to fail first (for example, the polynomial hierarchy would collapse, and one-way functions, and with them much of cryptography and pseudorandom generators, could not exist).

Whether or not $NEXPTIME=EXPTIME$ is the exponential analogue of this question, and by a padding argument we know that if $P=NP$ then $NEXPTIME=EXPTIME$.
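Roughly, that padding argument goes as follows (a standard sketch, with the bound $2^{n^c}$ chosen just for concreteness): take any language $L$ in NEXPTIME, decided non-deterministically in time $2^{n^c}$, and define the padded language $L' = \{x10^{2^{|x|^c}} : x \in L\}$. On inputs of $L'$, the original exponential-time computation is polynomial in the padded input length, so $L'$ is in NP. If $P=NP$, then $L'$ has a deterministic polynomial-time algorithm, and we can decide whether $x \in L$ by padding $x$ and running that algorithm, which takes time polynomial in $2^{|x|^c}$, i.e. exponential in $|x|$; hence $L$ is in EXPTIME.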

Finally, you mentioned that "it is likely that randomness doesn't buy much." Formally, BPP is the class of problems that can be solved efficiently with a randomized algorithm, and you're right that we suspect $BPP=P$, but that's also something we haven't yet formally proved.

Phylliida
  • I already know all of the things you say in this answer, but I don't see how they respond to my question. One thing that might conceivably be confusing you is that I am using "nondeterministic" in the sense it is used in distributed computing; I certainly don't consider that if a problem can be solved in X time by a non-deterministic Turing machine then it can be solved in X time using a non-deterministic algorithm! Also, I wasn't identifying polynomial time with efficient; for large inputs in distributed computing, quadratic complexity would not be feasible. – gmr Oct 29 '14 at 01:50
  • "Canonical" just means that the algorithm as a whole is deterministic (though components of it may be non-deterministic): on the same input it will always produce the same solution, regardless of random delays that may affect the system. Sorry if I am using some wrong terminology here, please correct me! – gmr Oct 29 '14 at 01:53
  • Okay. Sorry about that. Specifically, you have randomness and non-determinism. These are two very separate things. Can you phrase your question in terms of those? – Phylliida Oct 29 '14 at 03:25
  • Non-determinism is talking about parallelism - specifically if we run many processors in parallel to get our answer. It is still deterministic in the sense that it always returns the same answer.

    Randomness, on the other hand, is talking about when you run an algorithm that can give different answers each time, according to some probability distribution. Most randomized algorithms return the right result with high probability, though, so we just run them as many times as we need to get the right result with as high a probability as we want.

    – Phylliida Oct 29 '14 at 03:26
  • Also, the reason we use polynomial time is that, almost always, once a polynomial-time algorithm for a problem is found, we can optimize it by continuing to tweak it until it is fast enough for use in industry.

    We also have things like average-case or worst-case runtime complexity, so we can meaningfully discuss how, for example, many NP-Complete problems can be solved in polynomial time on average, while the best known algorithms still take exponential time on some instances. Or, with other problems like factoring, average-case complexity is also exponential as far as we know.

    – Phylliida Oct 29 '14 at 03:30
  • Finally, since any system (besides maybe quantum ones) can be simulated efficiently by a Turing machine, we rarely worry about those random delays in specific implementations, because abstracting them away simplifies things and makes them easier to analyze. If you are worried about them, this may not be the best place to ask your question, because it wouldn't be on topic as a research-level question in theoretical computer science. CS Stack Exchange or even Stack Overflow might get you better responses if that's the case. – Phylliida Oct 29 '14 at 03:35
  • You might be right that I chose the wrong forum for this question, but I also failed to express it in a way that people understood, so maybe I should try to fix that before reasking someplace else. You write "Non-determinism is talking about parallelism - specifically if we run many processors in parallel to get our answer." That is not what I mean by non-determinism (and I would argue that NP and similar classes are badly named). Non-determinism has nothing intrinsically to do with parallelism. You can think of a non-deterministic Turing machine in the definition of NP as one that – gmr Oct 29 '14 at 16:09
  • can sometimes go different ways and magically makes the best choice at every such choice point. Non-determinism in a more normal sense would mean making a random choice at such choice points, or possibly even a choice controlled by an adversary. In a parallel system, the "choice" might be how long to delay before moving on. The final output of such a system will in general itself be non-deterministic, but it is possible to design algorithms that run on such systems but are deterministic, for instance using concurrency control techniques. – gmr Oct 29 '14 at 16:15
  • It's not the case that polynomial algorithms can always be made fast enough for industry. For instance, regular expressions of a fixed query length can be matched against a corpus in time linear in the length of the corpus. But that doesn't mean Google can let you use regular expressions in your searches. The linear time algorithm would require processing the entire web on each search. Since there are many searches, that is infeasible. Even for one search, it would be infeasible if the algorithm were quadratic instead of linear. – gmr Oct 29 '14 at 16:31
  • I guess it's true that which polynomial it is eventually matters; we just normally worry about exponential time because that matters more (almost no one will use an algorithm that's 2^n on the average case, while some people will be okay with a quadratic algorithm if it solves the problem they need and it doesn't need to be done in real time). – Phylliida Oct 29 '14 at 19:39
  • Either way, if you read through the first chapter of Sipser's text on deterministic and non-deterministic finite automata, you'll see that the meaning of non-determinism is well defined, even if it's not what you'd prefer. – Phylliida Oct 29 '14 at 19:42
  • You're right that you can also think of it as "magically choosing the right answer"; it's just that this is equivalent to trying all answers in parallel and picking the right one. Though maybe what's bothering you is that that can require an exponential number of parallel processors (or even more), so it's not really meaningful to talk about that. If so, that's fair. In general I agree that it might have made more sense to have non-determinism mean randomness, but that's not what it means, randomness means a very separate thing, and your using it incorrectly made your question hard to understand. – Phylliida Oct 29 '14 at 19:44
  • So in the context of these terms meaning what theoretical computer scientists have currently defined them as, what is your question? I would like to understand I just don't yet. – Phylliida Oct 29 '14 at 19:46
  • Consider the following problem: output a 1 (decimal) digit prime number. One correct solution is to generate random 1 digit numbers and output the first one that is prime. This solution will produce different outputs on different runs. But it is easy to modify the solution so that it doesn't have that property. For instance, you could generate the 1 digit numbers in order starting with 1, and then the output would always be 2. I was calling the first solution a "non-deterministic" algorithm and the second one a "deterministic" algorithm. Note that a deterministic algorithm can use – gmr Oct 29 '14 at 21:25
  • randomness. Given that pseudo-random number generators probably work about as well as true random number generators, it is plausible that if all randomness in the computation mechanism is under the control of the system designer, non-deterministic algorithms can always be made deterministic. In some parallel systems, the system can behave non-deterministically because of different processor speeds or random communication delays. (And this use of the term "non-deterministic" is totally standard.) I was interested in computational problems (like finding a satisfying assignment for a boolean – gmr Oct 29 '14 at 21:39
  • formula) where there seems to be a non-deterministic algorithm running on such a system that is much faster, for the same number of processors, than any deterministic algorithm. The boolean formula satisfying assignment problem isn't an example since even though the simplest algorithm is non-deterministic, it is easy to make a deterministic one with just a little overhead. I know my question is not fully precise since I have not specified the model exactly and said what counts as "much faster," but I wanted to leave room for different interesting results. – gmr Oct 29 '14 at 21:48
  • Okay, so I can see two questions in that: Are there any natural problems that we don't know how to speed up through parallelism, and are there any natural problems that we don't know how to solve quickly using a deterministic algorithm (with no randomness), but that we can solve quickly with randomness? Does that capture what you're asking? – Phylliida Oct 30 '14 at 01:18
  • Because the answer to the first one would be P-Complete problems, and the answer to the second one would be Polynomial Identity Testing and related equivalent problems, if it helps. – Phylliida Oct 30 '14 at 01:20
  • Also, as you said and is explained here, the existence of a good enough pseudo-random number generator does imply that we can convert any "random" algorithm into a deterministic one without randomness, similar to the way you converted finding a single-digit prime number probabilistically to finding a single-digit prime number without randomness. We don't have one yet though.

    Finally, I don't think the unpredictability in multithreading can really be used in any helpful way, since it would probably act as a worse pseudo-random number generator than the ones already in use.

    – Phylliida Oct 30 '14 at 01:30
  • You are still not understanding my question. The reason my question brings in randomness is not that it has anything to do with randomness possibly being useful. Really the issue is not randomness but non-determinism, but if a system is non-deterministic, you can typically model it using probability theory. The point is that you can design algorithms that have deterministic outputs despite the unpredictability in multithreading. For algorithms that work by searching a space of possible solutions, sending different parts of the search space to different processors, it is simple to make – gmr Oct 30 '14 at 05:52
  • them deterministic. For problems that have a binary output like P-Complete problems or Polynomial Identity Testing, it is not even possible to have non-deterministic behavior since there is only one right answer. But maybe there are problems that have many right answers, like outputting a satisfying assignment, and yet you can't trivially make them deterministic. – gmr Oct 30 '14 at 05:56
  • Polynomial Identity Testing can (as far as we know) only be solved efficiently using non-deterministic behavior - the algorithms just output the wrong result with some probability < 0.5, so if you run them over and over you'll eventually get the right answer with high probability. But it's still very "non-deterministic" using your definition of non-deterministic. – Phylliida Oct 30 '14 at 17:29
  • And the thing is that any algorithm that gives you the right answer using multithreading can also be made trivially deterministic, since multithreading is deterministic in the sense that it's just a program (the operating system) running other programs' instructions in turn according to some priority - fundamentally, if you understood what all the programs were doing, you would know the exact result of the system every time. Alternatively you can just run each parallel thread in a random order, though that does require you to use a pseudo-random number generator to get similar results (or some – Phylliida Oct 30 '14 at 17:32
  • other source of randomness), but as I was saying chances are doing that is going to give you a better random distribution of correct results (for a problem that has multiple correct solutions) than just running lots of threads and returning the first correct result. – Phylliida Oct 30 '14 at 17:37
  • Either way sorry if I still didn't get what you meant though. So are you looking for a natural problem that can't be solved without randomness, but has multiple satisfying answers? – Phylliida Oct 30 '14 at 17:38
  • Because any problem with multiple instances but only one answer for each instance can be extended to fit that requirement. Simply define the new (just as natural) problem as, given a set of instances of, say, polynomial identity testing, return true if any of them are equal to 0, otherwise return false. – Phylliida Oct 30 '14 at 21:30
  • Now we can solve this problem "non-deterministically" I think like you wanted by just solving each one in parallel, then the first thread that finds a polynomial equal to 0 will just return it and you'd be done, probably after halting the other threads. And if all return and none have been given a polynomial that equals 0, return false. The same kind of thing can be done for P-Complete problems, NP-Complete problems, etc. – Phylliida Oct 30 '14 at 21:31
  • Maybe this will help. We can think of a problem with multiple correct answers abstractly as a function f from N to the powerset of N. For instance, given Godel numberings of the boolean formulas over a countable alphabet and the boolean assignments over that alphabet, we could define f(n) to be the set of Godel numbers of assignments satisfying the formula with Godel number n. Then we can consider two different problems that a solver Alice could be asked to solve. Problem one: Alice announces a formula of Peano arithmetic phi(n,m) satisfying for all n there exists a unique m such that – gmr Oct 30 '14 at 22:00
  • phi(n,m) and for all n,m phi(n,m) implies m in f(n). Then a challenger Bob issues a challenge n. Then Alice produces the m such that phi(n,m). Problem two: Bob issues a challenge n. Then Alice produces an m in f(n). The question is about cases in which the first problem is significantly harder than the second. Clearly they do not arise if Alice's computational powers can be modeled using a Turing machine, and if plausible conjectures are true they do not arise if Alice's computational powers can be modeled using a Turing machine with access to a random oracle either. But if Alice's – gmr Oct 30 '14 at 22:05
  • computational resources are like Google's (i.e., a large distributed system that has great aggregate power but where machines can fail, messages can be delayed, etc.) then matters are not so clear. Maybe Google's parallelism could avail it nothing in its attempts to solve problem A (just like for P complete problems) but be extremely useful in solving problem B. – gmr Oct 30 '14 at 22:08
  • In other words (sorry, just in words that I'm more familiar with, because I don't have a very deep understanding of Godel numbering and Peano arithmetic), let N be the set of boolean formulas with multiple satisfying variable assignments, and let M be the set of variable assignments to those formulas (which may or may not satisfy any given formula $n\in N$). – Phylliida Oct 31 '14 at 00:56
  • Let phi(n, m) be the "restricting" function such that, given any $n\in N$ and $m\in M$, only returns true if m satisfies n, and additionally only returns true for one of those satisfying variable assignments. – Phylliida Oct 31 '14 at 00:56
  • Let problem A be the problem of Alice giving a restricting function phi to Bob, Bob returning an $n\in N$, and Alice trying to find an $m\in M$ that causes phi(n, m) to evaluate to true. – Phylliida Oct 31 '14 at 00:57
  • Let problem B be the problem of Bob simply giving Alice an $n\in N$, and Alice finding an $m\in M$ that satisfies that n. – Phylliida Oct 31 '14 at 00:58
  • Your question is about problems like this, specifically in cases where A is significantly more difficult to solve than B for some reason under some computational model: Turing machines, DFAs, massively parallel systems like google, etc. all being examples of differing relevant computational models. Does that capture the topic correctly? (without addressing what you're asking about that because I want to make sure I understand the topic first) – Phylliida Oct 31 '14 at 01:01
  • OK, sorry, I guess the jargon was not helpful; let me try to say the same thing with less jargon. In like complexity 101, I guess, you learn to think of a problem as a "language" over a finite alphabet (a set of strings over that alphabet). With problems that have multiple solutions, you should instead think of a problem as a function f that takes a string to a set of strings, the set of solutions. For our purposes, we may assume that for every string s, f(s) is non-empty. One problem is: given s, produce a member of f(s). Another problem is: first, give a definition of a function g that – gmr Oct 31 '14 at 05:00
  • takes each string s to a member of f(s). (You don't have to give an efficient algorithm for computing g at this stage, or even any algorithm at all.) But then, given s, you have to produce g(s). (So in fact you'd better choose a g that you are capable of computing under the constraints you are subject to!) The question is: are there some natural problems f and some natural complexity classes such that the first problem is in the class but the second problem is not? For multivalue versions of the most commonly discussed complexity classes, I think I can see why the answer is probably no. – gmr Oct 31 '14 at 05:17
  • But I thought that maybe in a model that is supposed to represent what can be done using distributed computing (and unfortunately I don't even know what formal models are used for this) the answer might be yes. Or at least I don't see why it couldn't be. In fact, the most obvious algorithms for many problems in distributed computing can give different outputs on different runs for the same input, so they don't solve the second problem. But often it looks like you can modify these algorithms in a way that doesn't slow them down much and makes them always give the same output on the same – gmr Oct 31 '14 at 05:24
  • input. (And always giving the same output on the same input -- what I was calling being "deterministic" -- is really equivalent to working as a solution to the "harder" problem. But I thought, since my old way of explaining the question wasn't working, this new way using the idea of requiring that the algorithm give a specification of what it will do in advance might help.) – gmr Oct 31 '14 at 05:27
  • The question seems to me like a pretty natural one though it might not be very deep (the answer might be obvious to an expert and sort of trivial). I didn't guess it would be so hard to explain the question in a way that people would understand. – gmr Oct 31 '14 at 05:32
  • Okay, that makes sense. I'd argue the three models are equivalent if mine was sufficiently generalized past just boolean circuits, but anyway, so I guess there's a few answers to your questions. – Phylliida Oct 31 '14 at 19:44
  • 1: I'd argue that even with a Turing machine, there are cases where the complexity class of the first problem is very different from the complexity class of the second problem. For example, if we're looking with a Turing machine for any path in a graph that goes through all nodes (but may go through a node twice), the second problem of finding any path can be solved in time linear in the number of nodes, assuming it's possible, while if the first one also restricts us to a specific path that doesn't go through the same node twice, it becomes NP-Complete, – Phylliida Oct 31 '14 at 19:49
  • if not more difficult, because it's looking for a specific answer, whereas the easier problem of finding whether such a path exists is already NP-Complete. – Phylliida Oct 31 '14 at 19:52
  • 2: Distributed computing has no inherent benefit because of those imperfections. Specifically, multithreading race conditions are the bane of many programmers, not something that makes it easier to solve problems. And if computers sometimes fail, they need to error-correct for that - they can't somehow use that failure to help them better solve problems. And again, any randomness made from those events has such biased/unknown probabilities that replacing that source of randomness with a pseudo-random number generator will always give better results. – Phylliida Oct 31 '14 at 19:54
  • Furthermore, it's a founding principle of computer science that anything that can be computed by any computing device (including massive parallelism) can also be computed by a Turing machine. And even in the parallel case, it's only a polynomial speedup. In the specific case you're thinking about, where Google would vastly distribute its resources to solve a specific problem, and then once one processor found a solution they could be done, that can also be simulated by a Turing machine. Let's say we have n processors, each with m space. – Phylliida Oct 31 '14 at 19:58
  • The Turing machine can either simulate them all one at a time, or - what would be more helpful - if it has m*n space it can simply take turns running each processor for one instruction, then running each processor for the next instruction, ..., basically simulating all processors simultaneously. This has the advantage of being able to return the right result once any processor has that result, at the cost of taking n times as much time, which is still only a polynomial slowdown. – Phylliida Oct 31 '14 at 20:00
  • However, you're right that if you have billions of processors/threads, this "polynomial slowdown" of taking a billion times as long is pretty significant. However, as I pointed out before, there is no inherent benefit in any imperfections that model has, so when we're designing algorithms for it and considering what problems it can help with, we assume the engineers of the system built in enough error correction to let us assume those imperfections don't exist - that instead we are just working with n parallel threads. – Phylliida Oct 31 '14 at 20:02
  • The formal model for a parallel computer is a multi-tape Turing machine. This is what we use to study the distinction between problems that can be parallelized effectively and problems that can't be. Anything that Google or any other massively parallel system can do, a multi-tape Turing machine with that many tapes can do as well. You could probably combine a RAM with a multi-tape Turing machine if you wanted the most accurate model, but a multi-tape Turing machine is good enough. – Phylliida Oct 31 '14 at 20:05
  • Back to your question then: because your easier problem of finding any solution can be much easier than the hard problem, and this is often the case (another example would be the bin packing problem: finding a specific packing that's close to optimal within some $\epsilon$ is very easy, while finding the optimal packing is NP-Hard for many instances of the problem), the situation doesn't really change with parallel computers. The easy problem is still easy, and the hard problem is still hard. – Phylliida Oct 31 '14 at 20:13
  • However, this is probably where it's relevant to discuss the multi-tape Turing machine model, because the easier one can often be made easier using a multi-tape Turing machine. However, once you make the hard one as hard as an NP-Hard or EXP-complete problem, massively distributed computers (as far as we know) aren't going to help you any more than a single computer can, because both take exponential time (with NP-Complete problems we suspect they do, at least). – Phylliida Oct 31 '14 at 20:16
  • It is more processing power, so for many problems we can solve more instances using a massively parallel system than a single processor, and we do this, and it's been studied in depth. This is where P-Complete problems came from, because they're annoyingly hard to solve better with a multi-tape Turing machine. However, even for problems that a massively distributed system (aka a very large multi-tape Turing machine) is helpful for, a single processor that's much, much faster than any of those individual distributed processors could technically do just as much. We see this in graphics card programming - it's really – Phylliida Oct 31 '14 at 20:19
  • hard to use OpenCL to solve a problem better than a highly optimized CPU version could, but when we can, with things like scan or displaying graphics, it is very helpful. – Phylliida Oct 31 '14 at 20:20
  • So you are right that massively distributed systems with, say, a trillion processors do help with solving some problems - that's exactly why they exist. However, fundamentally they are no different from a multi-tape Turing machine with (again, for example) a trillion tapes, and both can solve problems in the exact same amount of time, plus or minus some time that's different per system, but asymptotically this difference is negligible compared to, say, a constant-time algorithm vs a linear-time algorithm. – Phylliida Oct 31 '14 at 20:25
  • That is, if the massively parallel one can solve a problem in $\mathcal{O}(f(n))$ time, a multi-tape Turing machine can also solve the same problem in $\mathcal{O}(f(n))$ time. It might actually be more like $\mathcal{O}(f(n) \cdot n)$ or something, but if that's really a big deal we can use a multi-tape RAM model instead and then it'll be exactly $\mathcal{O}(f(n))$ time. – Phylliida Oct 31 '14 at 20:31
  • Thanks for sticking with this question. I am going to be traveling for a few days, so I'm not sure how responsive I can be, and I can't respond now to everything you have written. But I think you are still misunderstanding the question if that is believable. Specifically, your point 1 misunderstands the question. If f takes a graph to the set of all paths through all nodes allowing repeated vertices then both problems can be solved in polynomial time. If f takes a graph to the smaller set of all paths through all nodes not allowing repeated vertices then both problems are (the multivalue – gmr Oct 31 '14 at 23:20
  • generalization of) NP complete. In a deterministic model of computation, whether parallel or not, these two problem classes can't come apart because the program itself can serve as a definition of g. Your point 2 is also misunderstanding the question. Just as you say, "multithreading race conditions are the bane of many programmers, not something that makes it easier to solve problems." The point is not that the existence of these race conditions makes the easy problem easier but that it makes the hard problem harder. And a multitape Turing machine is not a good model of a distributed – gmr Oct 31 '14 at 23:26
  • system in part because it is deterministic. (Actually, even within the class of non-deterministic models, there would be various assumptions you can make. For instance, one way to make things hard on yourself is to assume that some fraction of the processors will be controlled by an adversary who is actively trying to mess you up.) – gmr Oct 31 '14 at 23:33
  • Honestly I might be wasting your time. I really wanted to help but we're just not getting anywhere, sorry. – Phylliida Nov 01 '14 at 00:22
  • Probably the takeaway from this though is that in TCS we try and study abstract well defined concepts that can hopefully be applied to reality, not systems in reality in the exact details they exist in. You can feel free to add as much complexity to a model as you'd like to make it more like the real one, but it just makes it more difficult to study, and eventually you might as well just study computers and that's the field of CS instead. There's a lot of overlap but fundamentally we study different things. – Phylliida Nov 01 '14 at 00:46
  • Yeah, I guess I wasn't totally clear about the difference before posting. – gmr Nov 01 '14 at 00:59
  • Cool. You could always try cs.stackexchange.com though, they might be more helpful. Either way you have a good day. Also take much of what I said with a grain of salt - I'm only a 3rd year undergraduate student so the extent that I can describe the TCS community's goals and motives is debatable. – Phylliida Nov 07 '14 at 02:39