22

I've been working on introducing some results from computational complexity into theoretical biology, especially evolution & ecology, with the goal of being interesting and useful to biologists. One of the biggest difficulties I've faced is justifying the usefulness of asymptotic worst-case analysis for lower bounds. Are there any article-length references that justify lower bounds and asymptotic worst-case analysis to a scientific audience?

I am really looking for a good reference that I can defer to in my writing instead of having to go through the justifications in the limited space I have available (since that is not the central point of the article). I am also aware of other kinds and paradigms of analysis, so I am not seeking a reference that says worst-case is the "best" analysis (there are settings where it very much isn't), but one that argues it isn't completely useless: it can still give us theoretically useful insights into the behavior of actual algorithms on actual inputs. It is also important that the writing is targeted at general scientists and not just engineers, mathematicians, or computer scientists.

As an example, Tim Roughgarden's essay introducing complexity theory to economists is on the right track for what I want. However, only sections 1 and 2 are relevant (the rest is too economics-specific) and the intended audience is a bit more comfortable with theorem-lemma-proof thinking than most scientists are[1].


Details

In the context of adaptive dynamics in evolution, I've met two specific types of resistance from theoretical biologists:

[A] "Why should I care about behavior for arbitrary $n$? I already know that the genome has $n = 3*10^9$ base pairs (or maybe $n = 2*10^4$ genes) and no more."

This is relatively easy to brush off with the argument "we can imagine waiting for $10^9$ seconds, but not $2^{10^9}$ seconds". But a more careful response might be: "sure, you say you care only about a specific $n$, but your theories never use this fact; they just use that it is large but finite, and it is your theory that we are studying with asymptotic analysis".
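
For concreteness, a quick back-of-the-envelope sketch of just how different those two waits are (illustrative constants only; nothing here depends on the biology):

```python
import math

# "We can imagine waiting 10^9 seconds, but not 2^(10^9) seconds."
SECONDS_PER_YEAR = 3.156e7            # ~365.25 days
AGE_OF_UNIVERSE_SECONDS = 4.35e17     # ~13.8 billion years

polynomial_wait = 1e9                 # 10^9 seconds
print(f"10^9 seconds ~ {polynomial_wait / SECONDS_PER_YEAR:.0f} years")  # ~32 years

# 2^(10^9) is far too large to store directly, so compare orders of magnitude.
digits_of_exponential_wait = 1e9 * math.log10(2)              # ~3.0e8 decimal digits
digits_of_universe_age = math.log10(AGE_OF_UNIVERSE_SECONDS)  # ~18 digits
print(f"2^(10^9) seconds has about {digits_of_exponential_wait:.1e} digits;")
print(f"the age of the universe in seconds has about {digits_of_universe_age:.0f}.")
```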

[B] "But you only showed that this is hard by building this specific landscape with these gadgets. Why should I care about this instead of the average?"

This is a more difficult critique to address, because a lot of the tools people commonly use in this field come from statistical physics, where it is often safe to assume a uniform (or some other simple) distribution. But biology is "physics with history": almost nothing is at equilibrium or 'typical', and empirical knowledge is insufficient to justify assumptions about distributions over inputs. In other words, I want an argument similar to the one used against uniform-distribution average-case analysis in software engineering: "we model the algorithm; we can't construct a reasonable model of how the user will interact with the algorithm or what their distribution of inputs will be; that is for psychologists or end users, not us." Except in this case, the science isn't in a position where the equivalent of 'psychologists or end users' exists to figure out the underlying distributions (or to tell whether that is even meaningful).

Notes and related questions

  1. The link discusses cognitive sciences, but the mindset is similar in biology. If you browse through Evolution or the Journal of Theoretical Biology, you will seldom see theorem-lemma-proof, and when you do it will typically be just a calculation rather than something like an existence proof or an intricate construction.
  2. Paradigms for complexity analysis of algorithms
  3. Other kinds of running time analysis besides worst-case, average-case, etc?
  4. Ecology and evolution through the algorithmic lens
  5. Why economists should care about computational complexity
Artem Kaznatcheev
  • 24
    Worst-case behavior is impossible to justify … the simplex algorithm has exponential worst-case behavior, and the only people who have ever cared are the theorists. What you need to argue is (a) the average-case asymptotic behavior is important; (b) the average-case asymptotic behavior and worst-case asymptotic behavior are quite often similar; (c) the worst-case asymptotic behavior is often much easier to calculate than the average-case asymptotic behavior (especially since nobody knows what the relevant probability distribution is). – Peter Shor Jan 25 '14 at 23:15
  • @PeterShor average-case becomes very weird for natural systems when we don't really give them inputs. For instance, if we are trying to argue that the Arrow-Debreu model is not the full story, we can say that using it (without further restrictions on the structure of markets considered) clashes with the popular assumption of "relatively quick" market equilibration. How can we possibly argue for a distribution over markets here? Aren't we just doing a philosophical analysis of whether our assumptions fit? Or is such analysis of no interest? – Artem Kaznatcheev Jan 25 '14 at 23:23
  • @PeterShor any tips/advice on arguing part (b) in your recipe? Or is this completely problem/context specific? – Artem Kaznatcheev Jan 25 '14 at 23:26
  • 2
    As Peter says, it's very difficult to justify worst-case analysis. A much "easier" explanation is to demonstrate that an algorithm with good worst-case guarantees is in fact better in practice. A set of such cases is more compelling than any argument from principle. – Suresh Venkat Jan 26 '14 at 00:22
  • @SureshVenkat but how do we do that when what we care about is lower bounds? In particular, my urge for using complexity arguments in fields like biology is specifically because we currently have no way to argue about certain microdynamics and thus no access to specific algorithms. – Artem Kaznatcheev Jan 26 '14 at 00:27
  • 1
    @Artem: I know a little bit about biology … it would really help me if you gave some specific examples of the kind of thing you are talking about. – Peter Shor Jan 26 '14 at 00:28
  • @PeterShor To shamelessly push my own preprint: there the goal is to use very simple TCS to point out that "being at a (local) fitness peak" is an unreasonable assumption and should not be so freely made, contributing to a discussion in biology spurred by recent results from Lenski's LTEE. To avoid being too chatty, I'd be happy to continue on this blog post or by email. – Artem Kaznatcheev Jan 26 '14 at 00:47
  • 5
    Asymptotics is already a problematic aspect. We all know the story about matrix multiplication algorithms (asymptotic upper bounds are meaningless in practice), and perhaps also the story about choosing parameters in cryptography (asymptotic lower bounds are meaningless in practice; exponential algorithms are sometimes feasible [DES]). If your analysis has actual constants then it is more convincing. – Yuval Filmus Jan 26 '14 at 01:56
  • 2
    I agree with everything that's been said so far, but want to add one point about [B]. In some sense, your response should be: actually, we don't care about the worst-case or the average-case, because (as you say) biology is typically not at equilibrium. The issue is that no one has a good model that produces something which "looks like" the real-world case. This is the source of a lot of work on generative models of networks that "look like" real-world networks. As to why worst-case, I'd stick with Peter Shor's argument, especially (c) - often worst-case analysis is the best we can do. – Joshua Grochow Jan 26 '14 at 03:38
  • 6
    If you think about computation as a game (i.e., war) between the input provider and the algorithm, then worst-case analysis is a standard military approach: you want to know how bad it can be. Secondly, and more importantly, worst-case analysis does not allow you to be intellectually lazy and accept solutions that might be good for what you believe the world is (and not what the world really is). Finally, and maybe most importantly, it provides a uniform way to compare algorithms in a hopefully meaningful way. In short, it is the worst approach, except for all the others. – Sariel Har-Peled Jan 26 '14 at 03:46
  • 1
    @PeterShor your comment makes a good answer. – Mohammad Al-Turkistany Jan 26 '14 at 12:24
  • @SarielHar-Peled I like the "does not allow you to be intellectually lazy" argument, but for lower bounds people can just argue "you are just picking this family of instances to prove your point, and it is not a good description of how 'real' instances 'really' are", so it would have to be given more carefully to avoid this response. – Artem Kaznatcheev Jan 26 '14 at 14:04
  • @YuvalFilmus I agree that it is better to keep constants around, but sometimes (I would even say, oftentimes for lower bounds) it is impossible. A classic example would be that additive constants are meaningless in Kolmogorov complexity. In my specific case, I am using TCS because it allows making minimal assumptions about the underlying dynamics, i.e. I can't specify my model of computation (that would be assuming too much about the microdynamics) beyond saying it is 'reasonable' (for instance, doesn't grow resources super-polynomially), and that makes constants (and sometimes larger terms) meaningless. – Artem Kaznatcheev Jan 26 '14 at 14:12
  • @JoshuaGrochow I agree very much. I actually use the unreasonableness of the worst-case results to argue that we need to think about a particular problem away from equilibrium. I also like (c) as a justification for worst-case analysis in TCS, but in biology I actually make a (c)-like argument against using the more common equilibrium analysis, and so my hands are a bit tied. – Artem Kaznatcheev Jan 26 '14 at 14:25
  • 2
    @SarielHar-Peled: +1 for "Worst-case analysis is the worst approach, except for all the others" :). – Joshua Grochow Jan 26 '14 at 23:10
  • 1
    Worst case analysis is as useless as frequentist statistics. It answers a question no one has asked you (what is the worst possible scaling of your algorithm over all possible inputs and an unbounded input size) just as frequentist stats answers the question "What is the probability of the data given the model?" when you clearly want to know the probability of the model given the data (i.e. is he guilty?). However both have turned out to be quite useful... – Simd Jan 27 '14 at 10:17
  • 6
    I think a worst-case lower bound should be seen as putting the ball back in their court. You have shown that there is no algorithm that can solve their problem on all instances in a reasonable time frame. They may reasonably believe that their instances are easy -- but you have just shown that if this is so, it is a non-trivial fact. Their model is therefore incomplete unless they come up with an explanation for why this is so. – Aaron Roth Jan 28 '14 at 21:59
  • 3
    (This is the approach that seems to work when talking to game theorists. It raises the question -- if markets truly equilibrate quickly -- what special property do real markets have that gets around worst-case hardness? It is likely possible to define a plausible such property, and the lower bounds just suggest that doing so is an important research direction) – Aaron Roth Jan 28 '14 at 22:01
  • @AaronRoth this is my thinking as well. Do you know a source outside of econ (or comp sci, or math) which expands on these ideas? – Artem Kaznatcheev Jan 28 '14 at 23:32
  • 1
    Aaron makes a great point. However, I'm not sure arguments that work in the presence of markets, where there are incentives to seek out the extreme behaviours that maximize gains, apply also to biology. Are the dynamics really comparable? – András Salamon Jan 30 '14 at 20:03
  • 3
    One of the reasons worst case analysis has persisted so widely is that it gives a way of comparing academics, not just algorithms. Everyone likes a clear set of rules for a game and this one lets me publish papers saying my algorithm is better than yours in an objectively testable way. This motivation should not be underestimated. –  Jan 30 '14 at 20:13
  • 2
    @PeterShor, for the simplex worst case I do not think that we can compare it with usual algorithms: simplex can solve some NP-complete problems in its worst-case scenario, so that worst case is probably not bad (it's meaningful); take a look at this paper: http://arxiv.org/abs/1311.5935. – Saeed Jan 31 '14 at 09:50
  • @Saeed: For simplex simply viewed as an algorithm for solving linear programs, I think you can compare it with the usual algorithms; simplex doesn't solve NP-complete problems unless you pick a particular instantiation of the simplex algorithm and start worrying about its inner workings. – Peter Shor Jan 31 '14 at 17:37
  • 1
    are you guys all overthinking this? worst case analysis is just a simple/basic mathematical property of algorithms. it is a fundamental conceptual starting/launching point for understanding more subtle nuances of performance such as how average case behaves (which is more based on statistics). worst case analysis focuses by nature on extremes. as for significance of average vs worst case how about the toy/universal model of sorting algorithms.... & comparing bubble sort vs quicksort etcetera.... – vzn Jan 31 '14 at 17:55
  • another field where there are good examples/"justification" is *cryptography* where the security of the algorithm *hinges* on the distinction. a supposed trap door function which is hard in worst case but easy in average case is *worthless/insecure* etc.... – vzn Jan 31 '14 at 18:01
  • @vzn thank you for the feedback, the interest is in sciences that are not directly related to computing. Hence, cryptography is not relevant to the discussion. – Artem Kaznatcheev Jan 31 '14 at 18:04
  • @AndrásSalamon you make a great point about markets potentially 'seeking' worst case behaviour. However, it isn't obvious to me that this is irrelevant for biology. The key feature there is historicity, so even if we can't make an argument for seeking the worst-case, we also can't make an argument for average case. – Artem Kaznatcheev Jan 31 '14 at 18:06
  • 1
    the [A]/[B] objections given by biologists listed in the question are interesting but seem almost to be maybe pushing against too much TCS abstraction introduced into a field of more physical/concrete analysis.. and too much abstraction standing in way of applicability is indeed a legitimate complaint of practitioners to much of TCS research! it would be helpful if you could quote more at length the "objections" & fill out more context.... – vzn Jan 31 '14 at 18:12
  • @PeterShor, I think if you haven't looked at my referenced paper, it's not bad to look at its introduction and some definitions and theorems without going into details. They prove that simplex implicitly solves NP-complete problems; the definitions are somewhat technical, but in my humble opinion their argument is reasonable. That means we can classify algorithms from another point of view, and in this classification simplex is an algorithm with some (magical) power (in its worst-case scenario). What I want to say is that it's not a good way to compare simplex with ordinary algorithms. – Saeed Feb 01 '14 at 00:13
  • how about as a "justification" (still on the highly questionable premise it needs one), goedel invented it in a letter to von neumann over ½-century ago? – vzn Feb 01 '14 at 02:06
  • The speed of light, $c$, is very important to them; see Grace Hopper's lecture. If you give them a lower bound on a distributed algorithm, every message passed is going to incur a lower bound of $c \cdot \text{distance}$ no matter how big their research budget is. – Chad Brewbaker Jan 30 '14 at 21:40

3 Answers

7

My personal (and biased) take is that asymptotic worst-case analysis is a historical stepping stone to more practically useful kinds of analysis. It therefore seems hard to justify to practitioners.

Proving bounds for the worst case is often easier than proving bounds for even "nice" definitions of average case. Asymptotic analysis is also often much easier than proving reasonably tight bounds. Worst-case asymptotic analysis is therefore a great place to start.

The work of Daniel Spielman and Shang-Hua Teng on smoothed analysis of the simplex algorithm seems to me a harbinger of what can happen when we start to gain a better understanding of the shape of a problem: tackling the worst case first enables a more nuanced understanding to be developed. Further, as Aaron Roth suggested in the comments, if the "usual" behaviour of a system is significantly different from its worst case, then the system is not yet completely specified and more work is needed to improve the model. So going beyond worst-case generally seems important as a long-term goal.

As far as asymptotic analysis is concerned, it usually serves to keep a long and messy proof clear of distracting details. Unfortunately there currently does not seem to be a way to reward the tedious work of filling in the details to obtain the actual constants, so that seldom seems to get done. (Page limits also work against this.) Careful analysis of the details of an asymptotic bound has led to actual algorithms, with good bounds on the constants, so I personally would like to see more of this kind of work. Perhaps if more proofs were formalised using proof assistant systems, then the constants could be estimated with less additional effort. (Or bounds on the constants, along the lines of Gowers' bound for the Szemerédi Regularity Lemma, might become more routine.) There are also ways to prove lower bounds that are free of constants, by using more explicit machine models (such as deterministic finite-state automata). However, such (near-)exact lower bounds for more general models of computation may require a lot of work, or be out of reach altogether. This seems to have been pursued in ~1958–73 during the first heyday of automata theory, but as far as I can tell has since largely been left alone.
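
For concreteness, one standard example of such a constant-free bound: for the language of binary strings whose $k$-th symbol from the end is a 1, a Myhill–Nerode argument pins down the number of DFA states exactly, with no hidden constants or asymptotics.

```latex
L_k = \{\, w \in \{0,1\}^* : \text{the } k\text{-th symbol from the end of } w \text{ is } 1 \,\},
\qquad
\min\{\, |Q| : \text{some DFA with state set } Q \text{ recognises } L_k \,\} \;=\; 2^k .
```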

In short: in future, I think we'll see worst-case analysis as just a first step, and I hope we'll likewise come to see expressions like $O^*(n^k)$ with galactic hidden constants as just a first step. I find it difficult to justify the current focus on worst-case asymptotic analysis to practitioners, but perhaps it is useful to see it as the early steps of a longer journey.

András Salamon
  • I don't share your enthusiasm for ditching asymptotics in favor of precise bounds with definite constants. Asymptotics may be imprecise -- but they're usefully imprecise. They abstract over implementation differences for the same machine model. E.g., a sorting algorithm that was quadratic on 1950s hardware will still be quadratic on today's hardware. Furthermore, asymptotic formulas compose nicely. Linears and polynomials are closed under composition, for instance. (Note that arguing for better bounds on the average case compared to the worst case is orthogonal to arguing against asymptotics.) – brandjon Jan 31 '14 at 01:05
  • You are right in general, but there is a big difference between a small constant and one that is a non-elementary function of a relevant parameter. – András Salamon Jan 31 '14 at 01:13
  • I like this answer overall, but I agree with @brandjon that hiding constants is crucial. For me, the reason TCS is useful in biology is that it needs to make far fewer assumptions about microdynamics than physics. However, if you don't make assumptions about the microdynamics (i.e., the exact specification of the model of computation), then you can't get the constant factors out. The other useful feature of TCS is rigorous qualitative dichotomies (something that is easier to compare to the more qualitative observations in bio); usually, to get these you also have to ditch constants. – Artem Kaznatcheev Jan 31 '14 at 18:13
  • Hiding constants is OK up to a point. But when one reads of an algorithm that takes $\tilde{O}(n^{\tilde{O}(1/\epsilon)})$ time (as I often do) there is definitely something missing. Also, I am not entirely convinced that including constants in an analysis will typically require a detailed machine model. For example, the constants in sorting algorithms are now well known and this hasn't required some micro description of a particular machine. –  Jan 31 '14 at 19:59
  • @Erhart it required a pretty strong assumption about the microdynamics: that only comparisons are allowed. In an evolutionary biology setting, an assumption of that strength would be saying that only adaptive walks on point-mutations in asexual organisms are allowed (in this case we can also prove lower bounds where we know the constants). – Artem Kaznatcheev Feb 01 '14 at 02:44
  • 1
    As a side note, there are examples where worst-case analysis makes sense. For example, when you develop a library of general-purpose subroutines and do not know in what application domains they will be useful: you cannot possibly anticipate all cases of when and why someone will want to compute a minimum cost bipartite matching, for example. Adversarial settings, such as cryptography, are even more clear cut (however, in crypto you'd really like to know the constants when it comes to security parameters). – Sasho Nikolov Feb 03 '14 at 02:42
4

Lower bounds and worst-case analysis don't usually go together. You don't say "this algorithm takes at least exponential time in the worst case, therefore it's bad." You say "it takes at most linear time in the worst case, and therefore it's good." The former statement is only useful if you are going to run your algorithm on all possible inputs, and not merely on an average input.
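
Spelled out with quantifiers (a rough sketch, writing $T_A(x)$ for the running time of algorithm $A$ on input $x$, and $f$, $g$ for the bounding functions):

```latex
\text{worst-case upper bound ("good"):}\quad \forall x \,\bigl(|x| = n \Rightarrow T_A(x) \le f(n)\bigr)
\qquad
\text{worst-case lower bound:}\quad \exists x \,\bigl(|x| = n \wedge T_A(x) \ge g(n)\bigr)
```

The existential statement only tells you something about your situation if the hard inputs it exhibits resemble the inputs you actually face.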

If you want to use lower bounds to demonstrate badness, then you want a best-case analysis or an average-case analysis. You can simplify things by relying on @PeterShor's point that worst and average are often very similar, and give a laundry list of algorithms for which this is true. (Ex: all the classic sorts besides quicksort.)
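
If a demonstration helps, here is an informal timing sketch of that contrast (exact timings will vary by machine): merge sort behaves the same on random and on sorted input, while a naive first-element-pivot quicksort degrades to quadratic on sorted input.

```python
import random
import sys
import time

sys.setrecursionlimit(20000)  # naive quicksort recurses ~n deep on sorted input

def merge_sort(a):
    """Theta(n log n) on every input: worst case and average case coincide."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def naive_quicksort(a):
    """First-element pivot: ~n log n on random input, Theta(n^2) on sorted input."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    return (naive_quicksort([x for x in rest if x < pivot]) + [pivot]
            + naive_quicksort([x for x in rest if x >= pivot]))

n = 5000
random_input = [random.random() for _ in range(n)]
sorted_input = sorted(random_input)

for sort in (merge_sort, naive_quicksort):
    for label, data in (("random", random_input), ("sorted", sorted_input)):
        start = time.perf_counter()
        sort(list(data))
        print(f"{sort.__name__:15s} on {label:6s} input: {time.perf_counter() - start:.3f}s")
```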

As for demonstrating that asymptotics matter when compared to constant inputs and constant factors, my favorite article on the topic is Jon Bentley's "Programming pearls: algorithm design techniques". He presents four different solutions to a simple array problem, and demonstrates how the linear approach annihilates the cubic one. He calls his table "The Tyranny of Asymptotics", after the term used by physicists for the intractability of the rocket equation. I use this example to motivate the search for better algorithms to pre-college students.
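
A minimal sketch of the cubic-versus-linear contrast Bentley tabulates, using the maximum subarray sum problem (which, if I recall correctly, is the problem in that column; any such pair of solutions makes the same point):

```python
import random
import time

def max_subarray_cubic(a):
    """O(n^3): enumerate every (start, end) pair and sum that slice from scratch."""
    best = 0
    for i in range(len(a)):
        for j in range(i, len(a)):
            best = max(best, sum(a[i:j + 1]))
    return best

def max_subarray_linear(a):
    """O(n): track the best sum of a subarray ending at the current position."""
    best = ending_here = 0
    for x in a:
        ending_here = max(0, ending_here + x)
        best = max(best, ending_here)
    return best

data = [random.uniform(-1, 1) for _ in range(300)]
for solver in (max_subarray_cubic, max_subarray_linear):
    start = time.perf_counter()
    answer = solver(data)
    print(f"{solver.__name__}: {answer:.3f} in {time.perf_counter() - start:.4f}s")
```

Already at a few hundred elements the cubic version is visibly struggling while the linear one is effectively instantaneous, and the gap only widens with $n$.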

Will a non-computer scientist read through an article that contains code, and know to skip over the low-level details to get the big picture? I don't know. Perhaps there's a better presentation elsewhere. But I think this is a decent resource to cite.

And if they argue that they don't care about arbitrarily large $n$, have them run recursive un-memoized Fibonacci on $3*10^9$ base pairs, and tell them it's $O(1)$ since the size of the DNA sequence is fixed. ;)
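
A sketch of what that looks like (the point being that a fixed input size does not rescue an exponential-time recursion):

```python
from functools import lru_cache

def fib_naive(n):
    """Exponentially many recursive calls (~phi^n), even though n is 'just a constant'."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoized: only n distinct subproblems, so linearly many calls."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(200))    # instant
print(fib_naive(32))    # already takes a few seconds
# fib_naive(3 * 10**9)  # "O(1), since the input size is fixed"; enjoy the wait ;)
```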

brandjon
  • 1
    I like the fibonacci example :) – Suresh Venkat Jan 28 '14 at 17:58
  • 3
    Re: your first paragraph: actually, that's almost exactly what a lot of complexity theory does. If a problem is EXP-complete, that means it requires exponential time on worst-case inputs. This is generally taken as an indication of its overall difficulty (which, to be fair, in practice is often not so bad as a general indicator). This is the de facto standard, called an "infinitely-often" or i.o. lower bound; getting average-case or almost-everywhere lower bounds (that is, for all but finitely many inputs) is a goal sometimes pursued, but often far out of reach compared to i.o. lower bounds. – Joshua Grochow Jan 30 '14 at 21:15
  • 2
    Let me point out that not only can you give a laundry list of algorithms for which worst-case and average-case analysis are the same, but you can also give numerous examples where they are very different (the simplex algorithm just being the most famous of these). You really need to argue somehow that they are the same for your particular application; experimental testing is a good way to do this. – Peter Shor Jan 30 '14 at 21:28
  • 1
    @JoshuaGrochow Fair enough. How about we revise the statement as follows: Lower bounds on worst-cases are important when you want to demonstrate the absence of a mathematical guarantee of non-horribleness. ;) – brandjon Jan 31 '14 at 00:56
-3

much agreed this is an important topic to survey/cover but it seems not to have been done much yet. a few refs of varying style/coverage/audience/formality, not exactly as requested but somewhat close (the best found online so far on a medium-effort search; hope to hear further of any better ones; more notes below):

  • The complexity of algorithms Atkinson (alas only a single ref to biology in the paper, but may suffice on more general science/engineering terms)

    The modern theory of algorithms dates from the late 1960s when the method of asymptotic execution time measurement began to be used. It is argued that the subject has both an engineering and scientific wing. The engineering wing consists of well-understood design methodologies while the scientific wing is concerned with theoretical underpinnings. The key issues of both wings are surveyed. Finally some personal opinions on where the subject will go next are offered.

  • Complexity and Algorithms J. Diaz. 100 slides. broad; one could excerpt relevant ones in particular.

  • A Gentle Introduction to Algorithm Complexity Analysis Dionysis "dionyziz" Zindros

in other words, is there a sort of introduction/survey/overview of the complexity-theoretic lens, in close combination/conjunction with the advancing algorithmic lens in science, something like "Complexity theory for scientists, engineers, and researchers"?

there are good refs on the former "algorithmic lens" you have cited, eg Papadimitriou, but it does not seem that a highly satisfactory ref by an expert in the field has been written on the latter "complexity lens"... yet (maybe some "elite" member of this site will consider that as their next book or paper project).

note also there are a lot of refs on the relevance of P vs NP outside of complexity theory & in other scientific fields that could be used somewhat for this purpose. will add them in comments if there is any interest.

vzn
  • 3
    I don't think this really answers the question. – Huck Bennett Jan 29 '14 at 20:16
  • 1
    uh huh, did you look at any of the refs? part of my answer is that there is not (yet) any ideal/perfect answer :| – vzn Jan 29 '14 at 20:18
  • 1
    They seem to define asymptotic and worst case analysis rather than focusing on justifying it, but maybe I missed something? – Huck Bennett Jan 29 '14 at 20:21
  • seems to me all the refs have substantial-to-strong elements justifying the use of complexity theory (analysis) of algorithms in general outside TCS to scientists, engrs, researchers etc... wasnt that clear looking at them? (concede already it would indeed be better if something was written by a leader in the field...have not found that)... re asymptotic vs worst case, is the question really focused on that? cant tell ... maybe researchers outside the field are unlikely to consider this distinction as crucial/pivotal as those inside of it ... & likely too specialized to find that ref ... – vzn Jan 29 '14 at 21:35
  • 7
    Actually, I think researchers outside TCS could easily dismiss worst-case as "artificially constructed examples that would never occur in practice" and would be (without strong convincing otherwise) much more interested in average-case (despite the fact that it's not clear that the average-case is that much closer to real-world instances). – Joshua Grochow Jan 30 '14 at 02:59
  • seems to me for many algorithms & data that arise in practice (eg random instances of NP complete problems), average and worst case performance are related, so practitioners need to understand basic concepts of TCS (such as "easy" vs "hard" instances and "easy" vs "hard" problems and "inefficient" vs "efficient" algorithms & how these are scientifically quantified) but do not nec have to be so attentive to complexity-theoretic distinctions about performance that TCS researchers meticulously focus on. (and thought this pov is in line with the request in the original question, is it not?) – vzn Jan 30 '14 at 03:35
  • @joshua more thinking on this, dont understand why TCS theorists would find it nec to "justify" worst case analysis. the origins of this theory in Big-O notation (Landau functions) are from earlier mathematics & even number theory. it is intrinsic to complexity theory, virtually a fundamental theorem that algorithms have complexity that can be measured. also that concept can be used for other than algorithm speed measurement, eg population growth etc.; outside of that there are subtleties/nuances that deserve further attn... – vzn Jan 31 '14 at 16:27
  • moreover if practitioners didnt care about worst case they would be using easier-implemented $O(n^2)$ bubble sort instead of $O(n \log n)$ quicksort where the latter is harder to understand/implement.... are we sort of getting wrapped around the axle here in this discussion....? – vzn Jan 31 '14 at 17:28
  • 1
    @vzn: Asymptotic (e.g. big-Oh) and worst-case are somewhat orthogonal. One can do asymptotic worst-case analysis, asymptotic average-case analysis, or even asymptotic easiest-case analysis (though I admit the latter seems somewhat perverse). One could instead do exact worst-case analysis, or exact average-case analysis, and so on, though these would be much more model-dependent and less robust. Justifying the use of asymptotics (and hiding things like constant factors) is entirely distinct from justifying worst-case vs average-case or "real-"case (whatever the latter might mean...). – Joshua Grochow Jan 31 '14 at 22:51