19

I've never seen an algorithm with a log in the denominator before, and I'm wondering if there are any actually useful algorithms with this form?

I understand lots of things that might cause a log factor to be multiplied into the running time, e.g. sorting or tree-based algorithms, but what could cause you to divide by a log factor?

Kaveh
Mike Izbicki

11 Answers

41

The usual answer to "what could cause you to divide by a log?" is a combination of two things:

  1. a model of computation in which constant time arithmetic operations on word-sized integers are allowed, but in which you want to be conservative about how long the words are, so you assume $O(\log n)$ bits per word (because any fewer than that and you couldn't even address all of memory, and also because algorithms that use table lookups would take too much time to set up the tables if the words were longer), and
  2. an algorithm that compresses the data by packing bits into words and then operates on the words.

I think there are many examples, but the classic example is the Four Russians Algorithm for longest common subsequences etc. It actually ends up being $O(n^2/\log^2 n)$, because it uses the bit-packing idea but then saves a second log factor using another idea: replacing blocks of $O(\log^2 n)$ bit operations by a single table lookup.
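
To make the bit-packing and table-lookup ideas concrete, here is a minimal sketch using population count as a stand-in problem; the choice of popcount, the 64-bit words, and the 16-bit table chunks are my own illustration, not the LCS algorithm from the paper:

```python
# Sketch of the two ideas above: pack bits into machine words (64-bit Python
# ints stand in for the model's O(log n)-bit words), then replace a block of
# CHUNK bit operations with a single precomputed table lookup.

WORD = 64
CHUNK = 16
TABLE = [bin(i).count("1") for i in range(1 << CHUNK)]  # 2^16-entry lookup

def pack(bits):
    """Pack a 0/1 list into WORD-bit integers: n bits -> n/WORD words."""
    return [sum(b << j for j, b in enumerate(bits[i:i + WORD]))
            for i in range(0, len(bits), WORD)]

def popcount(words):
    """Count set bits, CHUNK bits per table lookup instead of one at a time."""
    total = 0
    for w in words:
        while w:
            total += TABLE[w & ((1 << CHUNK) - 1)]
            w >>= CHUNK
    return total

assert popcount(pack([1, 0, 1, 1] * 50)) == 150
```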

argentpepper
David Eppstein
35

The Rubik's Cube is a very natural (and, to me, unexpected) example. An $n\times n\times n$ cube requires $\Theta(n^2/\log n)$ steps to solve. (Note that this is $\Theta$ notation, so it is a tight upper and lower bound.)

This is shown in this paper [1].

It may be worth mentioning that optimally solving specific instances of the Rubik's Cube was long conjectured to be NP-hard, and has since been shown to be NP-complete [2]. The $\Theta(n^2/\log n)$ algorithm guarantees a solution, and it guarantees that all its solutions are asymptotically optimal, but it may not solve specific instances optimally. Your definition of useful may or may not apply here, as Rubik's Cubes are generally not solved with this algorithm (Kociemba's algorithm is commonly used for small cubes, as it gives fast, near-optimal solutions in practice).

[1] Erik D. Demaine, Martin L. Demaine, Sarah Eisenstat, Anna Lubiw, and Andrew Winslow. Algorithms for Solving Rubik's Cubes. Proceedings of the 19th Annual European Symposium on Algorithms (ESA 2011), September 5–9, 2011, pages 689–700

[2] Erik D. Demaine, Sarah Eisenstat, and Mikhail Rudoy. Solving the Rubik's Cube Optimally is NP-complete. Proceedings of the 35th International Symposium on Theoretical Aspects of Computer Science (STACS 2018), February 28–March 3, 2018, pages 24:1–24:13.

SamM
16

An example of $\log n$ showing up in the denominator without bit-packing tricks is a recent paper by Agarwal, Ben Avraham, Kaplan, and Sharir on computing the discrete Fréchet distance between two polygonal chains in time $O(n^2\log\log n/\log n)$. While I'm not familiar with the details of the algorithm, one general trick is to partition the input into relatively small pieces and then combine the answers cleverly (of course this sounds like divide and conquer, but you don't get the $\log n$ in the denominator without some additional clever tricks).

Suresh Venkat
  • 5
    This is a more complex instance of the "Four Russians" technique described in David's answer. – Jeffε Feb 03 '13 at 21:00
13

Not exactly what you asked for, but a situation "in the wild" in which a log factor appears in the denominator is the paper "Pebbles and Branching Programs for Tree Evaluation" by Stephen Cook, Pierre McKenzie, Dustin Wehr, Mark Braverman, and Rahul Santhanam.

The tree evaluation problem (TEP) is: given a $d$-ary tree annotated with values in $\{1,\ldots,k\}$ on the leaves and functions $\{1,\ldots,k\}^d \to \{1,\ldots,k\}$ on internal nodes, evaluate the tree. Here each internal node gets the value of its annotated function applied to the values of its children. This is an easy problem, and the point is to show that it cannot be solved in logarithmic space (when the height of the tree is part of the input). To that end, we are interested in the size of branching programs solving TEP.
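
To pin down the problem itself (not the branching-program lower bound, which is the paper's real content), here is a minimal sketch of TEP; the tuple representation of nodes is my own:

```python
# Leaves are ("leaf", v) with v in {1, ..., k}; internal nodes are
# ("node", f, children) where f maps the children's values into {1, ..., k}.

def evaluate(node):
    """Evaluate a TEP instance bottom-up by structural recursion."""
    if node[0] == "leaf":
        return node[1]
    _, f, children = node
    return f(*(evaluate(c) for c in children))

# Example: a binary (d = 2) tree of height 2 over k = 3.
tree = ("node", lambda a, b: max(a, b), [("leaf", 2), ("leaf", 3)])
assert evaluate(tree) == 3
```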

In Section 5, tight bounds are presented for trees of height 3, both for TEP and for the related problem BEP, in which the output is collapsed to $\{0,1\}$ in some arbitrary way. For TEP the bound is $\Theta(k^{2d-1})$, while for BEP the bound is $\Theta(k^{2d-1}/\log k)$, i.e. you get a saving of $\log k$.

Yuval Filmus
13

There are two problems with tight query complexity $\Theta(n/\log n)$:

Thatchaphol
12

Even though it's not about runtime, I thought it worth mentioning the classical result of Hopcroft, Paul, and Valiant: $\mathsf{TIME}[t] \subseteq \mathsf{SPACE}[t/\log t]$ [1], since it's still in the spirit of "what could cause you to save a log factor."

That gives lots of examples of problems whose best known upper bound on space complexity has a log in the denominator. (Depending on your viewpoint, I would think that either makes this example very interesting - what an amazing theorem! - or very uninteresting - it's probably not "actually useful".)

[1] Hopcroft, Paul, and Valiant. On time versus space. J. ACM 24(2):332-337, 1977.

Joshua Grochow
8

The best known algorithm for computing the edit (a.k.a. Levenshtein) distance between two strings of length $n$ takes $O((n/\log n)^2)$ time:

William J. Masek, Mike Paterson: A Faster Algorithm Computing String Edit Distances. J. Comput. Syst. Sci. 20(1): 18-31 (1980).
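
For context, here is a minimal sketch of the classic quadratic DP that Masek and Paterson accelerate; their $O((n/\log n)^2)$ algorithm precomputes the behavior of log-sized blocks of this table, Four-Russians style, a step omitted here:

```python
def edit_distance(a, b):
    """Classic O(|a|*|b|) Levenshtein DP, kept to two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

assert edit_distance("kitten", "sitting") == 3
```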

Mahdi Cheraghchi
7

$\Theta(n/\log n)$ appears as the correct bound for a problem considered by Greg and Paul Valiant (no connection to bit tricks):

Gregory Valiant and Paul Valiant. The power of linear estimators. In Proceedings of the 52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2011), 2011.

Jelani Nelson
7

Here's another example of a tight bound having a log factor. (This is Theorem 6.17 from Boolean Function Complexity: Advances and Frontiers by Stasys Jukna.)

The formula size (over the full binary basis or the De Morgan basis) of the element distinctness problem is $\Theta(n^2/\log n)$, where $n$ is the number of bits in the input.

The reason the log factor appears in the denominator is that representing $m$ integers between 1 and $\text{poly}(m)$ requires $n := O(m \log m)$ bits in total, since each integer requires $O(\log m)$ bits. So an upper bound that looks natural in terms of $m$, like $\Theta(m^2 \log m)$, becomes $\Theta(n^2/\log n)$ when expressed in terms of $n$, where $n$ is the number of bits in the input.
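
As a quick sanity check of that substitution (taking $n = m \log m$ exactly and suppressing constants):

$$m^2 \log m = \left(\frac{n}{\log m}\right)^2 \log m = \frac{n^2}{\log m} = \Theta\!\left(\frac{n^2}{\log n}\right),$$

since $\log n = \log m + \log\log m = \Theta(\log m)$.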

Robin Kothari
2

Finding the prime factors of $n$ by trial division when the list of primes is already given. There are $\Theta(n/\log n)$ primes less than $n$, so if these primes are given to you, then trial division of $n$ by each of them takes $\Theta(n/\log n)$ time (assuming division is a constant-time operation).
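
A minimal sketch of this, assuming the prime list is handed to us (here a small sieve stands in for that given list):

```python
def primes_up_to(n):
    """Sieve of Eratosthenes; in the answer's setting this list is given."""
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def prime_factors(n, primes):
    """Trial-divide n by each given prime: Theta(n / log n) divisions."""
    factors = []
    for p in primes:
        while n % p == 0:
            factors.append(p)
            n //= p
    return factors

assert prime_factors(60, primes_up_to(60)) == [2, 2, 3, 5]
```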

dspyz
  • 3
    In fact, it's enough to look at the roughly $2\sqrt{n}/\log n$ primes below $\sqrt{n}$. But there are far better algorithms out there. – Yuval Filmus Feb 07 '13 at 05:16
-2

Somewhat similar to JG's answer, and "thinking outside the box", this seems like a related/relevant fundamental negative result. By diagonalization with a universal TM, the time hierarchy theorem yields a language in $O(f(n))$ DTIME that cannot be decided in $O\left(\frac{f(n)}{\log f(n)}\right)$ DTIME. Applied to $f(n)=n$, this would give a language decidable in linear DTIME but not in $O\left(\frac{n}{\log n}\right)$ DTIME.

vzn
  • 2
    on a TM, $\mathsf{DTIME}(n/\log n)$ is trivial as it doesn't allow the machine to read the whole input. also, the DTIME notation makes the big-oh notation unnecessary. – Sasho Nikolov Feb 12 '13 at 04:23
  • ?? there is still theory for sublinear time algorithms... – vzn Feb 12 '13 at 04:51
  • 3
    sublinear algorithms make sense in oracle & random access models. DTIME is standardly defined w.r.t. multitape TM, and that's the definition used in the hierarchy theorem for DTIME. – Sasho Nikolov Feb 13 '13 at 00:21
  • 1
    No, @SashoNikolov, $\mathsf{DTIME}(n/\log n)$ is not trivial. Compare "Are the first $n/\lg n$ bits of the input all zeros?" with "Do the first $n/\lg n$ bits of the input encode a satisfiable boolean formula?" – Jeffε Feb 13 '13 at 02:06
  • @JɛffE Fair enough, I was lazy to be more careful in my statement. The issue is that a "hierarchy" is trivial once you reach sublinear DTIME. I.e. $\mathsf{DTIME}(f(n))$ for $f(n) = o(n)$ is very easily seen to be a strict subset of $\mathsf{DTIME}(n)$. Take "is the $n$-th bit of the input 1?" – Sasho Nikolov Feb 13 '13 at 02:18
  • hi guys, after further study I admit/concede it's not so simple, and something is not quite right with the above idea. Other formulations of the time hierarchy theorem (unlike Wikipedia's, which doesn't make clear that there is a lower bound on the languages to which it applies) make the restriction more explicit, e.g. here. However, I suspect there might exist some language with complexity that approaches (but still exceeds) a linear bound "from above" for which the above idea fits... "exercise for the reader" =) – vzn Feb 13 '13 at 03:53
  • 5
    @JɛffE: You cannot test “Are the first $n/\lg n$ bits of the input all zeros?” in $O(n/\log n)$ time on a TM, since you do not know what $n$ is without first reading the whole input, which takes time $n$. It is a standard fact that if $f(n)<n$, then $\mathrm{DTIME}(f(n))$ contains only languages the membership in which can be determined from the first $k$ bits of input for a constant $k$ (and therefore are computable in constant time). – Emil Jeřábek Feb 13 '13 at 14:12
  • fyi, Wikipedia says that $f(n)$ in the time hierarchy theorem need only be time constructible, and then gives two definitions of time constructible. At the end: "No function which is o(n) can be time-constructible unless it is eventually constant, since there is insufficient time to read the entire input." So maybe this all amounts to some undergraduate exercise or test question... here's another idea: what if the length of the input is specified in binary at the beginning of the input? – vzn Feb 13 '13 at 16:28
  • new/fixed/similar idea: let $f(n)=e^{\log n}$; then a $\mathsf{DTIME}(f(n))$ language exists that is not recognizable in $\mathsf{DTIME}\left(\frac{e^{\log n}}{\log n}\right)$. – vzn Feb 13 '13 at 17:43
  • and speaking of the time hierarchy theorem, this observation is related: define $f(n)$ as in the original question; then there exists a language that can be recognized in $O\left(\frac{f(n)}{\log n}\right)$ but not in $O\left(\frac{f(n)}{(\log n)^2}\right)$, for all (time constructible) $f(n)$ such that $\frac{f(n)}{(\log n)^2}$ is $\omega(n)$. – vzn Feb 14 '13 at 16:30