10

Does there exist a w-bit word-RAM data structure with O(1) time per operation for the following problem?: Maintain a set of w-bit non-negative integers that supports the operations

  • add(x) : add x to the set
  • remove(x) : remove x from the set
  • fingerprint() : return a fingerprint of the set. This w-bit fingerprint has the property that two sets that are identical have the same fingerprint while two sets that are different probably have different fingerprints

All operations should run in constant time.

The Rabin-Karp fingerprinting scheme, where $$f(S) = \left(\sum_{x\in S} 2^x\right) \bmod p$$ where p is a random w-bit prime almost works. The problem is with the update time, since computing $$2^x \bmod p$$ takes more than constant time. Using repeated squaring, this can be done in O(log w) time. The fastest modular exponentiation algorithm I could find does something like (log w)/(loglog w) arithmetic operations.

Pat Morin
  • 571
  • 3
  • 10

1 Answers1

1

This answer is a bit hand-wavey.

Select a function $h$ uniformly at random from the set of all functions from w-bit words to w-bit words. For each set, maintain a w-bit fingerprint as follows:

  1. The empty set has the fingerprint 0.
  2. add(x) and remove(x) update a fingerprint $f$ to $f \oplus h(x)$, where $\oplus$ is xor.

Let $S \neq T$ be two sets of w-bit integers. If their fingerprints are the same, then the fingerprint of $S \triangle T$, the symmetric difference of $S$ and $T$, is 0, which happens with probability $2^{-w}$.

jbapple
  • 11,184
  • 2
  • 29
  • 50
  • And in practice you could instantiate $h$ with a pseudorandom function (PRF) from cryptography -- e.g., something based on AES. Should be very efficient, and you get strong promises that it will work (unless the crypto is broken). – D.W. Dec 27 '14 at 02:32