180

In Python, is there a difference between calling clear() on a dictionary and assigning {} to it? If so, what is it? Example:

d = {"stuff":"things"}
d.clear()   #this way
d = {}      #vs this way
Michael Hoffman
Marcin

8 Answers

304

If you have another variable also referring to the same dictionary, there is a big difference:

>>> d = {"stuff": "things"}
>>> d2 = d
>>> d = {}
>>> d2
{'stuff': 'things'}
>>> d = {"stuff": "things"}
>>> d2 = d
>>> d.clear()
>>> d2
{}

This is because assigning d = {} creates a new, empty dictionary and assigns it to the d variable. This leaves d2 pointing at the old dictionary with items still in it. However, d.clear() clears the same dictionary that d and d2 both point at.

Greg Hewgill
  • 9
    Thanks. This makes sense. I still have to get used to the mindset that = creates references in python... – Marcin Dec 15 '08 at 22:33
  • 16
    = copies references to names. There are no variables in python, only objects and names. – tzot Dec 16 '08 at 01:43
  • 17
    While your "no variables" statement is pedantically true, it's not really helpful here. As long as the Python language documentation still talks about "variables", I'm still going to use the term: http://docs.python.org/reference/datamodel.html – Greg Hewgill Dec 16 '08 at 09:34
  • 9
    I found tzot's comment helpful in adjusting my thinking about names, variables, and types of copies. Calling it pedantic may be your opinion, but I find it to be an unfairly harsh judgement. – cfwschmidt Jul 07 '14 at 20:16
  • 2
    Also, clear() does not destroy the objects it removes from the dict; they may still be referenced from somewhere else. – Lorenzo Belli Jan 26 '17 at 10:09
  • 1
    @LorenzoBelli A deep clear is imo overly destructive in most situations, but a simple recursive function should do the job in that scenario. – wizzwizz4 Apr 14 '17 at 16:37
32

d = {} creates a new dict instance and binds d to it, but all other references still point to the old dict and its contents. d.clear() empties the existing instance, so every reference to it sees the now-empty dict.
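This matters most when some other object has kept a reference to the dict. A minimal sketch, assuming a hypothetical Cache class that simply stores whatever dict it is given (the class and variable names are made up for illustration):

```python
class Cache:
    """Hypothetical holder that keeps a reference to the caller's dict."""

    def __init__(self, store):
        self.store = store


shared = {"k": "v"}
cache = Cache(shared)

shared.clear()                           # empties the one dict both names refer to
same_after_clear = cache.store is shared

shared = {}                              # binds `shared` to a brand-new dict
shared["new"] = 1
same_after_rebind = cache.store is shared

print(same_after_clear, same_after_rebind, cache.store)
```

After the rebinding, cache.store still refers to the old (now empty) dict, so the "new" key never shows up there.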

vaultah
Michel
22

In addition to the differences mentioned in other answers, there is also a speed difference. In this benchmark, which uses an empty dict, d = {} is over twice as fast:

python -m timeit -s "d = {}" "for i in xrange(500000): d.clear()"
10 loops, best of 3: 127 msec per loop

python -m timeit -s "d = {}" "for i in xrange(500000): d = {}"
10 loops, best of 3: 53.6 msec per loop
odano
    This isn't really a valid speed test for all cases since the dict is empty. I think making a large dict (or at least some content) would yield a much smaller performance difference...plus I suspect the garbage collector might add a little of its own hurt to d = {} (?) – Rafe Jul 03 '13 at 16:43
  • 5
    @Rafe: I think the point is that if we know no other variable refers to dictionary d, then setting `d = {}` should be faster, since cleaning up the old dict can be left to the garbage collector for later. – ViFI Nov 23 '16 at 22:37
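To try the comparison the comments suggest, with a populated dict rather than an empty one, a Python 3 sketch could look like the following. The function names and the dict size of 1000 items are assumptions picked for illustration, and no timings are quoted here because they depend on the machine and interpreter version.

```python
import timeit

N_ITEMS = 1000  # assumed size, chosen only for illustration


def clear_in_place():
    d = {i: i for i in range(N_ITEMS)}
    d.clear()   # empties the existing dict object
    return d


def rebind_name():
    d = {i: i for i in range(N_ITEMS)}
    d = {}      # binds the name to a new dict; the old one becomes garbage
    return d


if __name__ == "__main__":
    for fn in (clear_in_place, rebind_name):
        seconds = timeit.timeit(fn, number=10_000)
        print(f"{fn.__name__}: {seconds:.3f} s")
```

Both functions include the cost of building the dict, so only the difference between the two numbers is meaningful.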
8

As an illustration for the things already mentioned before:

>>> a = {1:2}
>>> id(a)
3073677212L
>>> a.clear()
>>> id(a)
3073677212L
>>> a = {}
>>> id(a)
3073675716L
maxp
7

In addition to @odano's answer: if you clear and refill the dict many times, d.clear() appears to be slightly faster than d = {}.

import timeit

p1 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d = {}
    for i in xrange(1000):
        d[i] = i * i
'''

p2 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d.clear()
    for i in xrange(1000):
        d[i] = i * i
'''

print timeit.timeit(p1, number=1000)
print timeit.timeit(p2, number=1000)

The result is:

20.0367929935
19.6444659233
lastland
7

Mutating methods are useful when the original variable is not in scope, for example inside a function that receives the dictionary as an argument:

def fun(d):
    d.clear()
    d["b"] = 2

d={"a": 2}
fun(d)
d          # {'b': 2}

Re-assigning the dictionary would create a new object and wouldn't modify the original one.
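For contrast, here is a small sketch of both variants side by side. The function names rebind and mutate are hypothetical, chosen only for this illustration.

```python
def rebind(d):
    # Rebinds only the local name `d`; the caller's dict is untouched.
    d = {}
    d["b"] = 2


def mutate(d):
    # Changes the very object the caller passed in.
    d.clear()
    d["b"] = 2


original = {"a": 2}

rebind(original)
after_rebind = dict(original)   # {'a': 2}

mutate(original)
after_mutate = dict(original)   # {'b': 2}

print(after_rebind, after_mutate)
```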

Karoly Horvath
4

One thing not mentioned is scoping issues. Not a great example, but here's the case where I ran into the problem:

from functools import wraps

def conf_decorator(dec):
    """Enables behavior like this:
        @threaded
        def f(): ...

        or

        @threaded(thread=KThread)
        def f(): ...

        (assuming threaded is wrapped with this function.)
        Sends any accumulated kwargs to threaded.
        """
    c_kwargs = {}
    @wraps(dec)
    def wrapped(f=None, **kwargs):
        if f:
            r = dec(f, **c_kwargs)
            c_kwargs = {}
            return r
        else:
            c_kwargs.update(kwargs) #<- UnboundLocalError: local variable 'c_kwargs' referenced before assignment
            return wrapped
    return wrapped

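The assignment c_kwargs = {} inside wrapped makes c_kwargs a local variable of wrapped for the whole function body, so any read of it before that assignment raises UnboundLocalError. The solution is to replace c_kwargs = {} with c_kwargs.clear(), which empties the enclosing dict without rebinding the name.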
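A sketch of the decorator with that one-line change applied. The tagged decorator and the functions it decorates are hypothetical stand-ins for threaded, added only so the example can be run.

```python
from functools import wraps


def conf_decorator(dec):
    c_kwargs = {}

    @wraps(dec)
    def wrapped(f=None, **kwargs):
        if f:
            r = dec(f, **c_kwargs)
            c_kwargs.clear()   # mutates the enclosing dict; the name is never rebound
            return r
        else:
            c_kwargs.update(kwargs)
            return wrapped
    return wrapped


# Hypothetical decorator used only to exercise conf_decorator.
@conf_decorator
def tagged(f, tag="default"):
    f.tag = tag
    return f


@tagged
def plain():
    pass


@tagged(tag="custom")
def configured():
    pass


print(plain.tag, configured.tag)
```

Because wrapped only reads and mutates c_kwargs, Python treats it as a variable of the enclosing function and the closure works as intended.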

If someone thinks up a more practical example, feel free to edit this post.

Ponkadoodle
  • `global c_kwargs` would probably also work no? Although probably `global` isn't the best thing to be using a lot of. – fantabolous Jul 04 '14 at 05:45
  • 3
    @fantabolous using `global` would make the function behave differently - all calls to conf_decorator would then share the same c_kwargs variable. I believe Python 3 added the `nonlocal` keyword to address this issue, and that would work. – Ponkadoodle Jul 04 '14 at 05:54
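For reference, a minimal Python 3 sketch of how nonlocal allows rebinding a variable of the enclosing function. The make_collector function and its helpers are hypothetical and unrelated to the decorator above.

```python
def make_collector():
    items = {}

    def add(key, value):
        items[key] = value   # mutation only: no declaration needed

    def reset():
        nonlocal items       # without this, `items = {}` would create a local
        items = {}

    def snapshot():
        return dict(items)

    return add, reset, snapshot


add, reset, snapshot = make_collector()
add("a", 1)
before_reset = snapshot()
reset()
add("b", 2)
after_reset = snapshot()

print(before_reset, after_reset)
```

All three inner functions share the same closure variable, so add sees the new dict after reset rebinds it.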
4

In addition, sometimes the dict instance might be a subclass of dict (defaultdict for example). In that case, using clear is preferred, as we don't have to remember the exact type of the dict, and also avoid duplicate code (coupling the clearing line with the initialization line).

from collections import defaultdict

x = defaultdict(list)
x[1].append(2)
...
x.clear() # instead of the longer x = defaultdict(list)
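A short sketch of what goes wrong if the subclass is forgotten; the variable names are arbitrary.

```python
from collections import defaultdict

x = defaultdict(list)
x[1].append(2)
x.clear()
x[3].append(4)            # still a defaultdict(list), so missing keys work

y = defaultdict(list)
y[1].append(2)
y = {}                    # now a plain dict; the default factory is gone
try:
    y[3].append(4)
    rebind_kept_factory = True
except KeyError:
    rebind_kept_factory = False

print(type(x).__name__, type(y).__name__, rebind_kept_factory)
```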
Tzach