110

My attempt to programmatically create a dictionary of lists is failing to allow me to individually address dictionary keys. Whenever I create the dictionary of lists and try to append to one key, all of them are updated. Here's a very simple test case:

data = {}
data = data.fromkeys(range(2),[])
data[1].append('hello')
print data

Actual result: {0: ['hello'], 1: ['hello']}

Expected result: {0: [], 1: ['hello']}

Here's what works:

data = {0:[],1:[]}
data[1].append('hello')
print data

Actual and Expected Result: {0: [], 1: ['hello']}

Why is the fromkeys method not working as expected?

Martijn Pieters
Martin Burch

7 Answers

133

Passing [] as second argument to dict.fromkeys() gives a rather useless result – all values in the dictionary will be the same list object.

In Python 2.7 or above, you can use a dictionary comprehension instead:

data = {k: [] for k in range(2)}

In earlier versions of Python, you can use

data = dict((k, []) for k in range(2))
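
A minimal sketch of the difference, written for Python 3 (so `print` is a function, unlike in the question's Python 2 code). The names `shared` and `separate` are illustrative only and do not come from the question:

```python
# dict.fromkeys() stores the very same list object under every key.
shared = dict.fromkeys(range(2), [])
shared[1].append('hello')
print(shared)                      # {0: ['hello'], 1: ['hello']}
print(shared[0] is shared[1])      # True

# The comprehension evaluates [] once per key, so each key gets its own list.
separate = {k: [] for k in range(2)}
separate[1].append('hello')
print(separate)                    # {0: [], 1: ['hello']}
print(separate[0] is separate[1])  # False
```

The `is` checks make the aliasing visible directly, without having to compare `id()` values by eye.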
Sven Marnach
  • 3
    Well this is rather unintuitive behavior, any idea on why the same object is used for all keys? – Bar Apr 13 '16 at 19:05
  • 5
    @Bar Because there is nothing else the function could do within the semantics of the Python language. You pass in a single object to be used as value for all keys, so that single object is used for all keys. It would be better for the `fromkeys()` method to accept a factory function instead, so we could pass in `list` as a function, and that function would be called once for each key created, but that's not the actual API of `dict.fromkeys()`. – Sven Marnach Apr 13 '16 at 20:30
  • 3
    This is not intuitive at all. This took me an hour to find. Thanks – Astrid Mar 12 '19 at 13:38
  • 1
    Same thing happens if you pass dict() as a second argument. Very baffling behavior. – Orly May 20 '20 at 12:14
  • @Orly This is because first one empty dictionary is created, and then a reference to it is passed to all initializations. – Dr_Zaszuś May 28 '20 at 08:01
  • is there a way to do this if you wish to use some strings as the keys? for example `my_dict = {'string1': [], 'string2': []}` – KevOMalley743 Jul 29 '21 at 09:10
  • 2
    @KevOMalley743 `{"string1": [], "string2": []}` looks like perfectly fine Python code, so I don't quite understand what problem you are asking about. – Sven Marnach Jul 29 '21 at 11:27
  • @SvenMarnach you're right, but I was asking in the case where the number of lists required is long enough to make it handy to call them from a list or some other structure. – KevOMalley743 Jul 29 '21 at 12:12
  • 1
    @KevOMalley743 Are you looking for something like `keys = [...]; my_dict = {k: [] for k in keys}`? – Sven Marnach Jul 30 '21 at 08:08
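
The factory-function idea raised in these comments can be sketched in Python 3 as a small helper. Note that `fromkeys_with_factory` is a hypothetical name introduced here for illustration; it is not part of the `dict` API, and the key names are made up:

```python
def fromkeys_with_factory(keys, factory):
    """Hypothetical helper (not a dict method): call factory() once per key."""
    return {k: factory() for k in keys}


# String keys taken from an existing list, as asked about in the comments.
keys = ['string1', 'string2', 'string3']
my_dict = fromkeys_with_factory(keys, list)

my_dict['string1'].append('hello')
print(my_dict)  # {'string1': ['hello'], 'string2': [], 'string3': []}
```

Because `list` is passed uncalled, a new empty list is built for each key rather than one list being reused.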
111

Use defaultdict instead:

from collections import defaultdict
data = defaultdict(list)
data[1].append('hello')

This way you don't have to initialize all the keys you want to use to lists beforehand.

What is happening in your example is that you use one (mutable) list:

alist = [1]
data = dict.fromkeys(range(2), alist)
alist.append(2)
print data

would output {0: [1, 2], 1: [1, 2]}.
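
A short Python 3 sketch of how `defaultdict(list)` behaves, assuming nothing beyond the standard library. It shows that keys appear only once they are accessed, and that each one gets its own list:

```python
from collections import defaultdict

data = defaultdict(list)      # list is the factory, called for each missing key
data[1].append('hello')
print(dict(data))             # {1: ['hello']}
print(0 in data)              # False: key 0 has not been accessed yet

# Merely reading a missing key creates it, with its own fresh list.
data[0]
print(dict(data))             # {1: ['hello'], 0: []}
print(data[0] is data[1])     # False
```

If every key has to exist before the rest of the program runs, a dict comprehension is probably the more direct tool.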

Martijn Pieters
  • 2
    In my case, I need to initialize all the keys beforehand so the rest of the program logic can work as expected, but this would be a good solution otherwise. Thanks. – Martin Burch Jul 16 '12 at 18:06
  • 1
    I guess what is missing from this answer is saying that this solution works, as opposed to that of the OP, because `list` here is not an empty list, but a type (or you can see it as a callable constructor, I guess). So every time a missing key is passed, a new list is created instead of re-using the same one. – Dr_Zaszuś May 28 '20 at 08:09
45

You could use a dict comprehension:

>>> keys = ['a','b','c']
>>> value = [0, 0]
>>> {key: list(value) for key in keys}
    {'a': [0, 0], 'b': [0, 0], 'c': [0, 0]}
Blender
  • 2
    `list(value)` is the same thing as `value[:]` here? – yurisich Apr 20 '14 at 01:56
  • 1
    @Droogans: Yep. I just find the empty slice notation ugly. – Blender Apr 20 '14 at 02:51
  • `value[:]` isn't _that_ ugly (unless you share Alex Martelli's aesthetic sense :) ), and it's less typing. In recent versions of Python there's now a `list.copy` method. In terms of performance, slicing is fastest for small lists (up to 50 or 60 items), but for larger lists `list(value)` is actually a little faster. `value.copy()` seems to have similar performance to `list(value)`. All 3 techniques dramatically slow down for large lists: on my old 32 bit machine that happens around 32k, YMMV depending on your CPU's word size and cache sizes. – PM 2Ring Jul 14 '18 at 08:25
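
The three copying spellings mentioned in these comments can be compared side by side. This is a Python 3 sketch with illustrative variable names. It makes no performance claims and only checks that each spelling yields an equal but distinct list:

```python
value = [0, 0]

by_constructor = list(value)
by_slice = value[:]
by_method = value.copy()      # list.copy() exists in Python 3.3+

# All three are equal to the original but are distinct objects.
copies = [by_constructor, by_slice, by_method]
print(all(c == value for c in copies))   # True
print(any(c is value for c in copies))   # False

# The copies are shallow: nested mutable items would still be shared.
nested = [[0], [0]]
shallow = list(nested)
print(shallow[0] is nested[0])           # True
```

For lists of immutable items such as `[0, 0]`, a shallow copy is enough. For nested lists, `copy.deepcopy` from the standard library would be needed to get fully independent values.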
40

This answer explains the behavior for anyone flummoxed by the results of instantiating a dict with fromkeys() and a mutable default value.

Consider:

#Python 3.4.3 (default, Nov 17 2016, 01:08:31) 

# start by validating that different variables pointing to an
# empty mutable are indeed different references.
>>> l1 = []
>>> l2 = []
>>> id(l1)
140150323815176
>>> id(l2)
140150324024968

So any change to l1 will not affect l2, and vice versa. The same holds for any mutable object, including a dict.

# create a new dict from an iterable of keys
>>> dict1 = dict.fromkeys(['a', 'b', 'c'], [])
>>> dict1
{'c': [], 'b': [], 'a': []}

This can be a handy function. Here we assign each key a default value, which happens to be an empty list.

# the dict has its own id.
>>> id(dict1)
140150327601160

# but look at the ids of the values.
>>> id(dict1['a'])
140150323816328
>>> id(dict1['b'])
140150323816328
>>> id(dict1['c'])
140150323816328

Indeed they are all using the same ref! A change to one is a change to all, since they are in fact the same object!

>>> dict1['a'].append('apples')
>>> dict1
{'c': ['apples'], 'b': ['apples'], 'a': ['apples']}
>>> id(dict1['a'])
140150323816328
>>> id(dict1['b'])
140150323816328
>>> id(dict1['c'])
140150323816328

For many, this is not what was intended!

Now let's try making an explicit copy of the list being used as the default value.

>>> empty_list = []
>>> id(empty_list)
140150324169864

And now create a dict with a copy of empty_list.

>>> dict2 = dict.fromkeys(['a', 'b', 'c'], empty_list[:])
>>> id(dict2)
140150323831432
>>> id(dict2['a'])
140150327184328
>>> id(dict2['b'])
140150327184328
>>> id(dict2['c'])
140150327184328
>>> dict2['a'].append('apples')
>>> dict2
{'c': ['apples'], 'b': ['apples'], 'a': ['apples']}

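Still no joy! The copy is made only once, before fromkeys() is called, so every key still shares that single new list. I hear someone shout: it's because I used an empty list!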

>>> not_empty_list = [0]
>>> dict3 = dict.fromkeys(['a', 'b', 'c'], not_empty_list[:])
>>> dict3
{'c': [0], 'b': [0], 'a': [0]}
>>> dict3['a'].append('apples')
>>> dict3
{'c': [0, 'apples'], 'b': [0, 'apples'], 'a': [0, 'apples']}
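
The slice does not help because Python evaluates the arguments of a call before the call itself runs. A Python 3 sketch that spells out that step (the name `the_copy` is introduced here for illustration only):

```python
not_empty_list = [0]

# In effect, dict.fromkeys(keys, not_empty_list[:]) does the following:
the_copy = not_empty_list[:]                      # the copy is made exactly once...
dict3 = dict.fromkeys(['a', 'b', 'c'], the_copy)  # ...and then shared by all keys

print(the_copy is not_empty_list)              # False: it really is a copy
print(dict3['a'] is the_copy)                  # True
print(dict3['a'] is dict3['b'] is dict3['c'])  # True

dict3['a'].append('apples')
print(not_empty_list)                          # [0]: the original is untouched
```

So the copy protects the original list, but it does nothing to separate the dictionary's values from each other.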

The default behavior of fromkeys() is to assign None to the value.

>>> dict4 = dict.fromkeys(['a', 'b', 'c'])
>>> dict4
{'c': None, 'b': None, 'a': None}
>>> id(dict4['a'])
9901984
>>> id(dict4['b'])
9901984
>>> id(dict4['c'])
9901984

Indeed, all of the values are the same (and the only!) None. Now, let's iterate, in one of myriad ways, through the dict and change the value.

>>> for k, _ in dict4.items():
...    dict4[k] = []

>>> dict4
{'c': [], 'b': [], 'a': []}

Hmm. Looks the same as before!

>>> id(dict4['a'])
140150318876488
>>> id(dict4['b'])
140150324122824
>>> id(dict4['c'])
140150294277576
>>> dict4['a'].append('apples')
>>> dict4
{'c': [], 'b': [], 'a': ['apples']}

But they are indeed different []s, which was in this case the intended result.
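
To summarise the walkthrough in runnable form, here is a Python 3 sketch (the names `counts` and `lists` are illustrative). Sharing one default object only matters when that object is mutated in place. With an immutable default, assignment simply rebinds the one key:

```python
keys = ['a', 'b', 'c']

# Sharing is harmless for immutable values: += rebinds one key only.
counts = dict.fromkeys(keys, 0)
counts['a'] += 1
print(counts)                 # {'a': 1, 'b': 0, 'c': 0}

# For mutable values, build one fresh object per key instead.
lists = {k: [] for k in keys}
lists['a'].append('apples')
print(lists)                  # {'a': ['apples'], 'b': [], 'c': []}
print(len({id(v) for v in lists.values()}))  # 3 distinct list objects
```

The comprehension does in one step what the `dict4` loop above does in two.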

Shawn Mehan
11

You can use this:

l = ['a', 'b', 'c']
d = dict((k, [0, 0]) for k in l)
g.d.d.c
9

You are populating your dictionaries with references to a single list, so when you update it, the update is reflected across all the references. Try a dictionary comprehension instead. See "Create a dictionary with list comprehension in Python".

d = {k : v for k in blah blah blah}
Community
cobie
  • great suggestion on initializing dictionary values... thanks cobie! I extended your example to reset the values in an existing dictionary, d. I performed this as follows: d = { k:0 for k in d } – John Aug 21 '16 at 14:29
  • What is `v` in this answer? – Dr_Zaszuś May 28 '20 at 08:03
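
On the question of what `v` stands for: the answer leaves it undefined. One reading, assumed here, is that it is a placeholder for an expression that produces each value. The Python 3 sketch below uses illustrative names and shows why it matters whether that expression is written inline or bound to a name beforehand:

```python
keys = ['a', 'b', 'c']

# Pitfall: if v is a name bound once outside the comprehension,
# every key refers to that one list again.
v = []
shared = {k: v for k in keys}
print(shared['a'] is shared['b'])    # True

# Writing the expression inline creates a new list per key.
fresh = {k: [] for k in keys}
print(fresh['a'] is fresh['b'])      # False

# Resetting the values of an existing dictionary, as in the comment above.
fresh = {k: 0 for k in fresh}
print(fresh)                         # {'a': 0, 'b': 0, 'c': 0}
```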
-3

You could use this:

data[:1] = ['hello']
Jon B
Conner Dassen