207

I would like to combine OrderedDict() and defaultdict() from collections in one object, which shall be an ordered, default dict.
Is this possible?

martineau
  • 112,593
  • 23
  • 157
  • 280
jlconlin
  • 13,130
  • 21
  • 67
  • 99
  • 5
    [I wonder why you can't just create a class that inherits `OrderedDict ` and `defaultdict`?](http://stackoverflow.com/q/27712226/1484957a0) – drs Dec 30 '14 at 20:39
  • @drs see my answer below, which does exactly that: http://stackoverflow.com/a/35968897/1644561 – avyfain Mar 13 '16 at 10:00
  • 3
    Even though you've already accepted a solution, you might want to check-out the somewhat simpler `OrderedDefaultdict` class I wrote for this [answer](https://stackoverflow.com/questions/4126348/how-do-i-rewrite-this-function-to-implement-ordereddict/4127426#4127426). – martineau May 31 '11 at 21:39
  • 1
    I understand that from Python 3.7 onwards the insertion order is maintained for anything that inherits from the regular `dict` - that includes the `defaultdict`. – Peter Kilczuk May 03 '20 at 21:11

9 Answers9

94

The following (using a modified version of this recipe) works for me:

from collections import OrderedDict, Callable

class DefaultOrderedDict(OrderedDict):
    # Source: http://stackoverflow.com/a/6190500/562769
    def __init__(self, default_factory=None, *a, **kw):
        if (default_factory is not None and
           not isinstance(default_factory, Callable)):
            raise TypeError('first argument must be callable')
        OrderedDict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return OrderedDict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        if self.default_factory is None:
            args = tuple()
        else:
            args = self.default_factory,
        return type(self), args, None, None, self.items()

    def copy(self):
        return self.__copy__()

    def __copy__(self):
        return type(self)(self.default_factory, self)

    def __deepcopy__(self, memo):
        import copy
        return type(self)(self.default_factory,
                          copy.deepcopy(self.items()))

    def __repr__(self):
        return 'OrderedDefaultDict(%s, %s)' % (self.default_factory,
                                               OrderedDict.__repr__(self))
Martin Thoma
  • 108,021
  • 142
  • 552
  • 849
zeekay
  • 49,858
  • 13
  • 107
  • 105
  • 2
    Deleted my answer, which was similar in thought process but designed on the fly (and hence needed to implement various other functions). – dr jimbob May 31 '11 at 16:23
  • 3
    @Neil G: You probably should just use the built-in `callable()` function to test `default_factory`. Using `isinstance(default_factory, Callable)` actually requires it to have more than just callability -- see the [docs](http://docs.python.org/library/collections.html?highlight=callable#collections.Callable) -- which is all that's is needed here. – martineau Jun 17 '12 at 17:29
  • @martineau: You're right. I believe `callable` was removed in Python 3.1 and then reinstated in Python 3.2, and I hadn't upgraded yet when I made this edit. Feel free to make the change. – Neil G Jun 17 '12 at 17:45
  • 1
    @Neil G: Actually `callable()` was first removed in Python 3.0 and then brought back in Python 3.2. Anyway, consider changing it yourself if you wish (I like my own answer better anyway ;-). I generally tend to shy away from just hopping in and changing someone else's answer, preferring instead to only make comments as I've done here. – martineau Jun 17 '12 at 18:41
  • 4
    @zeekay: I think you might need to change `self.items()` into `iter(self.items())` inside `__reduce__`. Otherwise, `PicklingError` exception is raised complaining that fifth argument of the `__reduce__` must be an iterator. – max Jul 30 '12 at 21:03
  • @max: Instead of changing it to `iter(self.items())` it would be simpler to use `self.iteritems()`. Also, the type of error you get from not doing either depends on whether you're using the `pickle` or `cPickle` module. With `pickle`, the error message is `AttributeError: 'list' object has no attribute 'next'` for the same reason (because the list object passed isn't an iterator). FWIW, the first element of the returned tuple could also be changed from `type(self)` to a simpler `self.__class__`. – martineau Oct 22 '13 at 19:18
  • 1
    When I `copy.deepcopy()` an instance of this object, I get a maximum recursion depth exception. In `DefaultOrderedDict.__deepcopy__`, my quick fix is to change the argument `copy.deepcopy(self.items())` to `copy.deepcopy(tuple(self.items())`. – chfoo Aug 18 '14 at 01:29
46

Here is another possibility, inspired by Raymond Hettinger's super() Considered Super, tested on Python 2.7.X and 3.4.X:

from collections import OrderedDict, defaultdict

class OrderedDefaultDict(OrderedDict, defaultdict):
    def __init__(self, default_factory=None, *args, **kwargs):
        #in python3 you can omit the args to super
        super(OrderedDefaultDict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory

If you check out the class's MRO (aka, help(OrderedDefaultDict)), you'll see this:

class OrderedDefaultDict(collections.OrderedDict, collections.defaultdict)
 |  Method resolution order:
 |      OrderedDefaultDict
 |      collections.OrderedDict
 |      collections.defaultdict
 |      __builtin__.dict
 |      __builtin__.object

meaning that when an instance of OrderedDefaultDict is initialized, it defers to the OrderedDict's init, but this one in turn will call the defaultdict's methods before calling __builtin__.dict, which is precisely what we want.

avyfain
  • 804
  • 10
  • 17
  • 21
    This answer, despite its elegance and simplicity, doesn't work in Python3. Since both OrderedDict and defaultdict are implemented in C, you get a TypeError, "multiple bases have instance lay-out conflict." That's because the C classes have differing, and incompatible, ideas of how to lay out the internal data structures. The accepted answer above works well in Python3, with a few tiny changes (super().__getitem__(... instead of OrderedDict.__getitem_(... ). I'm using Python3.5. – ivanlan May 19 '16 at 16:10
  • 3
    Interesting, this works correctly in Python 3.4.3 Is there any way to see where the TypeError is coming from in the C code? – avyfain May 25 '16 at 20:12
  • 3
    so beautiful. a shame it doesn't work in Python 3. – noɥʇʎԀʎzɐɹƆ Aug 06 '16 at 18:15
  • 12
    As of Python 3.6 this will be unnecessary, as all `dicts`, and therefore all `defaultdicts`, will be ordered. I am ok with it not working on 3.5 ;) – avyfain Dec 21 '16 at 01:13
  • 17
    Though `dicts` in CPython 3.6 preserve order, it is an implementation detail not to be relied upon, see http://stackoverflow.com/a/39980548/91243. Use `OrderedDict` if that is what you want. – amjoconn Feb 15 '17 at 19:07
  • It's an implementation detail, but if it’s there to support an official feature of the language now in the spec: ordered `**kwargs`. The quote mentioned in @amjoconn’s comment deserves to be more complete, [read release notes on 3.6 in full](https://docs.python.org/3/whatsnew/3.6.html#new-dict-implementation). Note how frustratingly dissembling the docs are: per _letter_ this “shouldn’t be relied upon”, then in parens it “may” be codified in the future, and if you ask me the _spirit_ of it is “it’s an awesome feature and it’s already in PyPy, just let’s wait before adding this to the spec”. – Anton Strogonoff Jun 02 '17 at 03:46
  • From videos, the consensus from PyCon seems to be moving toward making the ordering standard: https://www.youtube.com/watch?v=66P5FMkWoVU. Nothing official yet. – amjoconn Jul 18 '17 at 14:42
  • 12
    It's now offical Guido approved it. – Fruch Dec 20 '17 at 21:28
32

If you want a simple solution that doesn't require a class, you can just use OrderedDict.setdefault(key, default=None) or OrderedDict.get(key, default=None). If you only get / set from a few places, say in a loop, you can easily just setdefault.

totals = collections.OrderedDict()

for i, x in some_generator():
    totals[i] = totals.get(i, 0) + x

It is even easier for lists with setdefault:

agglomerate = collections.OrderedDict()

for i, x in some_generator():
    agglomerate.setdefault(i, []).append(x)

But if you use it more than a few times, it is probably better to set up a class, like in the other answers.

Artyer
  • 20,910
  • 3
  • 38
  • 60
29

Here's another solution to think about if your use case is simple like mine and you don't necessarily want to add the complexity of a DefaultOrderedDict class implementation to your code.

from collections import OrderedDict

keys = ['a', 'b', 'c']
items = [(key, None) for key in keys]
od = OrderedDict(items)

(None is my desired default value.)

Note that this solution won't work if one of your requirements is to dynamically insert new keys with the default value. A tradeoff of simplicity.

Update 3/13/17 - I learned of a convenience function for this use case. Same as above but you can omit the line items = ... and just:

od = OrderedDict.fromkeys(keys)

Output:

OrderedDict([('a', None), ('b', None), ('c', None)])

And if your keys are single characters, you can just pass one string:

OrderedDict.fromkeys('abc')

This has the same output as the two examples above.

You can also pass a default value as the second arg to OrderedDict.fromkeys(...).

Taylor D. Edmiston
  • 10,549
  • 5
  • 49
  • 72
  • 2
    Thank you! `od = OrderedDict((k, None) for k in iterable)` – n8henrie Jun 03 '16 at 15:53
  • 1
    This assumes your keys are predefined in some iterable though, so downstream objects would need to be aware that adding a new key requires an initial value. To be more precise, you couldn't assume an initial value for something like: `>>> od = OrderedDefaultDict(int) >>> od['foo'] += 100 OrderedDefaultDict([('foo', 100)])` This case would be correctly handled by a solution like [this one](http://stackoverflow.com/a/35968897/1644561). – avyfain Oct 13 '16 at 22:42
  • @avyfain That's correct. For my use case, it was just the initial data so future inserts of keys not previously defined wasn't relevant. I'll add a note to make the assumption explicit. – Taylor D. Edmiston Oct 13 '16 at 22:45
10

Another simple approach would be to use dictionary get method

>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> d['key'] = d.get('key', 0) + 1
>>> d['key'] = d.get('key', 0) + 1
>>> d
OrderedDict([('key', 2)])
>>> 
snebel29
  • 169
  • 2
  • 5
7

A simpler version of @zeekay 's answer is:

from collections import OrderedDict

class OrderedDefaultListDict(OrderedDict): #name according to default
    def __missing__(self, key):
        self[key] = value = [] #change to whatever default you want
        return value
Neck Beard
  • 333
  • 5
  • 8
7

A simple and elegant solution building on @NickBread. Has a slightly different API to set the factory, but good defaults are always nice to have.

class OrderedDefaultDict(OrderedDict):
    factory = list

    def __missing__(self, key):
        self[key] = value = self.factory()
        return value
F Pereira
  • 967
  • 9
  • 8
-2

Inspired by other answers on this thread, you can use something like,

from collections import OrderedDict

class OrderedDefaultDict(OrderedDict):
    def __missing__(self, key):
        value = OrderedDefaultDict()
        self[key] = value
        return value

I would like to know if there're any downsides of initializing another object of the same class in the missing method.

Samarth
  • 109
  • 1
  • 1
  • 4
  • 2
    This is an ordered dict where the default value is always another ordered dict. Not really what the question was about. – Ivan Ivanov Jan 16 '19 at 13:38
-3

i tested the default dict and discovered it's also sorted! maybe it was just a coincidence but anyway you can use the sorted function:

sorted(s.items())

i think it's simpler

Ortal Turgeman
  • 133
  • 3
  • 12