279

Assuming connectionDetails is a Python dictionary, what's the best, most elegant, most "pythonic" way of refactoring code like this?

if "host" in connectionDetails:
    host = connectionDetails["host"]
else:
    host = someDefaultValue
martineau
mnowotka

8 Answers

400

Like this:

host = connectionDetails.get('host', someDefaultValue)
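A minimal sketch of how `get` behaves. The values in `connectionDetails` and `someDefaultValue` are made up for illustration; only the names come from the question. Note that the default is used only when the key is absent, not when the stored value happens to be `None`.

```python
# Hypothetical sample data: the names mirror the question, the values are assumed.
connectionDetails = {"port": 8080, "user": None}
someDefaultValue = "localhost"

host = connectionDetails.get("host", someDefaultValue)  # key missing: default is returned
port = connectionDetails.get("port", 80)                # key present: stored value wins
user = connectionDetails.get("user", "guest")           # key present with None: None is returned
timeout = connectionDetails.get("timeout")              # no default given: None is returned

print(host, port, user, timeout)
```

`get` never modifies the dictionary, so `"host"` is still absent from `connectionDetails` afterwards.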
Solomon Ucko
MattH
  • 56
    Note that the second argument is a value, not a key. – Marcin Feb 20 '12 at 09:49
  • 7
    +1 for readability, but `if/else` is much faster. That might or might not play a role. – Tim Pietzcker Jul 06 '13 at 09:20
  • 11
    @Tim, Can you provide a reference as to why `if/else` is faster? – nishantjr Oct 27 '14 at 05:58
  • @nishantjr: Did you see my answer below? – Tim Pietzcker Oct 27 '14 at 07:14
  • Just did. I find this very interesting. Can't understand why they would be different - I'd expect get to be more optimized. Thanks – nishantjr Oct 27 '14 at 07:21
  • @nishantjr: The method call itself is quite expensive, and it has to handle the default argument. – Tim Pietzcker Oct 27 '14 at 07:24
  • 4
    @Tim: I had assumed that one of the advantages of using a higher level language is that the interpreter would be able to 'see' inside the functions and optimize it - that the user wouldn't have to deal with micro-optimizations as much. Isn't that what things like JIT compilation are for? – nishantjr Oct 27 '14 at 07:32
  • 4
    @nishantjr: Python (at least CPython, the most common variant) doesn't have JIT compilation. PyPy might indeed solve this faster, but I haven't got that installed since standard Python has always been fast enough for my purposes so far. In general, it's unlikely to matter in real life - if you need to do time-critical number crunching, Python probably is not the language of choice... – Tim Pietzcker Oct 27 '14 at 07:42
  • 1
    @Tim: Thanks for taking the time to explain. – nishantjr Oct 31 '14 at 10:06
  • 1
    The if/else example is looking up the key in the dictionary twice, while the default example might only be doing one lookup. Besides dictionary lookups potentially being costly in more extreme cases (where you probably shouldn't use Python to begin with), dictionary lookups are function calls too. But I am seeing that the if/else takes about 1/3 less time with my test using string keys and int values in Python 3.4, and I'm not sure why. – sudo Mar 11 '17 at 21:22
132

You can also use the defaultdict like so:

from collections import defaultdict
a = defaultdict(lambda: "default", key="some_value")
a["blabla"]  # => "default"
a["key"]     # => "some_value"

You can pass any ordinary function instead of lambda:

from collections import defaultdict
def a():
    return 4

b = defaultdict(a, key="some_value")
b['absent']  # => 4
b['key']     # => "some_value"
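One difference from `get` is worth seeing in code: looking up a missing key with square brackets on a `defaultdict` stores the generated default in the dictionary, while `.get()` does not call the factory at all. A small sketch, with the variable name `settings` chosen only for this example:

```python
from collections import defaultdict

# Illustrative data; `settings` is a hypothetical name.
settings = defaultdict(lambda: "default", key="some_value")

before = "blabla" in settings        # False: the key is not there yet
value = settings["blabla"]           # calls the factory and stores the result
after = "blabla" in settings         # True: the lookup inserted the key

got = settings.get("other")          # None: .get() bypasses the factory
other_inserted = "other" in settings # False: nothing was stored

print(before, value, after, got, other_inserted)
```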
xpmatteo
tamerlaha
  • 11
    I came here for some different problem than the OP's question, and your solution exactly solves it. – 0xc0de Dec 17 '15 at 06:27
  • I would +1 it but sadly it doesn't fit in with `get` or similar methods. – 0xc0de Dec 17 '15 at 09:20
  • This answer was useful to me for ensuring additions to a dictionary included default keys. My implementation is a little too long to describe in a StackOverflow answer, so I wrote about it here. https://persagen.com/2020/03/05/python_dictionaries_default_values_immutable_keys.html – Victoria Stuart Mar 06 '20 at 03:11
32

While .get() is a nice idiom, it's slower than if/else (and slower than try/except if presence of the key in the dictionary can be expected most of the time):

>>> import timeit
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.07691968797894333
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.4583777282275605
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(1, 10)")
0.17784020746671558
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(2, 10)")
0.17952161730158878
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.10071221458065338
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.06966537335119938
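To repeat these measurements on your own interpreter, the six statements can be collected into one script. This is only a sketch: the labels, the `run_benchmarks` helper and the iteration count are assumptions made for this example, and absolute timings will differ between machines and Python versions.

```python
import timeit

SETUP = "d = {1: 2, 3: 4, 5: 6, 7: 8, 9: 0}"

# Same statements as in the session above; the labels are added for readability.
STATEMENTS = {
    "try/except, key present": "try:\n a = d[1]\nexcept KeyError:\n a = 10",
    "try/except, key missing": "try:\n a = d[2]\nexcept KeyError:\n a = 10",
    "get, key present": "a = d.get(1, 10)",
    "get, key missing": "a = d.get(2, 10)",
    "if/else, key present": "if 1 in d:\n a = d[1]\nelse:\n a = 10",
    "if/else, key missing": "if 2 in d:\n a = d[2]\nelse:\n a = 10",
}


def run_benchmarks(number=100_000):
    """Time each statement `number` times and return {label: seconds}."""
    return {
        label: timeit.timeit(stmt, setup=SETUP, number=number)
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for label, seconds in run_benchmarks().items():
        print(f"{label:26s} {seconds:.4f} s")
```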
Tim Pietzcker
  • 4
    I still don't see _why_ `if/then` would be faster. Both cases require a dictionary lookup, and unless the invocation of `get()` is _so_ much slower, what else accounts for the slowdown? – Jens Mar 13 '15 at 21:33
  • 2
    @Jens: Function calls are expensive. – Tim Pietzcker Mar 13 '15 at 21:35
  • 4
    Which shouldn't be a big deal in a heavily populated dictionary, correct? Meaning the function call is not going to matter much if the actual lookup is costly. It probably only matters in toy examples. – AturSams May 14 '15 at 10:56
  • 2
    @zehelvion: Dictionary lookup is `O(1)` regardless of dictionary size, so the function call overhead is relevant. – Tim Pietzcker May 14 '15 at 11:42
  • 1
    @TimPietzcker Isn't O(1) an idealization that assumes no collisions? Is this a safe assumption for large dictionaries? – irh Jul 05 '15 at 17:59
  • 1
    @irh: Yes, it assumes no collisions. The [amortized worst case](https://wiki.python.org/moin/TimeComplexity) is `O(n)`, but in practice it's rather difficult to construct a dictionary with hash collisions. – Tim Pietzcker Jul 05 '15 at 18:21
  • 46
    it is bizarre if the overhead of calling a function would make you decide against using get. Use what your fellow team members can read best. – Jochen Bedersdorfer Apr 04 '16 at 20:51
  • 2
    Great analysis Tim, it looks like ternary if versions (`a=d[1] if 1 in d else 10` and `a=d[2] if 2 in d else 10`) have the same performance characteristics as the traditional if statement, with Python 3.5.1 at least. – Mark Booth Apr 26 '16 at 21:17
  • 1
    You can improve the speed of the `.get` method a little by caching it, but `.get` is also slow because it catches the `KeyError`; I presume it does that at the C level, but it's still slower than `if...else` if the `KeyError` is likely to be raised more than 10% of the time. See [here](http://stackoverflow.com/a/35451912/4014959) for some `timeit` comparisons between `in` and `.get`. – PM 2Ring Jun 21 '18 at 07:57
  • 1
    @JochenBedersdorfer "premature optimisation" syndrome :) It is good to know the relative performance of different forms to put in for when profiling shows an issue, along with a comment as to why get() isn't used so someone doesn't go replacing with a get() in the future. But otherwise, the easy to understand form should absolutely trump other considerations. – Nick May 04 '20 at 11:24
21

For multiple different defaults try this:

connectionDetails = { "host": "www.example.com" }
defaults = { "host": "127.0.0.1", "port": 8080 }

completeDetails = {}
completeDetails.update(defaults)
completeDetails.update(connectionDetails)
completeDetails["host"]  # ==> "www.example.com"
completeDetails["port"]  # ==> 8080
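Two other ways to express the same merge, sketched here with the same sample data. Dictionary unpacking (Python 3.5+) builds a new merged dict in one expression, and `collections.ChainMap` gives a layered view that searches the first mapping before falling back to the second, without copying anything. The names `merged` and `layered` are chosen only for this example.

```python
from collections import ChainMap

connectionDetails = {"host": "www.example.com"}
defaults = {"host": "127.0.0.1", "port": 8080}

# Later entries win, so user-supplied values override the defaults.
merged = {**defaults, **connectionDetails}

# ChainMap looks in connectionDetails first, then in defaults.
layered = ChainMap(connectionDetails, defaults)

print(merged["host"], merged["port"])
print(layered["host"], layered["port"])
```

Neither approach modifies `defaults`, so the same defaults dictionary can be reused for several connections.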
Jerome Baum
  • 3
    This is a good idiomatic solution, but there is a pitfall. Unexpected outcomes may result if connectionDetails is supplied with `None` or the empty string as one of the values in the key-value pairs. The `defaults` dictionary could potentially have one of its values unintentionally blanked out. (see also https://stackoverflow.com/questions/6354436) – dreftymac May 29 '17 at 18:05
14

This is not exactly what the question asked for, but Python dictionaries also have a method for this: dict.setdefault

    host = connectionDetails.setdefault('host', someDefaultValue)

However, unlike what the question asked for, this method also sets connectionDetails['host'] to someDefaultValue if the key host is not already defined.
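The difference between the two methods shows up in the dictionary itself after the call. A short sketch with assumed sample values:

```python
# Hypothetical sample data; only the names come from the question.
connectionDetails = {"port": 8080}
someDefaultValue = "localhost"

host_via_get = connectionDetails.get("host", someDefaultValue)
unchanged = "host" not in connectionDetails   # True: get() leaves the dict alone

host_via_setdefault = connectionDetails.setdefault("host", someDefaultValue)
mutated = "host" in connectionDetails         # True: setdefault() stored the default

print(host_via_get, unchanged, host_via_setdefault, mutated)
```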

Sriram
  • 4
    Note that `setdefault()` returns value, so this works as well: `host = connectionDetails.setdefault('host', someDefaultValue)`. Just beware that it will set `connectionDetails['host']` to default value if the key wasn't there before. – ash108 Oct 16 '16 at 18:41
9

(this is a late answer)

An alternative is to subclass the dict class and implement the __missing__() method, like this:

class ConnectionDetails(dict):
    def __missing__(self, key):
        if key == 'host':
            return "localhost"
        raise KeyError(key)

Examples:

>>> connection_details = ConnectionDetails(port=80)

>>> connection_details['host']
'localhost'

>>> connection_details['port']
80

>>> connection_details['password']
Traceback (most recent call last):
  File "python", line 1, in <module>
  File "python", line 6, in __missing__
KeyError: 'password'
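One caveat with this approach: `__missing__` is only consulted by square-bracket lookups. `.get()` and the `in` operator ignore it, so they still treat `'host'` as absent. The sketch below repeats the class from above so that it runs on its own:

```python
class ConnectionDetails(dict):
    def __missing__(self, key):
        # Called only by self[key] when the key is not stored in the dict.
        if key == "host":
            return "localhost"
        raise KeyError(key)


details = ConnectionDetails(port=80)

via_brackets = details["host"]   # 'localhost': __missing__ supplies the value
via_get = details.get("host")    # None: get() does not call __missing__
is_member = "host" in details    # False: nothing was stored under 'host'

print(via_brackets, via_get, is_member)
```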
Laurent LAPORTE
5

Testing @Tim Pietzcker's suspicion about the situation in PyPy (5.2.0-alpha0) for Python 3.3.5, I find that .get() and the if/else approach do indeed perform similarly. In fact, it seems that the if/else case does only a single lookup when the condition and the assignment involve the same key (compare with the last case, where there are two lookups).

>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.011889292989508249
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.07310474599944428
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(1, 10)")
0.010391917996457778
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(2, 10)")
0.009348208011942916
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.011475925013655797
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.009605801998986863
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=d[1]")
0.017342638995614834
Massimiliano Kraus
Till
2

You can use a lambda function for this as a one-liner. Make a new object connectionDetails2, which is called like a function:

connectionDetails2 = lambda k: connectionDetails[k] if k in connectionDetails else "DEFAULT"

Now use

connectionDetails2(k)

instead of

connectionDetails[k]

which returns the dictionary value if k is in the keys, otherwise it returns "DEFAULT"
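PEP 8 recommends a `def` statement over binding a lambda to a name, so the same idea might be written as a small named function. This is a sketch only: `lookup` and its `default` parameter are names invented for this example, and the sample dictionary is assumed.

```python
# Hypothetical sample data.
connectionDetails = {"host": "www.example.com"}


def lookup(key, default="DEFAULT"):
    """Return connectionDetails[key], or `default` if the key is absent."""
    return connectionDetails.get(key, default)


print(lookup("host"))
print(lookup("port"))
print(lookup("port", 8080))
```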

CasualScience