25

Given two lists of dictionaries:

>>> lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
>>> lst2 = [{"id": 2, "x": "two"}, {"id": 3, "x": "three"}]
>>> merge_lists_of_dicts(lst1, lst2)  # merge two lists of dicts by the "id" key
[{"id": 1, "x": "one"}, {"id": 2, "x": "two"}, {"id": 3, "x": "three"}]

Is there a way to implement merge_lists_of_dicts so that it merges two lists of dictionaries based on the value of each dictionary's "id" key?

xiaohan2012

4 Answers

17

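Perhaps the simplest option:

result = list({x['id']: x for x in lst1 + lst2}.values())

This keeps exactly one dict per id; if the same id occurs more than once, the last occurrence wins. On Python 3.7+ the result follows the order in which each id first appears; on older versions the order is not guaranteed.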

If the lists are really big, a more realistic solution would be to sort them by id and merge iteratively.
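A minimal sketch of that sort-and-merge idea, using only the standard library. The function name merge_sorted_by_id and its key parameter are hypothetical, and it assumes every dict has the key and that the key values are mutually comparable. When an id appears in both lists, the entry from lst1 is kept.

```python
from heapq import merge
from itertools import groupby
from operator import itemgetter


def merge_sorted_by_id(lst1, lst2, key="id"):
    """Sort both lists by `key`, merge them, and keep one dict per key value."""
    get_key = itemgetter(key)
    # heapq.merge lazily interleaves inputs that are already sorted by the key.
    merged = merge(sorted(lst1, key=get_key), sorted(lst2, key=get_key), key=get_key)
    # groupby clusters consecutive items sharing a key; keep the first of each run.
    return [next(group) for _, group in groupby(merged, key=get_key)]


lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
lst2 = [{"id": 2, "x": "two"}, {"id": 3, "x": "three"}]
print(merge_sorted_by_id(lst1, lst2))
```

If the inputs are already sorted by id, the two sorted() calls can be dropped and the merge step runs in a single pass over both lists.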

georg
16
lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
lst2 = [{"id": 2, "x": "two"}, {"id": 3, "x": "three"}]

result = []
# Concatenate instead of calling lst1.extend(lst2), so lst1 is left unmodified.
for my_dict in lst1 + lst2:
    # Whole dicts are compared here, not just their "id" values.
    if my_dict not in result:
        result.append(my_dict)
print(result)

Output

[{'id': 1, 'x': 'one'}, {'id': 2, 'x': 'two'}, {'id': 3, 'x': 'three'}]
thefourtheye
    this one should be declared as an answer, how come 6 years later the author of the question didn't vote for this? – Suomynona Dec 05 '19 at 05:49
10

One possible way to define it:

lst1 + [x for x in lst2 if x not in lst1]
Out[24]: [{'id': 1, 'x': 'one'}, {'id': 2, 'x': 'two'}, {'id': 3, 'x': 'three'}]

Note that this compares whole dictionaries rather than ids. If lst2 contained {'id': 2, 'x': 'three'}, the result would keep both that and {'id': 2, 'x': 'two'}, as you did not define what should happen in that case.
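If the goal is one entry per id, the conflict rule has to be chosen explicitly. Here is a minimal sketch in which the first occurrence wins; the key parameter is an assumption added for illustration, and the order of the result relies on dicts preserving insertion order (Python 3.7+).

```python
def merge_lists_of_dicts(lst1, lst2, key="id"):
    """Merge two lists of dicts, keeping the first dict seen for each key value."""
    merged = {}
    for d in lst1 + lst2:
        # setdefault stores d only if this key value has not been seen yet.
        merged.setdefault(d[key], d)
    return list(merged.values())


lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
lst2 = [{"id": 2, "x": "three"}, {"id": 3, "x": "three"}]
print(merge_lists_of_dicts(lst1, lst2))
# [{'id': 1, 'x': 'one'}, {'id': 2, 'x': 'two'}, {'id': 3, 'x': 'three'}]
```

Replacing the setdefault call with merged[d[key]] = d would make the last occurrence win instead.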

Also note that the seemingly-equivalent and more appealing

set(lst1 + lst2)

will NOT work since dicts are not hashable.
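A set can still be used for the membership test if each dict is first converted to something hashable. The sketch below uses frozenset(d.items()) as a fingerprint; the function name merge_unique_dicts is hypothetical, and the approach assumes every value in the dicts is itself hashable (strings and numbers are, lists are not).

```python
def merge_unique_dicts(lst1, lst2):
    """Concatenate two lists of dicts, dropping exact duplicates in O(n) time."""
    seen = set()
    result = []
    for d in lst1 + lst2:
        # A frozenset of (key, value) pairs is hashable and ignores key order.
        fingerprint = frozenset(d.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(d)
    return result


lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
lst2 = [{"id": 2, "x": "two"}, {"id": 3, "x": "three"}]
print(merge_unique_dicts(lst1, lst2))
# [{'id': 1, 'x': 'one'}, {'id': 2, 'x': 'two'}, {'id': 3, 'x': 'three'}]
```

Like the list comprehension above, this deduplicates whole dicts, so two dicts with the same id but different x are both kept.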

roippi
3

BTW, you can use 'pandas' for such calculations:

>>> import pandas as pd
>>> 
>>> lst1 = [{"id": 1, "x": "one"}, {"id": 2, "x": "two"}]
>>> lst2 = [{"id": 2, "x": "two"}, {"id": 3, "x": "three"}]
>>> 
>>> lst1_df = pd.DataFrame(lst1)
>>> lst2_df = pd.DataFrame(lst2)
>>> lst_concat_df = pd.concat([lst1_df, lst2_df], ignore_index=True)
>>> # drop_duplicates() removes rows that match on every column ("id" and "x")
>>> lst_merged_df = lst_concat_df.drop_duplicates()
>>> print(lst_merged_df.to_dict('records'))

Output:

[{'id': 1, 'x': 'one'}, {'id': 2, 'x': 'two'}, {'id': 3, 'x': 'three'}]