Accessing values in a list of nested dictionaries

Question

My dictionary looks like this:

docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
             1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
             ...
             29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
            }

I want to sum the values of the inner dicts in each list and store the totals in a new dict keyed by document, e.g. {0: 2.3+8.7+4.1+1.7, 1: 1.2+5.2+3.8+0.8, ...}. Something like:

for x in docScores[0]: #{0:
    for x in docScores[0][0].values(): #{,2.3}.
        sum = sum+x #where sum = 0 before loop
        docSum[0] = sum
    repeat this loop for every document

Any variation that I have tried is giving me unexpected outputs. Can anyone give me the correct syntax for this?

Can you post what you expect the result to be for this? Do you want `{0: 2.3+8.7+4.1+1.7, 1: 1.2+5.2+3.8+0.8, ... 29: 1.5+2.8+3.7+1.6}`? — mgilson, Aug 03 '12 at 14:12
This calls for one amazing list comprehension... It's a pity that I don't have time to write it now. — BrtH, Aug 03 '12 at 14:13
Why the list of `dict`s, each holding a single element? That looks like a terrible waste of memory. — Fred Foo, Aug 03 '12 at 14:15
@larsmans I'm still learning python and dicts have been my biggest stumbling block. I'm working on a large program and to get to the point I'm at now it was the only way I could get the values as they are. What would you recommend as an alternative that I could look into? — adohertyd, Aug 03 '12 at 14:17
@adohertyd: instead of `[{u'word':2.3},{u'the':8.7}]`, I would suggest `{u'word':2.3, u'the':8.7}`. I.e., a single `dict` per document. — Fred Foo, Aug 03 '12 at 14:20
@larsmans is *exactly* right. What's more, instead of each integer having its own key in the dictionary, just have it as a list. So instead of a dict of lists of dicts, you'll just have a list of dicts! — David Robinson, Aug 03 '12 at 14:26

dawg · Accepted Answer · 2012-08-03T15:33:20.697

This dict comprehension works:

docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
             1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
             29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
            }

sum_d={k:sum(d.values()[0] for d in v) for k,v in docScores.items()}

print sum_d

Prints:

{0: 16.8, 1: 11.0, 29: 9.6}

However, changing your data structure may be easier. You could have a dict of dicts:

>>> NdocScores = {0:{u'word':2.3,u'the':8.7,u'if':4.1,u'Car':1.7},
...              1:{u'friend':1.2,u'a':5.2,u'you':3.8,u'person':0.8},
...              29:{u'yard':1.5,u'gardening':2.8,u'paint':3.7,u'brush':1.6}
...             }

This allows each doc's data to be accessed directly:

>>> NdocScores[0]
{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}
>>> NdocScores[0][u'Car']
1.7
>>> sum(NdocScores[1].values())
11.0

>>> NdocScores[29]
{u'gardening': 2.8, u'yard': 1.5, u'brush': 1.6, u'paint': 3.7}
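
As a rough sketch of how the original structure could be converted into this dict-of-dicts form (Python 3 syntax; `merged` and `totals` are illustrative names, and the sketch assumes no word repeats within a document):

```python
doc_scores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# Merge each document's single-entry dicts into one dict per document.
# If a word appeared twice in a document, the later score would
# overwrite the earlier one.
merged = {
    k: {word: score for d in v for word, score in d.items()}
    for k, v in doc_scores.items()
}

# Per-document totals are then a plain sum over each inner dict's values.
totals = {k: sum(scores.values()) for k, scores in merged.items()}

print(merged[0])
print(totals)
```

After the merge, a single score can be looked up as `merged[0]['Car']` without walking a list.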

Or, just have a list of dicts with the position in the list corresponding to the doc index:

>>> lofdicts=[v for k,v in sorted(NdocScores.items())]
>>> lofdicts
[{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}, {u'a': 5.2, u'person': 0.8, u'you': 3.8, u'friend': 1.2}, {u'gardening': 2.8, u'yard': 1.5, u'brush': 1.6, u'paint': 3.7}]
>>> lofdicts[0]
{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}
>>> sum(lofdicts[1].values())
11.0

Note that this solution only works on Python 2.7. Prior to 2.7, dict comprehensions didn't exist. After 2.7 (that is, in Python 3), `dict.values()` no longer returns an indexable object and `dict.iteritems()` no longer exists. — mgilson, Aug 03 '12 at 14:35
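
Following up on the version note above, here is a minimal sketch of how the same comprehension might look on Python 3. It is an adaptation rather than the answer's original code, and it assumes each inner dict holds exactly one entry:

```python
doc_scores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# dict.values() returns a view in Python 3, so take its first item with
# next(iter(...)) instead of indexing it with [0].
sum_d = {
    k: sum(next(iter(d.values())) for d in v)
    for k, v in doc_scores.items()
}

print(sum_d)
```

If an inner dict could hold more than one entry, `sum(d.values())` would be the safer inner expression, since `next(iter(...))` only reads the first value.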

mgilson · Answer 2 · 2012-08-03T14:33:35.820

new_dict={}

docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
             1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
             29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
            }

for k,v in docScores.items():
    new_dict[k]=sum( sum(d.values()) for d in v )

print (new_dict) #{0: 16.8, 1: 11.0, 29: 9.6}

As others have mentioned, you could make this into a dictionary comprehension (python 2.7+):

new_dict = {k : sum( sum(d.values()) for d in v ) for k,v in docScores.items() }

But at this point I think that the comprehension is getting very difficult to comprehend (and therefore I wouldn't do it).
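
If the nested comprehension feels hard to read, one option is to give the inner step a name. A minimal sketch for Python 3, where `doc_total` is a hypothetical helper name and not something from this answer:

```python
doc_scores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}


def doc_total(entries):
    """Sum every score in a list of {word: score} dicts."""
    return sum(score for d in entries for score in d.values())


# The comprehension now reads as "key -> total for that document".
new_dict = {k: doc_total(v) for k, v in doc_scores.items()}

print(new_dict)
```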

Also, someone should probably point out that if all your dictionary keys are sequential integers starting from 0 and going to 29, you probably shouldn't be using a dictionary to store this data -- maybe a list would be more appropriate ...

EDIT

using a list:

new_list = [sum( sum(d.values()) for d in v ) for _,v in sorted(docScores.items()) ]
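
One caveat worth sketching (Python 3 syntax; `keys_in_order` is an illustrative name): the list position matches the document number only when the keys run 0..n-1 with no gaps. With the three-key sample used here, the document keyed 29 ends up at index 2:

```python
doc_scores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# Sort by key so the list order follows the document numbers.
new_list = [
    sum(sum(d.values()) for d in v)
    for _, v in sorted(doc_scores.items())
]

# Keep the sorted keys alongside to map a list position back to its
# document number when the keys are not contiguous.
keys_in_order = sorted(doc_scores)

print(keys_in_order)  # [0, 1, 29]
print(new_list)
```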

edited Aug 03 '12 at 14:33

answered Aug 03 '12 at 14:14

mgilson


I'm using Python 2.7 if that's going to make a difference? – adohertyd Aug 03 '12 at 14:18
@adohertyd -- I've updated my answer to be python2 and python3 compatible. – mgilson Aug 03 '12 at 14:20
@adohertyd -- one new update `sum(d[kk] for kk in d)` was stupid -- I don't know why I used that ... `sum(d.values())` is much better. – mgilson Aug 03 '12 at 14:21
thank you works great. How would I transform this to a list instead as you recommend? I'm more comfortable working with lists anyway – adohertyd Aug 03 '12 at 14:25
@adohertyd: You should generate the original data as a list rather than create it as a dict and transform it. However, `[docScores[i] for i in range(0, max(docScores.keys()) + 1)]` would do the trick. – David Robinson Aug 03 '12 at 14:27
@adohertyd -- I've updated the solution to use a list. Note to transform a dict with your structure to a list, `[v for _,v in sorted(your_dict.items())]` will do the trick. – mgilson Aug 03 '12 at 14:30

jamylak · Answer 3 · 2012-08-03T14:23:19.980

>>> doc_scores = {
...     0: [{u'word': 2.3}, {u'the': 8.7}, {u'if': 4.1}, {u'Car': 1.7}],
...     1: [{u'friend': 1.2}, {u'a': 5.2}, {u'you': 3.8}, {u'person': 0.8}],
...     29: [{u'yard': 1.5}, {u'gardening': 2.8}, {u'paint': 3.7}, {u'brush': 1.6}]
... }
>>> dict((k, sum(n for d in v for n in d.itervalues()))
...      for k, v in doc_scores.iteritems())
{0: 16.8, 1: 11.0, 29: 9.6}

If you only have one value in each of the dicts in the lists you can shorten this:

>>> dict((k, sum(d.values()[0] for d in v)) for k, v in doc_scores.iteritems())
{0: 16.8, 1: 11.0, 29: 9.6}

score 1 · Answer 4 · answered Aug 03 '12 at 14:20

One more one-line solution:

sum(reduce(lambda x, y: x+y, [d.values() for _,v in docScores.iteritems() for d in v]))

answered Aug 03 '12 at 14:20

Denis


Oops, I did not read the problem correctly: this code sums all the coefficients across the whole structure rather than per document. – Denis Aug 03 '12 at 14:22

score 0 · Answer 5 · answered Aug 03 '12 at 14:26

docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
             1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
             2:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
            }


result = dict(enumerate(sum(sum(word.values()) for word in word_list) for _, word_list in sorted(docScores.items())))
