0

Given a Python dict of the form:

dict = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258, ......}

Is there an easy way to print the first x keys with the highest numeric values? That is, say:

Beth   9102
Cecil  3258

Currently this is my attempt:

max = 0
max_word = ""
for key, value in w.word_counts.iteritems():
    if value > max:
        if key not in stop_words:
            max = value
            max_word = key

print max_word
jscs
Superdooperhero
  • 1
    possible duplicate of [Python: Sort a dictionary by value](http://stackoverflow.com/questions/613183/python-sort-a-dictionary-by-value) – Zero Piraeus May 26 '14 at 22:52
  • 1
    You might consider using a [`Counter`](https://docs.python.org/2/library/collections.html#collections.Counter) instead of a dictionary initially. Then you have `word_counts.most_common(x)` – jscs May 26 '14 at 22:55

6 Answers

7

I'd simply sort the items by value in descending order and then take the first K elements:

d_items = sorted(d.items(), key=lambda x: -x[1])
print d_items[:2]
[('Beth', 9102), ('Cecil', 3258)]

The complexity of this approach is O(N log N + K), not that different from optimal O(N + K log K) (using QuickSelect and sorting just the first K elements).

Danstahr
5

Using collections.Counter.most_common:

>>> from collections import Counter
>>> d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}
>>> c = Counter(d)
>>> c.most_common(2)
[('Beth', 9102), ('Cecil', 3258)]

Called without an argument, most_common() sorts all items with sorted (O(n*log n)). Given k, it uses heapq.nlargest(k, ...), which might be faster than sorted if k << n, and which falls back to max() if k == 1.
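For a plain dict that is not already a Counter, the same idea can be sketched by calling heapq.nlargest directly. This is a minimal illustration rather than the exact code inside Counter; the helper name top_k is made up for this example.

```python
import heapq

d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}

def top_k(counts, k):
    """Return the k (key, value) pairs with the largest values, highest first.

    top_k is a hypothetical helper name used only for this sketch.
    """
    # nlargest compares items by the value stored at index 1 of each pair.
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])

top_two = top_k(d, 2)
for name, value in top_two:
    # Left-align the key in a 6-character column, as in the question.
    print('{0:<6} {1}'.format(name, value))
```

This avoids sorting the whole dict, which only starts to matter when k is much smaller than the number of keys.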

jfs
3
>>> sorted(dict.items(), key=lambda x: x[1], reverse=True)[:2]
[('Beth', 9102), ('Cecil', 3258)]
Shan Valleru
1
items = sorted(w.word_counts.items(), key=lambda item: item[1], reverse=True)
items[:5]

Replace 5 with the number of elements you want to get.

Lachezar
1
d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}

vs = sorted(d, key=d.get, reverse=True)

l = [(x, d.get(x)) for x in vs[0:2]]
In [4]: l
Out[4]: [('Beth', 9102), ('Cecil', 3258)]
Padraic Cunningham
0

Convert the dict to a list of tuples [(2341, 'Alice'), ...], then sort it (no key=lambda ... needed). Pass reverse=True to get the highest values first.
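A minimal sketch of that idea, using the sample dict from the question. The variable names pairs and top_two are chosen just for this illustration.

```python
d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}

# Put the value first so that plain tuple comparison orders by value.
pairs = [(value, key) for key, value in d.items()]

# Sort in descending order; no key function is needed.
pairs.sort(reverse=True)

# Keep the two highest-valued entries.
top_two = pairs[:2]
print(top_two)
```

Note that when two values are equal, tuple comparison falls back to comparing the keys, so ties come out in reverse key order.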

furas