47

I was wondering what would be a Pythonic way of sorting a list of tuples by two keys whereby sorting with one (and only one) key would be in a reverse order and sorting with the the other would be case insensitive. More specifically, I have a list containing tuples like:

myList = [(ele1A, ele2A),(ele1B, ele2B),(ele1C, ele2C)]

I can use the following code to sort it with two keys:

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]))

To sort in reverse order I can use

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]), reverse = True)

But this would sort in a reverse order with two keys.

martineau
  • 112,593
  • 23
  • 157
  • 280
  • The special case (all keys should be sorted in the same order) is [python - Sort a list by multiple attributes? - Stack Overflow](https://stackoverflow.com/questions/4233476/sort-a-list-by-multiple-attributes) -- although it also have [some comment](https://stackoverflow.com/questions/4233476/sort-a-list-by-multiple-attributes#comment42711635_4233482) explaining how to sort in different order. – user202729 Aug 14 '21 at 13:35

6 Answers6

56

Two keys will be used when we need to sort a list with two constraints: one in ascending order and the other in descending, in the same list or any

In your example,

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]))

you can sort entire list only in one order.

You can try these and check what's happening:

sortedList = sorted(myList, key = lambda y: (y[0].lower(), -y[1]))
sortedList = sorted(myList, key = lambda y: (-y[0].lower(), y[1]))
sortedList = sorted(myList, key = lambda y: (-y[0].lower(), -y[1]))
wjandrea
  • 23,210
  • 7
  • 49
  • 68
Venkat Ramana
  • 783
  • 6
  • 13
44

You could create a reversor class and use it to decorate the key in question. This class could be used to reverse any field that is comparable.

class reversor:
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
           return other.obj < self.obj

Use it like so:

sortedList = sorted(myList, key=lambda y: (y[0].lower(), reversor(y[1])))
black panda
  • 2,674
  • 1
  • 18
  • 27
  • 7
    This solution can be used with strings or other objects. Its a cleaner solution than the one that's tagged as best solution. – Tim Givois Sep 26 '19 at 00:28
  • Better decorate this with `functools.total_ordering`: https://docs.python.org/3/library/functools.html#functools.total_ordering – kaya3 Dec 19 '19 at 12:19
  • 4
    Total ordering is unnecessary for use as a key parameter in the sorted function. You only need == and – black panda Dec 19 '19 at 15:57
  • Definitely the best answer all around. For n00bs, if you are dealing with list of dicts, use `y[]`. For list of objects use `y.`. – Timothy C. Quinn May 02 '22 at 01:31
  • 1
    what's with all the rigamarole around `None`? Seems that shouldn't be dealt with at all in this method. – juanpa.arrivillaga May 02 '22 at 04:12
  • If the data has a None, you will get an error like TypeError: '>' not supported between instances of 'NoneType' and ''. That None check gets around that issue and handles them gracefully. – Timothy C. Quinn May 02 '22 at 16:13
  • There is no reason to work around None for the general purpose. The 'sorted' method itself does not accept None, for the very reason that None is not comparable. That is a special case situation that has nothing to do with the question or the answer, in my opinion. – black panda May 05 '22 at 03:26
4

Sometimes there is little alternative but to use a comparator function. There was a cmp argument to sorted from its introduction to 2.4, but it was removed from Python 3 in favour of the more efficient key function. In 3.2, cmp_to_key was added to functools; it creates keys from the original objects by wrapping them in an object whose comparison function is based on the cmp function. (You can see a simple definition of cmp_to_key at the end of the Sorting How-To

In your case, since lower-casing is relatively expensive, you might want to do a combination:

class case_insensitive_and_2nd_reversed:
    def __init__(self, obj, *args):
        self.first = obj[0].lower()
        self.second = obj[1]
    def __lt__(self, other):
        return self.first < other.first or self.first == other.first and other.second < self.second
    def __gt__(self, other):
        return self.first > other.first or self.first == other.first and other.second > self.second
    def __le__(self, other):
        return self.first < other.first or self.first == other.first and other.second <= self.second
    def __ge__(self, other):
        return self.first > other.first or self.first == other.first and other.second >= self.second
    def __eq__(self, other):
        return self.first == other.first and self.second == other.second
    def __ne__(self, other):
        return self.first != other.first and self.second != other.second

sortedList = sorted(myList, key = case_insensitive_and_2nd_reversed)
Locane
  • 2,646
  • 2
  • 22
  • 32
rici
  • 219,119
  • 25
  • 219
  • 314
3

Method 1

A simple solution, but might not be the most efficient is to sort twice: the first time using the second element, the second using the first element:

sortedList = sorted(sorted(myList, key=lambda (a,b):b, reverse=True), key=lambda(a,b):a)

Or break down:

tempList = sorted(myList, key=lambda (a,b):b, reverse=True)
sortedList = sorted(tempList, key=lambda(a,b):a))

Method 2

If your elements are numbers, you can cheat a little:

sorted(myList, key=lambda(a,b):(a,1.0/b))

Method 3

I recommend against this approach as it is messy and the cmp keyword is not available in Python 3.

Another approach is to swap the elements when comparing the elements:

def compare_func(x, y):
    tup1 = (x[0], y[1])
    tup2 = (x[1], y[0])
    if tup1 == tup2:
        return 0
    elif tup1 > tup2:
        return 1
    else:
        return -1

sortedList = sorted(myList, cmp=compare_func)

Or, using lambda to avoid writing function:

sortedList = sorted(
    myList,
    cmp=lambda (a1, b1), (a2, b2): 0 if (a1, b2) == (a2, b1) else 1 if (a1, b2) > (a2, b1) else -1
    )
wjandrea
  • 23,210
  • 7
  • 49
  • 68
Hai Vu
  • 34,189
  • 11
  • 57
  • 87
1

maybe elegant but not efficient way:

reverse_key = functools.cmp_to_key(lambda a, b: (a < b) - (a > b))
sortedList = sorted(myList, key = lambda y: (reverse_key(y[0].lower()), y[1]))
Yankai Zhang
  • 131
  • 2
  • 9
1

When using Python 3, @KellyBundy made an excellent observation that the multisort method listed in the current python docs is incredibly fast and be used to accomplish multi-colum sort with discrete ordering. Here is a NoneType safe version:

students = [
     {'idx': 0, 'name': 'john', 'grade': 'A', 'attend': 100}
    ,{'idx': 1, 'name': 'jane', 'grade': 'B', 'attend': 80}
    ,{'idx': 2, 'name': 'dave', 'grade': 'B', 'attend': 85}
    ,{'idx': 3, 'name': 'stu' , 'grade': None, 'attend': 85}
]

def key_grade(student):
    grade = student['grade']
    return grade is None, grade
def key_attend(student):
    attend = student['attend']
    return attend is None, attend
students_sorted = sorted(students, key=key_attend)
students_sorted.sort(key=key_grade, reverse=True)

Notes:

  • The <variable> is None, check is defensive check so that search does not fail on None values
  • Although, this does multiple sorted calls, it is hands down the fastest multi-sort method!

I have created a new Python Project called multisort which exposes three methodologies:

Method Descr Notes speed
multisort Simple one-liner designed after multisort example from python docs Second fastest of the bunch but most configurable and easy to read. 0.0035
cmp_func Multi column sorting in the model java.util.Comparator Reasonable speed 0.0138
reversor Implementation of reversor - See Black Panda's answer Pretty slow methodology 0.0370

For reference:

Method speed
KellyBundy's Multisort 0.0005
pandas 0.0079

Note: Speed is average of 10 runs for 1000 rows with 4 columns.

Example of multisort from the multisort library :

from multisort import multisort
rows_sorted = multisort(rows_dict, [
        ('grade', True, lambda s:None if s is None else s.upper()),
        'attend',
], reverse=True)

However, for developers who come in from Java, here is an example that is similar to java.util.Comparator for use in Python 3:

from multisort import cmp_func

def cmp_student(a,b):
    k='grade'; va=a[k]; vb=b[k]
    if va != vb:
        if va is None: return -1
        if vb is None: return 1
        return -1 if va > vb else 1
    k='attend'; va=a[k]; vb=b[k]; 
    if va != vb:
        return -1 if va < vb else 1
    return 0

students_sorted = sorted(students, key=cmp_func(cmp_student))
Timothy C. Quinn
  • 2,756
  • 1
  • 30
  • 39
  • How much slower was the method from Python's Sorting HOW TO? – Kelly Bundy May 02 '22 at 05:04
  • Please provide the example from Python docs you are mentioning and I'll add it to my tests. – Timothy C. Quinn May 02 '22 at 05:44
  • https://docs.python.org/3/howto/sorting.html#sort-stability-and-complex-sorts – Kelly Bundy May 02 '22 at 05:51
  • None of the examples provided there can handle None in values so they cannot be used for real world data unless you cleanse the data first so it has no None values. However, I did write up a test for the multisort. I'll add it to my gits tests. – Timothy C. Quinn May 02 '22 at 06:19
  • Am I mistaken, or does [msorted](https://github.com/JavaScriptDude/multisort/blob/5aa62aff909fb492df7ac0da7cbf3b28d33bd059/src/multisort/multisort.py#L82-L87) only do *one* `sorted` instead of multiple like Python's HOW TO does? – Kelly Bundy May 04 '22 at 23:08
  • And I just tried it with actual multiple sorts and it's about [4.6 times **faster**](https://ideone.com/9gmBHc) than your method (building `students` with code from your project). – Kelly Bundy May 04 '22 at 23:27
  • @KellyBundy - Interesting, I'll go do some testing with that methodology in mind. Can you raise an issue in the `multisort` repo with your test code? Appreciate it. – Timothy C. Quinn May 04 '22 at 23:29
  • 5.7 times faster with [different keys](https://ideone.com/WyfYU3). – Kelly Bundy May 04 '22 at 23:38
  • [6.5 times faster](https://ideone.com/JKJRyv) :-). I also tried this optimization with yours, but that actually slowed it down further. – Kelly Bundy May 04 '22 at 23:52
  • Oh I just saw the request to raise an issue in the repo. Not really in the mood to do that. Feel free to use my code to do it yourself / however you want. – Kelly Bundy May 04 '22 at 23:57
  • Got your code. Definitely smokin' fast. Thanks! – Timothy C. Quinn May 05 '22 at 00:02