2

Am I able to define my own sort order and still use the Python sort function? For example, say my desired sorting order is:

list1 = ['Stella', 'Bob', 'Joe', 'Betty']

I'd hoped to do something like:

list2 = ['Joe', 'Stella']
sorted(list2, key=list1)

and get:

['Stella', 'Joe']

where the sorted order stays within the custom order in list1. I can do it in multiple steps by swapping out for numeric values but thought there may be a way to do it similar to above.

Selcuk
  • 52,758
  • 11
  • 94
  • 99
sobrio35
  • 133
  • 1
  • 8

4 Answers4

4

Note that for the example you have given you are wasting time sorting list2 as you can simply use the already sorted list1 if the items in list2 are unique:

print([e for e in list1 if e in list2])

should print

['Stella', 'Joe']

To improve it even more, you can use a set for faster lookups:

set2 = set(list2)
print([e for e in list1 if e in set2])

This second variant works in O(n) which is faster than any sorting algorithm.

Edit: The answers given by Nick and Jab are close to O(n*m + n*log n). This is a O(n * log n) solution that works for non-unique cases:

lookup_dict = {k: i for i, k in enumerate(list1)}
print(sorted(list2, key=lookup_dict.get))
Selcuk
  • 52,758
  • 11
  • 94
  • 99
  • Subject to the unique caveat this is a great solution. – Nick Jul 27 '20 at 01:52
  • @Selcuk, In my particular case, they won't always be unique, which is why I marked answer below. Nice method though in unique case. – sobrio35 Jul 27 '20 at 02:10
  • @sobrio35 Fair enough. I have added an alternative solution which should work in all cases for future reference. – Selcuk Jul 27 '20 at 02:47
  • I've shown an O(n) solution that works for non-unique cases – Mad Physicist Jul 27 '20 at 15:47
  • @MadPhysicist In OP's case `list2` contains non-unique elements. Your solution (and mine) won't work. – Selcuk Jul 27 '20 at 23:47
  • @HeapOverflow Sorry, that should have been `O(n^2 * log n)` because sorting is `O(n * log n)` and `list1.index` is O(n) (assuming that `len(list1) ~= len(list2)`). – Selcuk Jul 27 '20 at 23:49
  • @HeapOverflow The `key` argument is a function. The `sort` implementation must be calling it every time it needs to look up the index of an element. – Selcuk Jul 27 '20 at 23:53
  • @HeapOverflow Right, that makes sense. Even in that case it is O(n*m) for the key calculation plus O(n log n) for the actual sorting. – Selcuk Jul 27 '20 at 23:59
  • @HeapOverflow Only the Sith deal in absolutes. :) – Selcuk Jul 28 '20 at 00:20
  • @MadPhysicist My bad, hadn't seen that method. You are absolutely right. I guess you could've also used a list comprehension instead of extending it. – Selcuk Jul 28 '20 at 06:03
2

There's no need to use a lambda in the sorted function. Just supply it the index function of list1:

sorted(list2, key=list1.index)
Mad Physicist
  • 95,415
  • 23
  • 151
  • 231
Jab
  • 25,138
  • 21
  • 72
  • 111
2

If list1 and list2 are both guaranteed to contain only unique elements, you can turn list2 into a set and select from list1 based on it:

set2 = set(list2)
list2_sorted = [x for x in list1 if x in set2]

If, on the other hand list2 contains non-unique elements, you can use collections.Counter to do the same thing while maintaining optimal algorithmic complexity.

counter2 = Counter(list2)
list2_sorted = []
for x in list1:
    list2_sorted.extend([x] * counter2[x])

The algorithmic complexity is the same here because python's list growing is amortized to provide O(n) behavior in the long term, and the keys of a Counter, or any other dict, are implemented as a hash-table with O(1) lookup, just like a set.

You can use a nested comprehension instead of the loop:

list2_sorted = [x for x in list1 for _ in range(counter2[x])]

You can use the set-like behavior of dictionaries to maintain O(n) behavior even if list1 contains duplicates, as long as the sort order is determined only by the first occurrence. In Python 3.6+ dictionaries are ordered, so you could use plain dict. Prior versions would require you to use collections.OrderedDict to obtain the same behavior:

list1_keys = OrderedDict.fromkeys(list1)
list2_sorted = [x for x in list1_keys for _ in range(counter2[x])]

The reason to use a dict when you are looking for set-like behavior is that set is not ordered.

Mad Physicist
  • 95,415
  • 23
  • 151
  • 231
1

You can use a lambda function in sorted, taking list1.index() of the value to get a sort order:

list1 = ['Stella', 'Bob', 'Joe', 'Betty']

list2 = ['Joe', 'Stella']
sorted(list2, key=lambda v:list1.index(v))

Output:

['Stella', 'Joe']
Nick
  • 123,192
  • 20
  • 49
  • 81