9

I have the following list of strings and want to sort it by the number embedded in each element. Plain sorted fails because it compares the strings lexicographically, so it gets the order wrong for numbers such as 10 and 3. I could do it with re, but that is not very interesting. Does anyone have a nicer implementation idea? Assume Python 3.x for this code.

names = [
'Test-1.model',
'Test-4.model',
'Test-6.model',
'Test-8.model',
'Test-10.model',
'Test-20.model'
]
number_sorted = get_number_sorted(names)
print(number_sorted)
'Test-20.model'
'Test-10.model'
'Test-8.model'
'Test-6.model'
'Test-4.model'
'Test-1.model'
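
For reference, a small sketch of why plain sorted does not give the output above: strings are compared character by character, so '10' sorts before '4'. The variable name `plain` is only for illustration.

```python
names = ['Test-1.model', 'Test-4.model', 'Test-6.model',
         'Test-8.model', 'Test-10.model', 'Test-20.model']

# Lexicographic comparison: '1' < '2' < '4', so 10 and 20 land before 4.
plain = sorted(names, reverse=True)
print(plain)
print('10' < '3')  # string comparison, not numeric
```

With this list, 'Test-20.model' and 'Test-10.model' end up after 'Test-4.model' instead of at the front.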
jpp
jef

7 Answers

7

the key is ... the key

sorted(names, key=lambda x: int(x.partition('-')[2].partition('.')[0]))

The key function isolates the part of the string between '-' and '.' and converts it to an int, so the sort compares numbers instead of text.
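
To make the two partition calls easier to follow, here is a minimal sketch that does the same thing step by step. The helper name `number_of` is an assumption for illustration and is not part of the answer above.

```python
def number_of(name):
    # 'Test-10.model'.partition('-') -> ('Test', '-', '10.model')
    after_dash = name.partition('-')[2]
    # '10.model'.partition('.') -> ('10', '.', 'model')
    digits = after_dash.partition('.')[0]
    return int(digits)

names = ['Test-10.model', 'Test-4.model', 'Test-1.model', 'Test-20.model']
ascending = sorted(names, key=number_of)
descending = sorted(names, key=number_of, reverse=True)
print(ascending)
print(descending)
```

Passing reverse=True gives the descending order shown in the question.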

Back2Basics
4

Some alternatives:

(1) Slicing by position:

sorted(names, key=lambda x: int(x[5:-6]))

(2) Stripping substrings:

sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))

(3) Splitting characters (also possible via str.partition):

sorted(names, key=lambda x: int(x.split('-')[1].split('.')[0]))

(4) Map with np.argsort on any of (1)-(3):

import numpy as np
list(map(names.__getitem__, np.argsort([int(x[5:-6]) for x in names])))
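
A quick sketch to check that the options agree on a shuffled list. Option (4) is replaced here by a pure Python stand-in for argsort (my substitution, so the snippet runs without NumPy). Note that the slice in (1) assumes every name has exactly the prefix 'Test-' and the suffix '.model'.

```python
names = ['Test-10.model', 'Test-1.model', 'Test-20.model', 'Test-4.model']

by_slice = sorted(names, key=lambda x: int(x[5:-6]))
by_replace = sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))
by_split = sorted(names, key=lambda x: int(x.split('-')[1].split('.')[0]))

# Pure Python stand-in for np.argsort: indices that would sort the numbers.
numbers = [int(x[5:-6]) for x in names]
order = sorted(range(len(names)), key=numbers.__getitem__)
by_argsort = [names[i] for i in order]

print(by_slice)
print(by_slice == by_replace == by_split == by_argsort)
```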
jpp
  • Given the goal is to sort the original strings, not just get the sorted numbers, using a `key` function to perform the transform would make more sense (and avoid an unnecessary genexpr), e.g. for your first example, `sorted(names, key=lambda x: int(x[5:-6]))`, or for your second `sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))` – ShadowRanger Jan 24 '18 at 02:08
  • @ShadowRanger, yep I realise this now. I have edited my answer. – jpp Jan 24 '18 at 02:12
  • I like the multiple options now. That is innovative. – Back2Basics Jan 24 '18 at 02:13
3

I found a solution myself in a similar question: Nonalphanumeric list order from os.listdir() in Python

import re
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
    return sorted(data, key=alphanum_key, reverse=True)
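
To illustrate how the key works, here is a small sketch of the same idea with the key written as a named function (my restructuring, not the original code). Each name is split into alternating text and digit runs, and the digit runs are compared as ints.

```python
import re

def alphanum_key(name):
    # Split into alternating text and digit runs; digit runs become ints.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'([0-9]+)', name)]

example_key = alphanum_key('Test-10.model')
print(example_key)

# Hypothetical names with different prefixes, to show the general case.
mixed = ['b-2.model', 'a-10.model', 'a-9.model', 'b-1.model']
natural = sorted(mixed, key=alphanum_key)
print(natural)
```

Unlike the slicing and splitting approaches, this does not depend on a fixed 'Test-' prefix, so it also handles names with several numbers or varying text.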
jef
2

You can use re.findall in the key of the sort function:

import re
names = [
 'Test-1.model',
 'Test-4.model',
 'Test-6.model',
 'Test-8.model',
 'Test-10.model',
 'Test-20.model'
]
final_data = sorted(names, key=lambda x: int(re.findall(r'(?<=Test-)\d+', x)[0]), reverse=True)

Output:

['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']
Ajax1234
1

Here is a regex based approach. We can extract the test number from the string, cast to int, and then sort by that.

import re

def grp(txt): 
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    if s:
        return int(s.group(1))
    else:
        return float('-inf')  # Sorts non-matching strings ahead of matching strings

names.sort(key=grp)
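
As a sketch of the fallback branch, the example below adds a hypothetical non-matching name ('README.txt') and a lower-case variant to show where they end up. The function is a condensed copy of grp so the snippet runs on its own.

```python
import re

def grp(txt):
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    # Non-matching names get -inf, which compares below every int.
    return int(s.group(1)) if s else float('-inf')

names = ['Test-10.model', 'README.txt', 'test-2.model', 'Test-1.model']
names.sort(key=grp)  # sorts in place; list.sort itself returns None
print(names)
```

Because list.sort works in place, the result must be read from names rather than from the return value of the call.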
Tim Biegeleisen
  • This still sorts string style (lexicographically), not numerically. You'd want to return `int(s.group(1))` in the first case, and some filler numerical value (e.g. `float('-inf')` to put strings not matching the pattern at the front of the resulting `list`), not `str`, in the `else` case. – ShadowRanger Jan 24 '18 at 02:10
  • @ShadowRanger No, even [making those changes](http://rextester.com/TKKP68517) still doesn't fix it. I don't know Python, by the way. Feel free to edit this. – Tim Biegeleisen Jan 24 '18 at 02:11
  • @TimBiegeleisen: `list.sort` runs in place and returns `None` (which means "has no return value"). Your test code reassigns `names` to `None` by assigning the result of `names.sort`, which is why it breaks. I removed the `names = ` from `names = names.sort(key=lambda l: grp(l))` (and simplified to `names.sort(key=grp)`; no `lambda` wrapper needed since `grp` already has the correct prototype) and [it works fine](http://rextester.com/IQNKL87746). – ShadowRanger Jan 24 '18 at 02:15
1

def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

and then do something like

 sorted(names, key=lambda x: int(find_between(x, 'Test-', '.model')))
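
One caveat: find_between returns "" when a marker is missing, and int("") raises ValueError. A minimal sketch of one way to guard against that follows; the wrapper `number_key` and the choice to sort such names first with -1 are assumptions for illustration, not part of the answer above.

```python
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

def number_key(name):
    # Hypothetical wrapper: names without a number sort first instead of raising.
    digits = find_between(name, 'Test-', '.model')
    return int(digits) if digits.isdigit() else -1

names = ['Test-10.model', 'notes.txt', 'Test-2.model']
result = sorted(names, key=number_key)
print(result)
```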
Claudiordgz
1

You can use the key parameter along with sorted() to accomplish this, assuming each string is formatted the same way:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]))

It looks like you might want your list reverse sorted (?), in which case you can add reverse=True as such:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]), reverse=True)
number_sorted = get_number_sorted(names)
print(number_sorted)
['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']

See related: Key Functions

x1084