find max value in a list of sets for an element at index 1 of sets

Question

I have a list like this:

dummy_list = [(8, 'N'),
 (4, 'Y'),
 (1, 'N'),
 (1, 'Y'),
 (3, 'N'),
 (4, 'Y'),
 (3, 'N'),
 (2, 'Y'),
 (1, 'N'),
 (2, 'Y'),
 (1, 'N')]

and would like to get the largest value in the first column of the tuples where the value in the second column is 'Y'.

How do I do this as efficiently as possible?

score 4 · Accepted Answer · answered Apr 29 '22 at 10:14

You can use the max function with a generator expression.

>>> dummy_list = [(8, 'N'),
...  (4, 'Y'),
...  (1, 'N'),
...  (1, 'Y'),
...  (3, 'N'),
...  (4, 'Y'),
...  (3, 'N'),
...  (2, 'Y'),
...  (1, 'N'),
...  (2, 'Y'),
...  (1, 'N')]
>>>
>>> max(first for first, second in dummy_list if second == 'Y')
4
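
One case the expression above does not cover is a list with no 'Y' entries, where max raises ValueError on the empty generator. A minimal sketch of one way to handle it, assuming None is an acceptable fallback; the helper name max_for_flag is made up for illustration and is not part of the original answer:

```python
def max_for_flag(pairs, flag='Y'):
    """Return the largest first element among pairs whose second element equals flag.

    Returns None when no pair matches (an assumed fallback value).
    """
    # default= is only consulted when the generator yields nothing
    return max((first for first, second in pairs if second == flag), default=None)


dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

print(max_for_flag(dummy_list))       # 4
print(max_for_flag(dummy_list, 'N'))  # 8
print(max_for_flag([(8, 'N')]))       # None
```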

score 1 · Answer 2 · answered Apr 29 '22 at 10:22

You can use pandas for this as the data you have resembles a table.

import pandas as pd

df = pd.DataFrame(dummy_list, columns = ["Col 1", "Col 2"]) 
val_y = df[df["Col 2"] == "Y"]
max_index = val_y["Col 1"].idxmax()

print(df.loc[max_index, :])

First you convert it into a pandas dataframe using pd.DataFrame and set the column name to Col 1 and Col 2.

Then you get all the rows inside the dataframe with Col 2 values equal to Y.

Once you have this data, just select Col 1 and apply the idxmax function on it to get the index of the maximum value for that series.

You can then pass this index inside the loc function as the row and : (every) as the column to get the whole row.

This can be compressed into two lines:

max_index = df[df["Col 2"] == "Y"]["Col 1"].idxmax()
df.loc[max_index, :]

Output -

Col 1    4
Col 2    Y
Name: 1, dtype: object

score 0 · Answer 3 · answered Apr 29 '22 at 10:16

max([i[0] for i in dummy_list if i[1] == 'Y'])

TDT

The `[ ] ` are not needed here – LinFelix Apr 29 '22 at 10:18

A short explanation might be helpful here as well. – BrokenBenchmark Apr 30 '22 at 05:11

score 0 · Answer 4 · answered Apr 29 '22 at 10:24

max([i for i in dummy_list if i[1] == 'Y'])

output: (4, 'Y')

or


max(filter(lambda x: x[1] == 'Y', dummy_list))

output: (4, 'Y')
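
To illustrate what filter does in the snippet above: it lazily yields only the items for which the predicate returns a true value, much like a generator expression with an if clause. The sketch below assumes the whole pair with the largest number is wanted; using operator.itemgetter(0) as the key is an addition here, so that the comparison looks only at the number rather than at the full tuple:

```python
from operator import itemgetter

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# filter() returns a lazy iterator over the pairs whose flag is 'Y'
y_pairs = filter(lambda pair: pair[1] == 'Y', dummy_list)

# The same selection written as a generator expression
y_pairs_gen = (pair for pair in dummy_list if pair[1] == 'Y')

# key=itemgetter(0) compares pairs by their first element only
best = max(y_pairs, key=itemgetter(0))
best_gen = max(y_pairs_gen, key=itemgetter(0))

print(best)      # (4, 'Y')
print(best_gen)  # (4, 'Y')
```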

Will

Can you explain what `filter` actually does? Is it like `map`? – Zero Apr 29 '22 at 11:28

cards · Answer 5 · 2022-05-01T11:59:17.843

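By passing a key function to max, no separate filtering pass is required. Note that key does not filter; it only ranks the items, so the key has to place the 'Y' pairs above the 'N' pairs and then order by value. If the list contains no 'Y' pair at all, this returns the largest 'N' value instead.

y_max = max(dummy_list, key=lambda p: (p[1] == 'Y', p[0]))[0]
print(y_max)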

By decoupling the pairs and grouping the values by their Y/N flag:

d = {}
for k, v in dummy_list:
    d.setdefault(v, []).append(k)

y_max = max(d['Y'])
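
The grouping above stores every value before taking the maximum. If only the maxima are needed, a variant can keep a running maximum per flag instead. A rough sketch, with the name max_by_flag chosen here for illustration:

```python
dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

max_by_flag = {}
for value, flag in dummy_list:
    # Store the value if the flag is new or the value beats the current maximum
    if flag not in max_by_flag or value > max_by_flag[flag]:
        max_by_flag[flag] = value

print(max_by_flag)       # {'N': 8, 'Y': 4}
print(max_by_flag['Y'])  # 4
```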

By unzipping the pairs with zip, one can apply a mask-like approach using itertools.compress:

import itertools as it

values, flags = zip(*dummy_list)
y_max = max(it.compress(values, map('Y'.__eq__, flags)))
print(y_max)
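
To make the mask idea more concrete, the sketch below spells out the intermediate steps: zip(*dummy_list) separates the numbers from the flags, and itertools.compress keeps each number whose corresponding mask entry is true. The names mask and selected are introduced here for illustration only:

```python
import itertools as it

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# Split the pairs into a tuple of numbers and a tuple of flags
values, flags = zip(*dummy_list)

# Boolean mask: True where the flag is 'Y'
mask = [flag == 'Y' for flag in flags]

# compress() yields the values whose mask entry is true
selected = list(it.compress(values, mask))

print(selected)       # [4, 1, 4, 2, 2]
print(max(selected))  # 4
```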

edited May 01 '22 at 11:59

answered Apr 29 '22 at 11:16

cards

Your answer is misleading: `key` does _not filter_, it's used to _sort_. Your approach leads therefore also to wrong results: Try it with `dummy_list = [(8, 'N'), (1, 'Y'), (4, 'Y')]` (result is `1`, not `4`). – Timus May 01 '22 at 09:42
I think you're making the same mistake in the first solution: `sorted` uses `key` to sort via the result of `key`. So the only thing `sorted` does is shifting the tuples in 2 blocks (first `N`, then `Y`), without changing the relative order within the blocks? Result of your `sorted` is: `[(8, 'N'), (1, 'N'), (3, 'N'), (3, 'N'), (1, 'N'), (1, 'N'), (4, 'Y'), (1, 'Y'), (4, 'Y'), (2, 'Y'), (2, 'Y')]`. – Timus May 01 '22 at 11:34
@Timus Oh yes, you are completely right! I had many other possibilities in my head that I totally mixed-up the stuffs. (I think then a double sort will fix it... but, lets say, not very nice). Thanks for your patience – cards May 01 '22 at 11:39
