I have a list like this:

dummy_list = [(8, 'N'),
 (4, 'Y'),
 (1, 'N'),
 (1, 'Y'),
 (3, 'N'),
 (4, 'Y'),
 (3, 'N'),
 (2, 'Y'),
 (1, 'N'),
 (2, 'Y'),
 (1, 'N')]

and would like to get the largest value in the first position of the tuples whose second position is 'Y'.

How do I do this as efficiently as possible?

Naveen Reddy Marthala

5 Answers

You can use the max function with a generator expression.

>>> dummy_list = [(8, 'N'),
...  (4, 'Y'),
...  (1, 'N'),
...  (1, 'Y'),
...  (3, 'N'),
...  (4, 'Y'),
...  (3, 'N'),
...  (2, 'Y'),
...  (1, 'N'),
...  (2, 'Y'),
...  (1, 'N')]
>>>
>>> max(first for first, second in dummy_list if second == 'Y')
4
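
One caveat with this approach: max raises a ValueError when no tuple has 'Y' in the second position. A minimal sketch, assuming None is an acceptable result in that case, uses the default argument of max (the helper name max_for_flag is hypothetical, not from the answer):

```python
def max_for_flag(pairs, flag='Y'):
    """Largest first element among pairs whose second element equals flag, or None."""
    # The generator filters lazily; default=None avoids a ValueError on no match.
    return max((first for first, second in pairs if second == flag), default=None)


dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

print(max_for_flag(dummy_list))    # 4
print(max_for_flag([(8, 'N')]))    # None
```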
Abdul Niyas P M

You can use pandas for this as the data you have resembles a table.

import pandas as pd

df = pd.DataFrame(dummy_list, columns = ["Col 1", "Col 2"]) 
val_y = df[df["Col 2"] == "Y"]
max_index = val_y["Col 1"].idxmax()

print(df.loc[max_index, :])

First you convert the list into a pandas DataFrame using pd.DataFrame and set the column names to Col 1 and Col 2.

Then you get all the rows inside the dataframe with Col 2 values equal to Y.

Once you have this data, just select Col 1 and apply the idxmax function on it to get the index of the maximum value for that series.

You can then pass this index inside the loc function as the row and : (every) as the column to get the whole row.

It can be compressed to two lines in this way,

max_index = df[df["Col 2"] == "Y"]["Col 1"].idxmax()
df.loc[max_index, :]

Output -

Col 1    4
Col 2    Y
Name: 1, dtype: object
Zero
max([i[0] for i in dummy_list if i[1] == 'Y'])
TDT

max([i for i in dummy_list if i[1] == 'Y'])

output: (4, 'Y')

or


max(filter(lambda x: x[1] == 'Y', dummy_list))

output: (4, 'Y')
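
Both variants return the whole tuple. They work because tuples compare element by element, and after filtering every second element is 'Y', so the first element decides. A small sketch that makes this explicit by comparing on the first element only, using operator.itemgetter as the key (the variable names y_pairs and best are illustrative):

```python
from operator import itemgetter

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# Keep only the 'Y' tuples, then compare them by their first element alone.
y_pairs = (pair for pair in dummy_list if pair[1] == 'Y')
best = max(y_pairs, key=itemgetter(0))

print(best)     # (4, 'Y')
print(best[0])  # 4
```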
Will

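Passing a key function to max lets a single pass over the list suffice. Note that key only ranks the elements, it does not filter them, so the key has to rank every 'Y' tuple above every 'N' tuple (if the list contains no 'Y' at all, this returns the largest 'N' value):

y_max = max(dummy_list, key=lambda p: (p[1] == 'Y', p[0]))[0]
print(y_max)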

By decoupling the pairs and grouping the values by their 'Y'/'N' flag:

d = {}
for k, v in dummy_list:
    d.setdefault(v, []).append(k)

y_max = max(d['Y'])
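
If only the maxima are needed, the intermediate lists can be avoided. A sketch of a single pass that keeps a running maximum per flag (the function name max_by_flag is hypothetical):

```python
def max_by_flag(pairs):
    """Map each flag to the largest value seen with it, in one pass."""
    best = {}
    for value, flag in pairs:
        # Store the value if the flag is new or the value beats the current best.
        if flag not in best or value > best[flag]:
            best[flag] = value
    return best


dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

result = max_by_flag(dummy_list)
print(result['Y'])  # 4
```

Looking up a flag that never occurs raises a KeyError here, just as d['Y'] would in the grouping version.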

By unzipping the pairs with zip, one can use a mask-like approach with itertools.compress:

import itertools as it

values, flags = zip(*dummy_list)
y_max = max(it.compress(values, map('Y'.__eq__, flags)))
print(y_max)
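
To see what the mask looks like, the intermediate steps can be printed for a shorter list. This is only an illustrative sketch: the three-element sample and the names mask and selected are not part of the answer, and itertools is imported as it.

```python
import itertools as it

sample = [(8, 'N'), (1, 'Y'), (4, 'Y')]

# Unzip the pairs into a tuple of values and a tuple of flags.
values, flags = zip(*sample)
# One boolean per flag: True where the flag equals 'Y'.
mask = list(map('Y'.__eq__, flags))
# compress keeps only the values whose mask entry is true.
selected = list(it.compress(values, mask))

print(values)         # (8, 1, 4)
print(mask)           # [False, True, True]
print(selected)       # [1, 4]
print(max(selected))  # 4
```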
cards
  • Your answer is misleading: `key` does _not filter_, it's used to _sort_. Your approach leads therefore also to wrong results: Try it with `dummy_list = [(8, 'N'), (1, 'Y'), (4, 'Y')]` (result is `1`, not `4`). – Timus May 01 '22 at 09:42
  • I think you're making the same mistake in the first solution: `sorted` uses `key` to sort via the result of `key`. So the only thing `sorted` does is shifting the tuples in 2 blocks (first `N`, then `Y`), without changing the relative order within the blocks? Result of your `sorted` is: `[(8, 'N'), (1, 'N'), (3, 'N'), (3, 'N'), (1, 'N'), (1, 'N'), (4, 'Y'), (1, 'Y'), (4, 'Y'), (2, 'Y'), (2, 'Y')]`. – Timus May 01 '22 at 11:34
  • @Timus Oh yes, you are completely right! I had so many other possibilities in my head that I totally mixed things up. (I think a double sort would fix it... but, let's say, that's not very nice.) Thanks for your patience – cards May 01 '22 at 11:39