I have a pandas DataFrame, for example:

df=
  paper id  year
0         3  1997
1         3  1999
2         3  1999
3         3  1999
4         6  1997
(and so on)

Given a paper id as input, I want the maximum year for that paper id. For example, if the paper id is 3, the answer should be 1999.

How can I do this?

humble

1 Answer

There are two general solutions. The first is to filter the rows for the given paper id and then take the max:

s = df.loc[df['paper id'] == 3, 'year'].max()
print(s)
1999

s = df.set_index('paper id').loc[3, 'year'].max()
print(s)
1999
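For reference, here is a self-contained sketch of the filter-first approach, built from the five sample rows shown in the question. The helper name `max_year_for` is hypothetical, and returning `None` for an id that does not appear in the frame is an assumption about the desired behavior, not something the question specifies.

```python
import pandas as pd

# Sample rows copied from the question (only the five rows shown there).
df = pd.DataFrame({
    'paper id': [3, 3, 3, 3, 6],
    'year': [1997, 1999, 1999, 1999, 1997],
})


def max_year_for(frame, paper_id):
    """Return the latest year for paper_id, or None if the id is absent.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Boolean mask selects the matching rows, 'year' selects the column.
    years = frame.loc[frame['paper id'] == paper_id, 'year']
    # An empty selection has no meaningful maximum, so handle it explicitly.
    if years.empty:
        return None
    return int(years.max())


print(max_year_for(df, 3))
print(max_year_for(df, 6))
print(max_year_for(df, 42))
```

Checking for an empty selection up front keeps a missing id from silently producing a non-year value.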

The second is to aggregate the max year per paper id into a Series and then select by index value:

s = df.groupby('paper id')['year'].max()
print(s)
paper id
3    1999
6    1997
Name: year, dtype: int64

print(s[3])
1999
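
If many ids need to be looked up, the grouped Series can be computed once and reused. Below is a minimal sketch using the question's sample rows. The variable name `latest` and the use of `Series.get` to return `None` for an unknown id are choices made here for illustration.

```python
import pandas as pd

# Sample rows copied from the question (only the five rows shown there).
df = pd.DataFrame({
    'paper id': [3, 3, 3, 3, 6],
    'year': [1997, 1999, 1999, 1999, 1997],
})

# One pass over the data: latest year per paper id, indexed by paper id.
latest = df.groupby('paper id')['year'].max()

# .loc does a label lookup and raises KeyError for an unknown id.
print(latest.loc[3])

# Series.get returns a default (None here) instead of raising.
print(latest.get(42))

# The Series converts to a plain dict if that is more convenient.
print(latest.to_dict())
```

Using `.loc` makes it explicit that the lookup is by paper id label rather than by position.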
jezrael