I'm pretty new to pandas but I know some basic SQL.
I would like to know what is the equivalent way to achieve this in pandas. My goal is to write a np.select statement later and write specific actions dependent on the row number (rn). The value I would like to count is ['Other Identifier'] which has known duplicates.
Row_number() OVER ( PARTITION BY Other Identifier ORDER BY Date DESC ) as rn
I've tried doing it like this:
df['Other Identifier Ct'] = df.groupby(cols)['Other Identifier'].transform('size')
^ That seems to not be a "rolling count" and it just seems to sum up all duplicates. I need each row to show the # instance (1, 2, 3, etc..)
df['Other Identifier Ct'] = df.duplicated.['Other Identifier'].sum()
^ I ran into a syntax error into that one and can't actually run the output, but I think this would also not be rolling count. Could somebody please help advise? Thanks!