0

A question that haunts me a little bit. Though it must be a common task to perform, I find it difficult to implement it easily in pandas

you have a df_ex of values of category and score. based on the score value, you want to lookup in a reference table df_ref another value, like score info. the lookup is range-based, i.e. [0-10[, [10-20[ etc...and depends on the category (i.e each category has its own range and score info)

ex:

df_ex['category']=['A','B','A','B','B','A']

df_ex['score']=[1,45,65,7,34,76]

*********************************************

df_ref['category']=['A','A','A','A','B','B','B']

df_ref['low_bound']= [0,25,50,75,0,33,66] # >=

df_ref['up_bound']= [25,50,75,100,33,66,100] # <

df_ref['score_info']= ['low','medium','high','very high','low','medium','high']

and a magic_lookup on df_ex would then return ['low','medium','high','low', 'medium','very high'].

I see the nice answer from Joe here Best way to join / merge by range in pandas using Numpy broadcast.

I wonder how to generalize this when having the category criteria in addition.

sammywemmy
  • 22,944
  • 4
  • 14
  • 28

0 Answers0