I have a DataFrame with a column called Price that contains continuous numeric values. I create custom labels for it using the following code:
df['Price_label'] = ''
df.loc[df['Price'].isnull(), 'Price_label'] = 'Not_available'
df.loc[df['Price'] < 0, 'Price_label'] = 'Less than 0'
df.loc[df['Price'] == 0, 'Price_label'] = 'Only 0'
df.loc[df['Price'] == 1, 'Price_label'] = 'Only 1'
df.loc[(df['Price'] > 0) & (df['Price'] < 1), 'Price_label'] = 'Greater than 0 - Less than 1'
df.loc[(df['Price'] > 1) & (df['Price'] <= 121), 'Price_label'] = 'Greater than 1-121'
df.loc[(df['Price'] > 121) & (df['Price'] <= 12832), 'Price_label'] = 'Greater than 121-12832'
df.loc[(df['Price'] > 12832) & (df['Price'] <= 100000), 'Price_label'] = 'Greater than 12832-100k'
df.loc[df['Price'] > 100000, 'Price_label'] = 'Greater than 100k'
What I want is to drive this from a YAML file: the file would take the column name and the conditions shown in the code above as input, and the labels would then be applied to the pandas DataFrame automatically. The intended audience is not well-versed in pandas, so a simple input format in the YAML file would be helpful.
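To make the idea concrete, here is a rough sketch of one possible shape for such a file and a small function that applies it. The YAML layout (the keys `column`, `label_column`, `rules`, `label`, `isnull`, `lt`, `le`, `gt`, `ge`, `eq`) and the function name `apply_labels` are made up for illustration and are not part of pandas or PyYAML. It assumes PyYAML is installed, and the YAML is embedded as a string only to keep the example self-contained.

```python
import operator

import pandas as pd
import yaml  # PyYAML

# Hypothetical config layout: one rule per label, with optional comparison keys.
CONFIG = """
column: Price
label_column: Price_label
rules:
  - label: "Not_available"
    isnull: true
  - label: "Less than 0"
    lt: 0
  - label: "Only 0"
    eq: 0
  - label: "Only 1"
    eq: 1
  - label: "Greater than 0 - Less than 1"
    gt: 0
    lt: 1
  - label: "Greater than 1-121"
    gt: 1
    le: 121
  - label: "Greater than 121-12832"
    gt: 121
    le: 12832
  - label: "Greater than 12832-100k"
    gt: 12832
    le: 100000
  - label: "Greater than 100k"
    gt: 100000
"""

# Map the (assumed) YAML keys onto Python comparison operators.
OPS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
    "eq": operator.eq,
}


def apply_labels(df, config_text):
    """Add a label column to df based on the rules in a YAML string."""
    cfg = yaml.safe_load(config_text)
    col = df[cfg["column"]]
    labels = pd.Series("", index=df.index, dtype=object)

    for rule in cfg["rules"]:
        # Start with every row selected, then narrow by each condition (AND).
        mask = pd.Series(True, index=df.index)
        if rule.get("isnull"):
            mask &= col.isnull()
        for key, op in OPS.items():
            if key in rule:
                mask &= op(col, rule[key])
        # Later rules overwrite earlier ones where they overlap.
        labels.loc[mask] = rule["label"]

    df[cfg["label_column"]] = labels
    return df


if __name__ == "__main__":
    demo = pd.DataFrame({"Price": [float("nan"), -5, 0, 1, 0.5, 50, 200000]})
    print(apply_labels(demo, CONFIG))
```

In this sketch all conditions within one rule are combined with AND, and rows that match no rule keep an empty string, which mirrors the original code. Reading from an actual file would just mean passing the file's contents instead of `CONFIG`.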