1

Suppose I have a DataFrame containing a column of probabilities. I want to map a function over it that returns 1 if the probability is greater than a threshold value and 0 otherwise. The catch is that I want to specify the threshold by passing it as an argument to the function, and then map that function over the pandas DataFrame.

Take the code example below:

import pandas as pd

def partition(x, threshold):
    if x < threshold:
        return 0
    else:
        return 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})
df2 = df.map(partition)  # my question is about this line: how would it get the threshold?

That is, how do I pass the threshold value to my function when mapping it?

Rishabh Rao

2 Answers

2

We can use DataFrame.applymap (renamed to DataFrame.map in pandas 2.1, where applymap is deprecated):

df2 = df.applymap(lambda x: partition(x, threshold=0.5))

Or if only one column:

df['probability']=df['probability'].apply(lambda x: partition(x, threshold=0.5))
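As a side note, the lambda wrapper is optional for a single column, because Series.apply forwards extra arguments to the function it calls. The following is a minimal sketch of that, reusing the partition function and sample data from the question; the variable names by_keyword and by_args are only illustrative.

```python
import pandas as pd

def partition(x, threshold):
    # Same logic as in the question: 0 below the threshold, 1 otherwise
    return 0 if x < threshold else 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})

# Extra keyword arguments are passed through to partition
by_keyword = df['probability'].apply(partition, threshold=0.5)

# Extra positional arguments go in the args tuple
by_args = df['probability'].apply(partition, args=(0.5,))

print(by_keyword.tolist())  # [0, 1, 0, 1]
```

Both calls give the same result as the lambda version above.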

But a Python-level function is not necessary here. You can use a vectorized comparison:

df2 = df.ge(threshold).astype(int)

I recommend you take a look at it.
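To make the vectorized version concrete, here is a small sketch that checks it against the element-wise function on the sample data from the question. The value threshold = 0.5 is an assumption taken from the other snippets in this thread.

```python
import pandas as pd

def partition(x, threshold):
    # Element-wise version from the question
    return 0 if x < threshold else 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})
threshold = 0.5  # assumed value, matching the other examples

# ge() compares every cell with the threshold (>=) and yields booleans;
# astype(int) turns True/False into 1/0
df2 = df.ge(threshold).astype(int)

# Element-wise result, for comparison only
expected = [partition(x, threshold) for x in df['probability']]

print(df2['probability'].tolist())  # [0, 1, 0, 1]
```

Since partition returns 1 when x is equal to the threshold, ge (greater than or equal) is the matching comparison; gt would differ for values exactly at the threshold.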

ansev
0

You can use a lambda for that purpose:

import pandas as pd

def partition(x, threshold):
    if x < threshold:
        return 0
    else:
        return 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})
# The lambda fixes threshold, so map only has to supply x
df['probability'] = df['probability'].map(lambda x: partition(x, threshold=0.5))
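An alternative to the lambda, sketched here under the same setup, is functools.partial from the standard library, which builds a one-argument callable with the threshold already filled in. The names above_half and label are hypothetical and used only for illustration.

```python
from functools import partial

import pandas as pd

def partition(x, threshold):
    # Same logic as above: 0 below the threshold, 1 otherwise
    return 0 if x < threshold else 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})

# partial binds threshold, leaving a callable that takes only x
above_half = partial(partition, threshold=0.5)

# Store the result in a new column so the probabilities are kept
df['label'] = df['probability'].map(above_half)

print(df['label'].tolist())  # [0, 1, 0, 1]
```

This can be handy when the same bound function is reused in several places, since it has a name and can be passed around like any other function.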
Grzegorz Skibinski