Could someone show me what I am doing wrong?

import pandas as pd
import operator

def calculate(A, B):
   if (A > 2 and B == True):
       Z = A * 10
   else:
       Z = A * 10000
   return Z

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True
df['C'] = calculate(df.A, df.B)

df

**Error:** `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().`

Thanks a lot! I couldn't find a solution to my problem on Stack Overflow. I am a total beginner and started coding today. The solutions provided in other questions didn't help me; sorry for the double post.

Sociopath
Tyrus Rechs
  • I am guessing you want an element-wise logical AND, in which case, here is your answer: https://stackoverflow.com/questions/21415661/logic-operator-for-boolean-indexing-in-pandas tldr: try changing `and` to `&` – Stev Feb 27 '18 at 11:35
  • Thanks, but it didn't work. – Tyrus Rechs Feb 27 '18 at 11:38
  • Don't forget to put some brackets: `if (A > 2) & (B == True):`. Otherwise the order of operations is wrong and you still get the error. – Jeronimo Feb 27 '18 at 12:11

2 Answers

I think you need to chain the conditions with `&` (element-wise AND) and use `numpy.where`:

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True

df['C'] = np.where((df.A > 2) & df.B, df.A * 10, df.A * 10000)
print(df)
   A      B      C
0  1   True  10000
1  2   True  20000
2  3  False  30000
3  4  False  40000
4  5   True     50

Detail:

print ((df.A > 2) & df.B)
0    False
1    False
2    False
3    False
4     True
dtype: bool
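
To illustrate why `and` fails where `&` works, here is a minimal sketch on the same example data (the names `and_raised` and `mask` are only for illustration). Python's `and` asks for a single truth value of the whole Series, which pandas refuses to give, while `&` combines the two Series element by element. The parentheses around `A > 2` are needed because `&` binds more tightly than `>`.

```python
import pandas as pd

A = pd.Series([1, 2, 3, 4, 5])
B = pd.Series([True, True, False, False, True])

# `and` needs one True/False for the whole Series `A > 2`,
# so pandas raises "The truth value of a Series is ambiguous".
try:
    (A > 2) and B
    and_raised = False
except ValueError:
    and_raised = True

# `&` is evaluated element-wise and returns a boolean Series.
mask = (A > 2) & B

print(and_raised)     # True
print(mask.tolist())  # [False, False, False, False, True]
```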

But if you need a slow, loop-based solution (not recommended):

def calculate(A, B):
   if (A > 2 and B == True):
       Z = A * 10
   else:
       Z = A * 10000
   return Z

df['C'] = df.apply(lambda x: calculate(x.A, x.B), axis=1)
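
As a quick sanity check, the following sketch compares the two approaches on the example data. The column names `C_where` and `C_apply` are hypothetical and used only to keep both results side by side. Inside `apply` with `axis=1` the function receives single values, which is why a plain `and` is fine there.

```python
import numpy as np
import pandas as pd

def calculate(A, B):
    # A and B are scalars here (one row at a time), so `and` is valid.
    if A > 2 and B:
        return A * 10
    return A * 10000

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [True, True, False, False, True]})

# Vectorized version: one call over whole columns.
df['C_where'] = np.where((df.A > 2) & df.B, df.A * 10, df.A * 10000)

# Row-wise version: calls calculate() once per row.
df['C_apply'] = df.apply(lambda x: calculate(x.A, x.B), axis=1)

# Both columns should hold identical values.
same = (df['C_where'] == df['C_apply']).all()
print(same)
```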
jezrael

Here is an alternative solution, which avoids the (computationally expensive) row-by-row call to an explicit function:

import pandas as pd

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True

df['C'] = df['A'] * 10000
df.loc[(df['A'] > 2) & df['B'], 'C'] /= 1000

#    A      B        C
# 0  1   True  10000.0
# 1  2   True  20000.0
# 2  3  False  30000.0
# 3  4  False  40000.0
# 4  5   True     50.0
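
As the output above shows, the in-place division leaves `C` as floats. If integers are preferred, one possible variant (a sketch, not the only way; the name `mask` is introduced just for readability) is to assign the scaled values directly to the matching rows:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [True, True, False, False, True]})

# Rows where both conditions hold.
mask = (df['A'] > 2) & df['B']

# Default value for every row, then overwrite only the matching rows.
df['C'] = df['A'] * 10000
df.loc[mask, 'C'] = df.loc[mask, 'A'] * 10

print(df['C'].tolist())  # [10000, 20000, 30000, 40000, 50]
```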
jpp