
I am trying to do some simple manipulation of a pandas dataframe. I have imported pandas as pd and numpy as np and imported a csv to create a dataframe called 'dfe'.

I have had success with the following code to populate a new column based on one condition:

dfe['period'] = np.where(dfe['Time'] >= "07:30:00.000" , '1', '2')

But when I try to use a similar technique to populate the same column based on two conditions, I get an error: unsupported operand type(s) for &: 'bool' and 'str'.

Here is my attempt at the multiple condition version:

dfe['period'] = np.where(dfe['Time'] >= "07:30:00.000" & dfe['Time'] <= "10:00:00.000" , '1', '2')

I have looked at many solutions to similar problems, but they are all a bit too complicated for me to understand, as I have just started. I was hoping someone could give me some clues about why this is not working.

Thanks

Reblochon Masque
Mark D

1 Answer


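You are close. Only the parentheses are missing. They are needed because of operator precedence: & binds more tightly than comparison operators such as >= and <=, so without parentheses Python tries to evaluate "07:30:00.000" & dfe['Time'] first.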

dfe['period'] = np.where((dfe['Time'] >= "07:30:00.000") & 
                         (dfe['Time'] <= "10:00:00.000") , '1', '2')

Another solution uses Series.between, which includes both endpoints by default:

dfe['period'] = np.where(dfe['Time'].between("07:30:00.000", "10:00:00.000") , '1', '2')
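To check that both versions give the same result, here is a minimal sketch on made-up data. The five sample times are hypothetical stand-ins for the CSV in the question, and the column name period_between is introduced only for the comparison. It also assumes the Time column holds zero-padded HH:MM:SS.fff strings, which is what makes plain string comparison sort in time order.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real 'dfe' comes from the asker's CSV.
dfe = pd.DataFrame(
    {
        "Time": [
            "07:00:00.000",
            "07:30:00.000",
            "09:15:00.000",
            "10:00:00.000",
            "11:45:00.000",
        ]
    }
)

# Version 1: wrap each comparison in parentheses so that & combines
# two boolean Series instead of binding to the string literal first.
in_window = (dfe["Time"] >= "07:30:00.000") & (dfe["Time"] <= "10:00:00.000")
dfe["period"] = np.where(in_window, "1", "2")

# Version 2: Series.between, inclusive at both ends by default.
between_mask = dfe["Time"].between("07:30:00.000", "10:00:00.000")
dfe["period_between"] = np.where(between_mask, "1", "2")

print(dfe)
```

If the times were not zero-padded (for example "7:30:00"), string comparison would no longer match time order, and converting the column with pd.to_timedelta or pd.to_datetime first would be the safer route.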
jezrael