
I want to delete rows when several conditions are met.

For instance, take a randomly generated DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4), columns=['one', 'two', 'three', 'four'])
print(df)

One sample of the resulting table is shown below:

        one       two     three      four
0 -0.225730 -1.376075  0.187749  0.763307
1  0.031392  0.752496 -1.504769 -1.247581
2 -0.442992 -0.323782 -0.710859 -0.502574
3 -0.948055 -0.224910 -1.337001  3.328741
4  1.879985 -0.968238  1.229118 -1.044477
5  0.440025 -0.809856 -0.336522  0.787792
6  1.499040  0.195022  0.387194  0.952725
7 -0.923592 -1.394025 -0.623201 -0.738013
8 -1.775043 -1.279997  0.194206 -1.176260
9 -0.602815  1.183396 -2.712422 -0.377118

I want to delete rows based on the following condition:

A row should be deleted if the value in column 'one', 'two', or 'three' is greater than 0 and the value in column 'four' is less than 0.

I tried to implement this as follows:

df = df[df.one > 0 or df.two > 0 or df.three > 0 and df.four < 1]

However, this results in the following error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Could someone help me delete rows based on multiple conditions?

fyr91

1 Answer


For reasons that aren't 100% clear to me, pandas plays nice with the bitwise logical operators | and &, but not the boolean ones or and and.

Try this instead:

df = df[(df.one > 0) | (df.two > 0) | (df.three > 0) & (df.four < 1)]
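Operator precedence matters here: `&` binds more tightly than `|`, so where the parentheses go changes which rows are selected. The following is a minimal sketch on a small hand-made frame (the values are illustrative assumptions, not the question's random data; the names `loose`, `grouped`, `kept`, and `dropped` are hypothetical). Note also that indexing with a mask keeps the matching rows, so deleting them calls for the negated mask `~`.

```python
import pandas as pd

# Small hand-made frame (illustrative values, not the question's random data).
df = pd.DataFrame({
    "one":   [1.0, -1.0, -1.0, 1.0],
    "two":   [-1.0, -1.0, -1.0, -1.0],
    "three": [-1.0, 1.0, -1.0, -1.0],
    "four":  [2.0, -2.0, -2.0, -2.0],
})

# & binds tighter than |, so without extra parentheses this reads as
# one > 0  OR  two > 0  OR  (three > 0 AND four < 1).
loose = (df.one > 0) | (df.two > 0) | (df.three > 0) & (df.four < 1)

# Grouping the three | terms first gives
# (one > 0 OR two > 0 OR three > 0) AND four < 1.
grouped = ((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 1)

kept = df[grouped]      # keeps only the rows that match the condition
dropped = df[~grouped]  # deletes the rows that match the condition

print(loose.tolist())
print(grouped.tolist())
```

The two masks differ on the first row, where column 'one' is positive but column 'four' is not below 1.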
Brionius
  • You want `df = df[((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 1)]`. As to why: it's ambiguous to compare arrays, as there are potentially multiple matches. See this: http://stackoverflow.com/questions/10062954/valueerror-the-truth-value-of-an-array-with-more-than-one-element-is-ambiguous – EdChum Mar 12 '15 at 19:00
  • Oh, whoops, didn't see the `and` at the end. Edited. – Brionius Mar 12 '15 at 19:11
  • @Brionius: it's basically because `or` and `and` can't have their behaviour customized by a class. They do what they do based on the result of bool(the_object), and that's it. – DSM Mar 12 '15 at 19:30
  • To delete, say, any row with a string that contains 1 of 20 possible subkeys, [look here](http://stackoverflow.com/a/31663495/3491991) – zelusp Nov 15 '16 at 22:57
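The point made in the comments about `or` relying on `bool(the_object)` can be checked directly. This is a small sketch, assuming two hand-made boolean Series (`s` and `t` are hypothetical names): `or` asks for the truth value of the whole Series, which pandas refuses to give, while `|` is dispatched to an element-wise method that pandas defines.

```python
import pandas as pd

s = pd.Series([True, False])
t = pd.Series([False, False])

# `s or t` first evaluates bool(s); pandas will not collapse a
# multi-element Series to a single True/False and raises ValueError.
try:
    s or t
    raised = False
except ValueError:
    raised = True

# `|` is dispatched to Series.__or__, which works element by element.
elementwise = s | t

print(raised)
print(elementwise.tolist())
```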