3

I have a simple pandas data frame.

import pandas as pd

x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel', 'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']

df = pd.DataFrame(dict(x=x, y=y, users=users))

I want to keep certain rows of the data frame. Let's say all "rachels" and "jeffs". I tried df.query:

df=df.query('users=="rachel"' or 'users=="jeff"')

The result is a data frame only with users=="rachel". Is there a way to combine queries?

Rachel
  • `df.query('(users=="rachel") or (users=="jeff")')` or even `df.query('users=="rachel" or users=="jeff"')` will do the trick. Tested with `pandas==1.2.4`. – banderlog013 Dec 24 '21 at 11:13

2 Answers

14

The standard way is to use the bitwise or operator |, which pandas applies element-wise to boolean Series. For a clear explanation of why, I'd suggest checking out this answer. You also need parentheses around each condition, because | binds more tightly than == in Python.

df[(df.users == 'rachel') | (df.users == 'jeff')]
    users   x    y
3  rachel  30  200
4  rachel   5  300
5  rachel  10  300
6    jeff  20  400
7    jeff  30  400
8    jeff   5  500
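
To see why the original attempt returned only the rachel rows: Python evaluates the or between the two string literals before query is ever called, and or returns its first truthy operand. A minimal sketch, rebuilding the data frame from the question (the names expr, mask and result are only illustrative):

```python
import pandas as pd

x = [5, 10, 20, 30] * 3
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark'] * 3 + ['rachel'] * 3 + ['jeff'] * 3 + ['lauren'] * 3
df = pd.DataFrame(dict(x=x, y=y, users=users))

# `or` between two non-empty strings returns the first string,
# so query would only ever receive the rachel condition.
expr = 'users=="rachel"' or 'users=="jeff"'
print(expr)  # users=="rachel"

# The element-wise | on boolean Series keeps rows matching either condition.
mask = (df.users == 'rachel') | (df.users == 'jeff')
result = df[mask]
print(result)
```

The parentheses around each comparison are what make the mask work; without them Python would try to evaluate 'rachel' | df.users first.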

With query, pass the whole condition as a single string; inside the expression both | and or work:

df.query("users=='rachel' | users=='jeff'")
    users   x    y
3  rachel  30  200
4  rachel   5  300
5  rachel  10  300
6    jeff  20  400
7    jeff  30  400
8    jeff   5  500
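
When the list of names grows, a membership test may read better than a chain of or conditions. The sketch below is not from the answer above; it assumes query's in operator with an @-prefixed local variable, plus Series.isin as the non-query equivalent, and the variable name wanted is hypothetical:

```python
import pandas as pd

x = [5, 10, 20, 30] * 3
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark'] * 3 + ['rachel'] * 3 + ['jeff'] * 3 + ['lauren'] * 3
df = pd.DataFrame(dict(x=x, y=y, users=users))

wanted = ['rachel', 'jeff']  # hypothetical list of users to keep

# Explicit or inside a single query string.
by_or = df.query("users == 'rachel' or users == 'jeff'")

# Membership test inside query; @wanted refers to the local Python variable.
by_in = df.query("users in @wanted")

# The same filter as a boolean mask, without query.
by_isin = df[df.users.isin(wanted)]

print(by_in)
```

All three select the same rows here, so the choice is mostly about readability.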
Nick Becker
  • Nice! Will mark as answered as soon as possible! Thank you! – Rachel Jan 04 '17 at 16:30
  • No worries. @EdChum's comment is also a simple solution. – Nick Becker Jan 04 '17 at 16:31
  • How would you create logic to show only results where name is either rachel or jeff, AND hometown was Chicago? So all rachels from Chicago, and all jeffs from Chicago, but not steves from chicago, or rachels from Atlanta. Could you use "users =='rachel' | users=='jeff' & hometown=='chicago'", or would the AND only apply to the jeffs, and you need to include the " & hometown=='Chicago'" to both sides of the OR? – Korzak Jan 24 '18 at 18:54
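
On the precedence question in the last comment, a small sketch may help. The hometown column and the people data frame are hypothetical, made up only to illustrate the comment's scenario; the sketch assumes query's default parser, where & binds more tightly than |, as and does relative to or in Python:

```python
import pandas as pd

# Hypothetical data: the original df has no hometown column.
people = pd.DataFrame({
    'users': ['rachel', 'rachel', 'jeff', 'jeff', 'steve'],
    'hometown': ['chicago', 'atlanta', 'chicago', 'atlanta', 'chicago'],
})

# Without parentheses the hometown test attaches only to the jeff condition.
ungrouped = people.query("users=='rachel' | users=='jeff' & hometown=='chicago'")

# Parentheses make the hometown test apply to both names.
grouped = people.query("(users=='rachel' | users=='jeff') & hometown=='chicago'")

print(ungrouped)
print(grouped)
```

So the ungrouped version keeps every rachel regardless of hometown, while the grouped version keeps only the rachels and jeffs from chicago. Grouping the or part in parentheses avoids repeating the hometown condition on both sides.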
1

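Another way is to run one query per user and concatenate the results (DataFrame.append was removed in pandas 2.0, so pd.concat is used here):

df = pd.concat([df.query('users=="rachel"'), df.query('users=="jeff"')])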
Julien Marrec
Mahesh