In the following DataFrame, I need to find the rows whose 'path' string contains every string in the list 'a'.

df = pd.DataFrame({'id' : [1,2,3,4],
                'path'  : ["p1,p2,p3,p4","p1,p2,p1","p1,p5,p5,p7","p1,p2,p3,p3"]})

I need to check whether both 'p1' and 'p2' are present in each path.

a = ['p1','p2']

Something like the following:

if all(x in df.path for x in a):
    print(df)
Nilani Algiriyage

1 Answer

How about this?

import pandas as pd

df = pd.DataFrame({'id': [1,2,3,4],
       'path': ["p1,p2,p3,p4","p1,p2,p1","p1,p5,p5,p7","p1,p2,p3,p3"]})

a = ['p1', 'p2']

# see: http://stackoverflow.com/a/470602/1407427
# One lookahead per item, e.g. '(?=.*p1)(?=.*p2)', so every item must appear somewhere
reg_exp = ''.join(['(?=.*%s)' % (i) for i in a])

# alternatively: print(df.path.str.match(reg_exp))
print(df.path.str.contains(reg_exp))

And the result:

0     True
1     True
2    False
3     True
Name: path, dtype: bool
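
The lookahead pattern matches substrings, so 'p1' would also match a path containing 'p10'. If exact, comma-separated items are what matter, one possible variation (a sketch, assuming the paths are always comma-delimited with no surrounding spaces; the names required, mask and matching are illustrative) is to split each path and test set membership, then use the boolean mask to select the rows:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'path': ["p1,p2,p3,p4", "p1,p2,p1", "p1,p5,p5,p7", "p1,p2,p3,p3"]})
a = ['p1', 'p2']

# Assumption: items are separated by plain commas, with no whitespace.
required = set(a)

# Split each path into a list of tokens, then check that every
# required item is among those tokens (exact match, not substring).
mask = df['path'].str.split(',').apply(required.issubset)

# Boolean indexing keeps only the rows where the mask is True.
matching = df[mask]
print(matching)
```

For the sample data this selects the same rows as the regular expression; the two approaches differ only when one item is a substring of another.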
Wojciech Walczak