0

So I can do something like:

data = df[ df['Proposal'] != 'C000' ]

to remove all rows whose Proposal is the string C000. But how can I do something like:

data = df[ df['Proposal'] not in ['C000','C0001'] ]

to remove all proposals that match either C000 or C0001 (and so on)?

rafaelc
yee379

2 Answers

1

You can drop the matching rows by index:

df = df.drop(df[df['Proposal'].isin(['C000','C0001'])].index)

Or, to select only the rows you want to keep:

df = df[~df['Proposal'].isin(['C000','C0001'])]
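
To see that the two approaches give the same result, here is a minimal sketch. The sample data and the `Value` column are made up for illustration; only the `Proposal` column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'Proposal' column name is from the question.
df = pd.DataFrame({
    "Proposal": ["C000", "C0001", "C0002", "C000", "C0003"],
    "Value": [1, 2, 3, 4, 5],
})
excluded = ["C000", "C0001"]

# Approach 1: select the matching rows, then drop them by index label.
dropped = df.drop(df[df["Proposal"].isin(excluded)].index)

# Approach 2: keep the rows whose Proposal is NOT in the list.
kept = df[~df["Proposal"].isin(excluded)]

print(kept)
```

One thing to watch: `drop` works on index labels, so if the index has duplicate labels it removes every row sharing a dropped label. The boolean mask in the second approach works row by row and does not have that problem.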
E. Zeytinci
0
import numpy as np
data = df.loc[np.logical_not(df['Proposal'].isin({'C000','C0001'})), :]
# or
data = df.loc[              ~df['Proposal'].isin({'C000','C0001'}) , :]
S.V
  • Can you explain how your answer works? – rassar Dec 07 '18 at 22:07
  • `isin` checks whether each value of the Series is in some set (aka 'in'), `np.logical_not` or `~` negates the result (aka 'not in'), and `loc` selects rows of the DataFrame using the boolean array. – S.V Dec 07 '18 at 22:15
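
The steps described in that comment can be looked at one at a time. This is only a sketch on a made-up four-row frame; the intermediate variable names (`in_set`, `not_in_set`, `same`) are hypothetical and not from the answer.

```python
import numpy as np
import pandas as pd

# Made-up example data for illustration.
df = pd.DataFrame({"Proposal": ["C000", "C0002", "C0001", "C0003"]})

# Step 1: isin gives a boolean Series, True where the value is in the set.
in_set = df["Proposal"].isin({"C000", "C0001"})

# Step 2: negate it; ~ and np.logical_not give the same mask.
not_in_set = ~in_set
same = np.logical_not(in_set)

# Step 3: loc keeps only the rows where the mask is True.
data = df.loc[not_in_set, :]

print(in_set.tolist())
print(data)
```

Python's own `not` cannot be used in step 2: it asks for a single truth value for the whole Series, whereas `~` negates element by element.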