How can I select all columns whose header names start with "durations" or "shape", instead of writing out a long list of column names? I need to select these columns and replace blank fields with 0.

column_names = ['durations.blockMinutes_x',
                'durations.scheduledBlockMinutes_y']
data[column_names] = data[column_names].fillna(0)
Klausos Klausos

4 Answers

You could use the startswith method of the columns' str accessor:

df = data[data.columns[data.columns.str.startswith('durations') | data.columns.str.startswith('shape')]]
df = df.fillna(0)

Or you could use the contains method with a regular expression anchored at the start of the name:

df = data.iloc[:, data.columns.str.contains(r'^(?:durations|shape)')]
df = df.fillna(0)
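
Since the question asks to update the original frame, here is a minimal sketch of the mask approach that writes the filled values back into data. The sample frame is hypothetical (only the two "durations" names come from the question; "shape.area" and "carrier" are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "shape.area" and "carrier" are invented column names.
data = pd.DataFrame({
    "durations.blockMinutes_x": [10.0, np.nan],
    "durations.scheduledBlockMinutes_y": [np.nan, 20.0],
    "shape.area": [np.nan, 1.5],
    "carrier": ["AA", None],
})

# Boolean mask over the column labels: True where the name starts with a prefix.
mask = data.columns.str.startswith("durations") | data.columns.str.startswith("shape")
cols = data.columns[mask]

# fillna returns a new object, so assign it back to change `data` itself.
data[cols] = data[cols].fillna(0)
```

Columns outside the mask (here "carrier") keep their missing values.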
Anton Protopopov

Use my_dataframe.columns.values.tolist() to get the column names (based on "Get list from pandas DataFrame column headers"):

column_names = [x for x in data.columns.values.tolist() if x.startswith("durations") or x.startswith("shape")]
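
The resulting list can be plugged straight into the fillna line from the question. A short sketch, assuming a made-up frame ("shape.width" and "origin" are hypothetical names); it also relies on Python's built-in str.startswith accepting a tuple of prefixes, which shortens the condition:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
data = pd.DataFrame({
    "durations.blockMinutes_x": [np.nan, 5.0],
    "shape.width": [2.0, np.nan],
    "origin": ["JFK", "LAX"],
})

# Built-in str.startswith accepts a tuple and returns True if any prefix matches.
prefixes = ("durations", "shape")
column_names = [name for name in data.columns if name.startswith(prefixes)]

# Same pattern as in the question, but with the list built automatically.
data[column_names] = data[column_names].fillna(0)
```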
Mad Physicist

I would use the select method (deprecated in pandas 0.21 and removed in 1.0, so this only works on older versions):

df.select(lambda c: c.startswith('durations') or c.startswith('shape'), axis=1)
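
On pandas versions without select, the same predicate style can be approximated with a small helper. This is only a sketch: select_columns is a hypothetical function written for this example, not part of pandas, and the sample frame is made up.

```python
import pandas as pd

def select_columns(frame, predicate):
    """Return the columns of `frame` whose label satisfies `predicate`.

    Hypothetical helper, not a pandas API.
    """
    # Evaluate the predicate once per column label to build a boolean list.
    keep = [bool(predicate(label)) for label in frame.columns]
    return frame.loc[:, keep]

# Hypothetical sample data.
data = pd.DataFrame({
    "durations.a": [1.0, None],
    "shape.b": [None, 2.0],
    "other": [3.0, 4.0],
})

subset = select_columns(
    data, lambda c: c.startswith("durations") or c.startswith("shape")
).fillna(0)
```

Note that subset is a new frame; data itself is left unchanged here.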

Paul H

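A simple way is filter with a regex; anchor the pattern with ^ so it only matches names that start with the prefix, and assign the result back:

cols = data.filter(regex='^(durations|shape)').columns
data[cols] = data[cols].fillna(0)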
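
The filter method matches its regex anywhere in the column name, so an unanchored pattern can pick up more than the "starts with" requirement in the question. A small sketch of the difference, using a hypothetical frame ("shape.length" and "total_durations" are invented names):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "total_durations" contains the word but does not start with it.
data = pd.DataFrame({
    "durations.blockMinutes_x": [np.nan, 30.0],
    "shape.length": [4.0, np.nan],
    "total_durations": [np.nan, 60.0],
})

# Unanchored: matches "durations" or "shape" anywhere in the name.
loose = data.filter(regex="durations|shape").columns

# Anchored with ^: matches only names that begin with one of the prefixes.
strict = data.filter(regex="^(?:durations|shape)").columns

# Fill missing values in the strictly matched columns only.
data[strict] = data[strict].fillna(0)
```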

Nursnaaz