4

I want to create a new named column in a pandas DataFrame, insert a first value into it, and then add further values to the same column:

Something like:

import pandas

df = pandas.DataFrame()
df['New column'].append('a')
df['New column'].append('b')
df['New column'].append('c')

etc.

How do I do that?

barciewicz
  • Possible duplicate of https://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas – r3zaxd1 Jul 24 '18 at 13:13

3 Answers

6

If I understand correctly, you want to append a value to an existing column in a pandas DataFrame. The catch is that a DataFrame has to keep a matrix-like shape, so every column must have the same number of rows. What you can do is add a column with a default value and then update that value with:

for index, row in df.iterrows():
    df.at[index, 'new_column'] = new_value
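To make that concrete, here is a minimal sketch of the "default value first, then update" approach. The starting frame, its `id` column, and the `new_values` list are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical starting frame with three rows ('id' is illustration only).
df = pd.DataFrame({'id': [1, 2, 3]})

# Add the column with a default value so every row already has an entry.
df['new_column'] = ''

# Hypothetical replacement values, one per existing row.
new_values = ['a', 'b', 'c']

# Walk the rows and overwrite the default value cell by cell.
for (index, _row), value in zip(df.iterrows(), new_values):
    df.at[index, 'new_column'] = value
```

If all the values are already known and there is one per row, assigning the list directly with `df['new_column'] = new_values` gives the same result without the loop.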

amo3tasem
5

Don't do it, because it's slow:

  1. updating an empty frame a-single-row-at-a-time. I have seen this method used WAY too much. It is by far the slowest. It is probably common place (and reasonably fast for some python structures), but a DataFrame does a fair number of checks on indexing, so this will always be very slow to update a row at a time. Much better to create new structures and concat.

It is better to collect the data in a list and create the DataFrame with the constructor:

import pandas

vals = ['a', 'b', 'c']

df = pandas.DataFrame({'New column': vals})
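If the values arrive one at a time, a rough sketch of the same idea is to append to a plain Python list and build the frame once at the end. The later batch (`more`) is a made-up example of the "create new structures and concat" advice quoted above.

```python
import pandas as pd

# Collect values in a plain Python list first; list.append is cheap.
vals = []
for value in ['a', 'b', 'c']:
    vals.append(value)

# Build the DataFrame once, after all values are known.
df = pd.DataFrame({'New column': vals})

# Hypothetical later batch: build a second frame and concatenate in one go.
more = pd.DataFrame({'New column': ['d', 'e']})
df = pd.concat([df, more], ignore_index=True)
```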
jezrael
0

If you need to fill the newly created column with random values, you could also use:

import numpy as np
df['new_column'] = np.random.randint(1, 9, len(df))
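As a self-contained sketch of that line (the three-row frame is assumed here just so `len(df)` is non-zero):

```python
import numpy as np
import pandas as pd

# Hypothetical existing frame; the new column needs one value per row.
df = pd.DataFrame({'New column': ['a', 'b', 'c']})

# randint draws integers from [1, 9): the upper bound is exclusive,
# so the values fall between 1 and 8.
df['new_column'] = np.random.randint(1, 9, len(df))
```

Note that on an empty frame `len(df)` is 0, so this would create the column but leave it with no rows.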
myeongkil kim