12

I have two tables with same number of rows (second table is computed from first one by processing of text inside T1). I have both of them stored as pandas dataframe. T2 is no common column with T1. This is example because my tables are huge:

T1:
| name  | street  | city   |
|-------|---------|--------|
| David | street1 | Prague |
| John  | street2 | Berlin |
| Joe   | street3 | London |

T2:
| computed1 | computed2 |
|-----------|-----------|
| 0.5       | 0.3       |
| 0.2       | 0.8       |
| 0.1       | 0.6       |

Merged:
| name  | street  | city   | computed1 | computed2 |
|-------|---------|--------|-----------|-----------|
| David | street1 | Prague | 0.5       | 0.3       |
| John  | street2 | Berlin | 0.2       | 0.8       |
| Joe   | street3 | London | 0.1       | 0.6       |

I tried these commands:

pd.concat([T1,T2])
pd.merge([T1,T2])
result=T1.join(T1)

With concat and merge I will get only first thousand combined and rest is filled with nan (I double checked that both are same size), and with .join it not combine them because there is nothing in common.

Is there any way how to combine these two tables in pandas?

Thanks

stanedav
  • 397
  • 1
  • 4
  • 12
  • maybe this SO question can help you? : [link](https://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas) –  Dec 05 '17 at 13:56
  • There is same size of both `DataFrames`? Maybe need `pd.concat([T1,T2], axis=1)` I guess – jezrael Dec 05 '17 at 13:57
  • 2
    Or if different indices but same length of both DataFrames need `pd.concat([T1.reset_index(drop=True),T2.reset_index(drop=True)], axis=1)` – jezrael Dec 05 '17 at 13:59
  • thank you guys, @jezrael solution with reset index seems to works! Thanks all! – stanedav Dec 05 '17 at 14:06

3 Answers3

25

You need reset_index() before concat for default indices:

df = pd.concat([T1.reset_index(drop=True),T2.reset_index(drop=Tru‌​e)], axis=1)
jezrael
  • 729,927
  • 78
  • 1,141
  • 1,090
6

I want to add that pd.concat can do what you want by just providing the axis as columns. like this:

pd.concat([T1,T2],axis=1)
MKJ
  • 459
  • 6
  • 18
0

Another way would be to merge on the index values:

df = T1.reset_index().merge(T2.reset_index(), left_index=True, right_index=True, how='left)
dubbbdan
  • 2,466
  • 1
  • 20
  • 37