1

I have a dataframe like this:

    NUMBER          NAME
1   000231          John Stockton
2   009456          Karl Malone
3   100000901       John Stockton
4   100008496       Karl Malone

I want to obtain a new dataframe with:

    NAME            VALUE1  VALUE2
1   John Stockton   000231  100000901
2   Karl Malone     009456  100008496

I think I should use pd.groupby(), but I have no function to pass as an aggregator (I don't need to compute any mean(), min(), or max() value). If I just use pd.groupby() without any aggregator, I get:

In[1]: pd.DataFrame(df.groupby(['NAME']))

Out[1]: 
           0               1
        0  John Stockton   NAME NUMBER 000231 100000901
        1  Karl Malone     NAME NUMBER 009456 100008496

What am I doing wrong? Do I need to pivot the dataframe?

Mark Rotteveel
  • 90,369
  • 161
  • 124
  • 175
Zizzipupp
  • 1,233
  • 10
  • 21

1 Answers1

1

Actually, you need a bit mode complicated pipeline:

(df.assign(group=df.groupby('NAME').cumcount().add(1)
   .pivot(index='NAME', columns='group', values='NUMBER')
   .rename_axis(None, axis=1)
   .add_prefix('VALUE')
   .reset_index()
)

output:

            NAME  VALUE1     VALUE2
0  John Stockton     231  100000901
1    Karl Malone    9456  100008496
mozway
  • 81,317
  • 8
  • 19
  • 49