1

I need to add new columns to a dataframe by transposing rows of another dataframe.

Here is my DF1:

ID1  Value1  Value2
11   xxx      yyy
12   aaa      bbb

Here is DF2:

Index ID2   KEY     VALUE
1     123   ID1     11
2     123   FOLDER  folder1
3     123   FILE    file1
4     456   ID1     12
5     456   FOLDER  folder2
6     456   FILE    file2
7     456   FOLDER  folder3
8     456   FILE    file3

Note that ID2 456 has duplicate KEY's.

I want to ignore all duplicate keys and pick the values for last KEY's (with highest value of Index).

What I want:

  1. Look for the KEY = ID1 in DF2.
  2. Transpose all the remaining keys with identical ID2

Output:

ID1  Value1  Value2  FOLDER   FILE
11   xxx      yyy    folder1  file1
12   aaa      bbb    folder3  file3
richardec
  • 14,202
  • 6
  • 23
  • 49
ShitalSavekar
  • 289
  • 2
  • 4
  • 10

1 Answers1

2

You can first do a pivot, then a merge:

new_df = df1.merge(df2.drop_duplicates(keep='last', subset=['ID2', 'KEY']).pivot(index='ID2', columns='KEY', values='VALUE').astype({'ID2': int}))

Output:

>>> new_df
   ID1 Value1 Value2   FILE   FOLDER
0   11    xxx    yyy  file1  folder1
1   12    aaa    bbb  file3  folder3
richardec
  • 14,202
  • 6
  • 23
  • 49