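I am trying to create a new column in one large data frame (df2) by matching its 'Pen' column against the 'Pen' column of another data frame (df). Each row of df2 should receive the 'B' value of the df row with the same 'Pen', so a given 'B' value is repeated once for every occurrence of that 'Pen' in df2. Note that the number of repetitions of each 'Pen' value in df2 follows no fixed pattern. My code below does not work.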
import numpy as np
import pandas as pd
df = pd.DataFrame([[1, 2, 4], [2, 6, 1], [3, 4, 2], [4, 3, 9]], columns = ["Pen", "A", 'B'])
df2 = pd.DataFrame([[1, 2], [1, 6], [1, 4], [2, 3], [2, 7], [2, 1], [3, 2], [3, 7], [3, 3], [4, 2], [4, 12], [4, 5]], columns = ["Pen", "A"])
The code:
df2['B'] = np.where(df2['Pen'] == df['Pen'], df['B'], 0)
The expected result is:

| Pen | A | B |
|---|---|---|
| 1 | 2 | 4 |
| 1 | 6 | 4 |
| 1 | 4 | 4 |
| 2 | 3 | 1 |
| 2 | 7 | 1 |
| 2 | 1 | 1 |
| 3 | 2 | 2 |
| 3 | 7 | 2 |
| 3 | 3 | 2 |
| 4 | 2 | 9 |
| 4 | 12 | 9 |
| 4 | 5 | 9 |
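
For reference, the np.where attempt compares the two 'Pen' columns row by row, and since df has 4 rows and df2 has 12, pandas cannot align them. A minimal sketch of one possible lookup-based approach follows, assuming each 'Pen' value appears exactly once in df; the names `pen_to_b` and `merged` are illustrative only and not part of the original code.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 4], [2, 6, 1], [3, 4, 2], [4, 3, 9]],
                  columns=["Pen", "A", "B"])
df2 = pd.DataFrame(
    [[1, 2], [1, 6], [1, 4], [2, 3], [2, 7], [2, 1],
     [3, 2], [3, 7], [3, 3], [4, 2], [4, 12], [4, 5]],
    columns=["Pen", "A"],
)

# Build a Pen -> B lookup from df.
# Assumption: each Pen value occurs only once in df.
pen_to_b = df.set_index("Pen")["B"]

# Look up B for every row of df2, however often each Pen repeats.
# Pen values that do not exist in df would come out as NaN.
df2["B"] = df2["Pen"].map(pen_to_b)

# Alternative sketch: a left merge keeps every row of df2 in its
# original order and pulls in B from df.
merged = df2[["Pen", "A"]].merge(df[["Pen", "B"]], on="Pen", how="left")

print(df2)
```

Both variants match on the value of 'Pen' rather than on row position, so they do not depend on how many times each 'Pen' repeats in df2.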