So I'm trying to achieve the following:
df1: contains information related to a specific instance and a specific ID:
| ID | Datetime (dd-mm-yy hh:mm) |
|---|---|
| 111 | 01-01-22 08:30 |
| 111 | 02-01-22 12:45 |
| 111 | 03-01-22 19:10 |
| 222 | 01-01-22 10:35 |
| 222 | 02-01-22 11:10 |
| 222 | 03-01-22 18:00 |
| 333 | 02-01-22 14:00 |
df2: contains values with specific instances:
| ID | Datetime 2 (dd-mm-yy hh:mm) | Relevant value |
|---|---|---|
| 111 | 29-12-21 12:00 | 3.2 |
| 111 | 01-01-22 10:00 | 3.8 |
| 111 | 01-01-22 11:00 | 4.1 |
| 111 | 02-01-22 12:00 | 4.6 |
| 111 | 03-01-22 17:00 | 4.8 |
| 222 | 01-01-22 12:00 | 3.5 |
| 222 | 03-01-22 14:00 | 4.2 |
| 333 | 01-01-22 10:00 | 3.6 |
| 333 | 02-01-22 15:00 | 4.9 |
df3 (result): should contain df1 + the closest value in the past for each instance and each ID.
| ID | Datetime 1 (dd-mm-yy hh:mm) | Datetime 2 (dd-mm-yy hh:mm) | Relevant value |
|---|---|---|---|
| 111 | 01-01-22 08:30 | 29-12-21 12:00 | 3.2 |
| 111 | 02-01-22 12:45 | 02-01-22 12:00 | 4.6 |
| 111 | 03-01-22 19:10 | 03-01-22 17:00 | 4.8 |
| 222 | 01-01-22 10:35 | Na | Na |
| 222 | 02-01-22 11:10 | 01-01-22 12:00 | 3.5 |
| 222 | 03-01-22 18:00 | 03-01-22 14:00 | 4.2 |
| 333 | 02-01-22 14:00 | 01-01-22 10:00 | 4.9 |
I would greatly appreciate your help!
I have tried a loop, but it takes too long as I'm working with a large dataset. I would really prefer doing this within Pandas