0

So I'm trying to achieve the following:

df1: contains information related to a specific instance and a specific ID:

ID Datetime (dd-mm-yy hh:mm)
111 01-01-22 08:30
111 02-01-22 12:45
111 03-01-22 19:10
222 01-01-22 10:35
222 02-01-22 11:10
222 03-01-22 18:00
333 02-01-22 14:00

df2: contains values with specific instances:

ID Datetime 2 (dd-mm-yy hh:mm) Relevant value
111 29-12-21 12:00 3.2
111 01-01-22 10:00 3.8
111 01-01-22 11:00 4.1
111 02-01-22 12:00 4.6
111 03-01-22 17:00 4.8
222 01-01-22 12:00 3.5
222 03-01-22 14:00 4.2
333 01-01-22 10:00 3.6
333 02-01-22 15:00 4.9

df3 (result): should contain df1 + the closest value in the past for each instance and each ID.

ID Datetime 1 (dd-mm-yy hh:mm) Datetime 2 (dd-mm-yy hh:mm) Relevant value
111 01-01-22 08:30 29-12-21 12:00 3.2
111 02-01-22 12:45 02-01-22 12:00 4.6
111 03-01-22 19:10 03-01-22 17:00 4.8
222 01-01-22 10:35 Na Na
222 02-01-22 11:10 01-01-22 12:00 3.5
222 03-01-22 18:00 03-01-22 14:00 4.2
333 02-01-22 14:00 01-01-22 10:00 4.9

I would greatly appreciate your help!

I have tried a loop, but it takes too long as I'm working with a large dataset. I would really prefer doing this within Pandas

Sam
  • 1
  • 1
  • 2
    IIUC, you need to use [`merge_asof`](https://pandas.pydata.org/docs/reference/api/pandas.merge_asof.html) – mozway Apr 06 '22 at 08:42

0 Answers0