I have a dataframe of the form:
| customerId | sessionStart | sessionEnd |
|---|---|---|
| 1 | 2022-01-01 | 2022-01-19 |
| 1 | 2022-02-16 | 2022-03-01 |
| 2 | 2022-01-14 | 2022-02-01 |
| ... | ... | ... |
And a second dataframe of the form:
| customerId | eventTime | eventId |
|---|---|---|
| 1 | 2022-01-03 | A |
| 1 | 2022-02-02 | A |
| 2 | 2022-01-18 | A |
| ... | ... | ... |
And I'm trying to generate a dataframe that answers whether a given session (defined as the time interval in the first dataframe) contains at least one occurrence of an event (matched on customerId).
The dataframes are fairly large (1M+ rows), so I'm not keen to do an O(n^2) nested for-loop.
Any ideas?
I've tried pd.merge_asof but I must not have the syntax right.
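For reference, here is a minimal sketch of how `pd.merge_asof` might be applied to this problem, using the sample rows above. It assumes both date columns are real datetimes, that the session boundaries are inclusive, and that `customerId` has the same dtype in both frames. The frame names `sessions` and `events` and the output column `hasEvent` are made up for illustration. The idea is to look up, for each session, the first event at or after `sessionStart` for the same customer, then check whether it falls on or before `sessionEnd`.

```python
import pandas as pd

# Sample data from the tables above (frame names are illustrative)
sessions = pd.DataFrame({
    "customerId": [1, 1, 2],
    "sessionStart": pd.to_datetime(["2022-01-01", "2022-02-16", "2022-01-14"]),
    "sessionEnd": pd.to_datetime(["2022-01-19", "2022-03-01", "2022-02-01"]),
})
events = pd.DataFrame({
    "customerId": [1, 1, 2],
    "eventTime": pd.to_datetime(["2022-01-03", "2022-02-02", "2022-01-18"]),
    "eventId": ["A", "A", "A"],
})

# merge_asof requires both frames to be sorted by their "on" columns
sessions_sorted = sessions.sort_values("sessionStart")
events_sorted = events.sort_values("eventTime")

# For each session, find the first event at or after sessionStart
# for the same customer (NaT if there is none)
merged = pd.merge_asof(
    sessions_sorted,
    events_sorted,
    left_on="sessionStart",
    right_on="eventTime",
    by="customerId",
    direction="forward",
)

# The session contains an event if that first event is not past sessionEnd;
# comparisons against NaT evaluate to False, so unmatched sessions get False
merged["hasEvent"] = merged["eventTime"].le(merged["sessionEnd"])

print(merged[["customerId", "sessionStart", "sessionEnd", "hasEvent"]])
```

Checking only the first event after `sessionStart` is enough here: if that event is already past `sessionEnd`, every later event is too. The sort dominates the cost, so this should scale far better than a nested loop.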