We have a model that predicts the start time of an event (let's call it predicted_start). We also have a default start time for each event (let's call it default_start), but it's usually not correct, which is why we built the model to predict a more accurate start time.
The model is doing great, but sometimes it's wrong and predicted_start differs greatly from the actual start time (let's call it actual_start). Conversely, sometimes predicted_start differs greatly from default_start and is still correct.
It would be nice to know the probability of predicted_start being correct, i.e. close to actual_start. It's not a random guess, so there has to be a probability distribution somewhere ... right? This validation would probably also depend on the offset from default_start, and maybe on the previous event's offset from default_start; not sure, maybe this doesn't need to be that complicated?
Can't really wrap my head around this and would greatly appreciate any pointers.
EDIT: I have considered logistic regression of some sort, but was hoping someone knew a better solution.
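For reference, here's a minimal sketch of that logistic-regression idea (all data values and the 15-minute tolerance below are made up for illustration): label each historical prediction as "correct" if it landed within some tolerance of actual_start, then model the probability of correctness as a function of the offset from default_start.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical historical data: times in minutes since midnight.
predicted_start = np.array([600, 615, 700, 545, 630, 900, 610, 640])
default_start   = np.array([600, 600, 600, 600, 600, 600, 600, 600])
actual_start    = np.array([605, 620, 580, 550, 635, 905, 612, 900])

# Call a prediction "correct" if it is within TOLERANCE minutes of actual_start.
TOLERANCE = 15
y = (np.abs(predicted_start - actual_start) <= TOLERANCE).astype(int)

# Feature: how far the prediction strays from the default. You could append
# the previous event's offset as a second column if you suspect it matters.
offset = np.abs(predicted_start - default_start)
X = offset.reshape(-1, 1)

clf = LogisticRegression().fit(X, y)

# Estimated probability that a new prediction 45 minutes from default is correct.
print(clf.predict_proba(np.array([[45]]))[:, 1])
```

The tolerance threshold is a modeling choice: it converts "close to actual_start" into a binary label, which is what makes logistic regression applicable here.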
predicted_start = default_start if time < midnight else 3 * default_start. If you predict actual_start, then a prediction interval tells you what is the "plausible" range a new actual_start will be in, taking into account the amount of data you have to estimate the parameters and the typical error size. https://en.wikipedia.org/wiki/Prediction_interval – seanv507 Dec 27 '19 at 17:41
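Building on that comment, here is a rough sketch of a prediction interval, under the assumption that the model's held-out residuals are approximately i.i.d. normal (the residual values here are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical residuals (actual_start - predicted_start) on held-out events,
# in minutes.
residuals = np.array([5.0, -3.0, 8.0, -6.0, 2.0, -4.0, 7.0, -1.0])

n = len(residuals)
mean, sd = residuals.mean(), residuals.std(ddof=1)

# 95% prediction interval for a new residual. The t-quantile and the
# sqrt(1 + 1/n) factor widen the interval to account for estimating the
# mean and sd from only n observations.
t = stats.t.ppf(0.975, df=n - 1)
half_width = t * sd * np.sqrt(1 + 1 / n)

new_predicted_start = 600  # minutes since midnight, hypothetical
lo = new_predicted_start + mean - half_width
hi = new_predicted_start + mean + half_width
print(f"95% prediction interval for actual_start: [{lo:.1f}, {hi:.1f}]")
```

Unlike a confidence interval for the mean, a prediction interval bounds where a single new actual_start is likely to fall, which matches the question being asked here.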