I am trying to predict ambulance demand for the next hour, for a city area in the USA, based on previous demand, weather, large public gatherings, and similar spatio-temporal factors, using machine learning.
I modelled the problem by binning time into 1-hour bins, since those short-horizon predictions should provide the most value, and I would deliver the predictions to potential users at hourly intervals. I tried both time-series and regression approaches, but since I am only predicting one value into the future, regression is much easier to implement and reacts better to sudden changes in the features, whereas the time-series models produce smooth prediction curves that don't help me.
But by binning demand into hourly bins, the target variable becomes very spiky, with sudden jumps between values like 0, 1, 3, 0, and is very hard to predict. Apart from being hard to predict, the binning has an effect that would be bad in real-world applications. Let's say I have the following emergencies:

- 12:45 - 1 ambulance
- 14:01 - 3 ambulances
- 14:45 - 1 ambulance
For hourly bins [12:00, 13:00), [13:00, 14:00), [14:00, 15:00), the demand would be: 1, 0, 4. But 3 of the 4 ambulances in [14:00, 15:00) were dispatched right after the 14:00 boundary. So if some features in the [13:00, 14:00) bin already reflect this increase in demand, the model gets confused, because the demand materializes in the next bin.
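To make the binning concrete, here is a toy pandas snippet (the dates are made up) that reproduces the 1, 0, 4 counts:

```python
import pandas as pd

# The three emergencies above (dates made up), as ambulance counts.
events = pd.Series(
    [1, 3, 1],
    index=pd.to_datetime(["2024-03-01 12:45", "2024-03-01 14:01", "2024-03-01 14:45"]),
)

# Hourly binning: [12:00, 13:00) -> 1, [13:00, 14:00) -> 0, [14:00, 15:00) -> 4.
print(events.resample("1h").sum())
```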
My first question is: is my approach correct, and is there a simple way to fix the problem I'm having (demand close to bin boundaries)?
After talking to some colleagues at university, I was advised to try smoothing out the demand curve using moving averages. That way, I could 'distribute' some of the demand of 3 at 14:01 into its neighbouring bin, [13:00, 14:00).
The approach would be as follows (a rough pandas sketch follows the list):

- Resample the raw emergency data into 1-minute bins
- Apply a moving average (rolling mean) with window size W, say W=60
- Shift the results by -30 (= -W/2), so each event's demand is distributed to the left and to the right of the minute in which it happened
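If I understood the recipe correctly, a minimal pandas sketch of it would look like this (the event timestamps are made up, and W=60 is just the example value from above):

```python
import pandas as pd

# Hypothetical raw data: ambulance counts indexed by timestamp (dates made up).
events = pd.Series(
    [1, 3, 1],
    index=pd.to_datetime(["2024-03-01 12:45", "2024-03-01 14:01", "2024-03-01 14:45"]),
)

# Step 1: resample into 1-minute bins (empty minutes become 0).
per_minute = events.resample("1min").sum()

# Step 2: rolling mean with window W = 60 minutes.
W = 60
smoothed = per_minute.rolling(window=W, min_periods=1).mean()

# Step 3: shift by -W/2 so each event's mass is spread to both sides
# of the minute it occurred in, instead of only to the right.
# (rolling(..., center=True) achieves roughly the same in one call.)
smoothed = smoothed.shift(-W // 2)

# Aggregate the smoothed per-minute curve back into the hourly target.
hourly_target = smoothed.resample("1h").sum()
```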
I'm not very experienced, so I have a simple question: does it make sense to smooth out the target demand curve? If not a plain MA, could an exponential MA help? I found one post, Use of smoothed target variable when evaluating forecasting performance, which says that predicting smoothed demand doesn't translate into predicting the actual demand. But if my project ever hits the shelves, potential customers could still get value from the model: if, based on smoothed demand, I predict values higher than 0 for [13:00, 14:00), the prediction is still useful, because the actual demand happens just a minute later, at 14:01.
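On the exponential MA idea: as far as I understand, a plain ewm is one-sided, i.e. it smears each event's mass only forward in time, so to distribute demand into the earlier bin as well I would need something like a symmetric two-pass version. A sketch of what I mean (the halflife value is made up):

```python
import pandas as pd

# Same hypothetical per-minute demand as above (dates made up).
events = pd.Series(
    [1, 3, 1],
    index=pd.to_datetime(["2024-03-01 12:45", "2024-03-01 14:01", "2024-03-01 14:45"]),
)
per_minute = events.resample("1min").sum()

# A causal EMA only smears demand forward in time...
fwd = per_minute.ewm(halflife=15).mean()

# ...so run a second pass on the time-reversed series and average the two
# passes to spread each event's mass to both earlier and later minutes.
bwd = per_minute[::-1].ewm(halflife=15).mean()[::-1]
symmetric_ema = (fwd + bwd) / 2

hourly_target = symmetric_ema.resample("1h").sum()
```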
My last question is about the shift. When creating my training dataset, it makes sense to distribute a minute's demand to the left and to the right, in order to mitigate the binning effects. If W=60 is too large, I can also lower it, say to W=5; then only demand that happens within the first W/2 minutes after 14:00 would be distributed back into the [13:00, 14:00) bin.
But when I'm creating samples for inference, it seems that I would lose some data. If I'm predicting for [14:00, 15:00) and there was a demand of 3 at 13:55, that demand would be distributed partly to the left bin and partly to the right bin. But because I don't keep track of the target variable in inference samples (it's what I'm trying to predict), the part that should be distributed to the right would be lost. Alternatively, I could keep track of it as another feature. Does the shift make sense?
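For the 'keep track of it as another feature' idea, the only leakage-free version I can think of is a purely backward-looking (causal) rolling mean of recent demand, which I can compute identically at training and inference time. A sketch (timestamps and window size made up):

```python
import pandas as pd

# Hypothetical per-minute demand history; only the past is known at inference.
idx = pd.date_range("2024-03-01 00:00", periods=24 * 60, freq="1min")
per_minute = pd.Series(0.0, index=idx)
per_minute.loc["2024-03-01 13:55"] = 3

# A causal, backward-looking rolling mean uses only minutes <= t, so the
# same feature is available when building training rows and at inference.
recent_demand = per_minute.rolling("60min").mean()

# Feature for the [14:00, 15:00) prediction, computed as of 14:00:
x_recent = recent_demand.loc["2024-03-01 14:00"]
```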
Sorry for the long post, but I lack rigorous training and would really appreciate help validating these ideas. Even though some of them may not be methodologically correct, I ask you to also consider the practical applications of the model. I have already managed to follow the demand trend and produce smooth prediction curves; now I'm more interested in predicting sudden increases and producing high-demand alerts.