How can I generate probabilistic forecasts to do probabilistic classification?

Question

I have a collection of univariate, irregularly spaced, financial time series. Each series is labeled by its class. The image below shows some example data.

A note on the data:

The time series could be made evenly spaced by filling in missing values with 0. Based on the data-generating process, zeros would make much more physical sense than values obtained via interpolation. I think that this is the most distinctive and important fact about my data.
The stationarity of the time series cannot be guaranteed.
Other properties of the time series cannot be guaranteed. For example, the time series may have steps and pulses or outliers.
I don't believe that my data satisfy the Markov Property because low current values would not necessarily lead me to predict low values in the future. I would need to consider the historical context.
The time series don't exist in their own worlds. Information from one time series will very likely provide information about another.
Value is in currency and may be treated as a continuous variable.
The data will come from users of software that is still in development. Right now, I am testing the waters with personal financial data.

The Task:
I will receive new, unlabeled data that will almost always be at future time points. For example, the new data might look like this:

+---------------------+-------+
|        Date         | Value |
+---------------------+-------+
| 2019-08-09T03:34:12 | 15.75 |
| 2019-08-05T16:22:24 | 4.72  |
| 2019-08-18T19:58:54 | 28.19 |
| 2019-08-14T04:03:47 | 16.44 |
+---------------------+-------+

What is an example of a way my new data can be classified (with accompanying class probabilities)?

My new data can be from any of the classes. For example, the 3rd row above could be from class W while the rest of the rows could be from class F. (But, because the new data is unlabeled, I will not know which class each row is from.) For each row of my new data and for each class, I would like to get the subjective probability that the given row came from that class.

My Thoughts:

I know that time series classification is a thing, but I'm not sure how or even if it could be used with my data. From what I have read, it sounds like time series classification works by matching up the series on a common time axis. I don't think this will work for me because my new unlabeled data comes from the future. Also, my new data cannot be assumed to be from the same class. I think that what I need to do (but am unsure how) is to make probabilistic forecasts for each time series. Then, I will be able to use these forecasts to do probabilistic classification of my new data.
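
One way to write down the quantity being asked for, as a sketch under stated assumptions: let $c$ index the classes, let $(t, x)$ be the timestamp and value of a new row, and let $H_c$ be the history of class $c$. The symbols $\pi_c$ (a prior class weight, for example each class's share of past rows) and $p_c$ (the probabilistic forecast density of class $c$) are notation introduced here for illustration and are not part of the original post. Bayes' rule then gives

$$
P(c \mid t, x) = \frac{\pi_c \, p_c(t, x \mid H_c)}{\sum_{k} \pi_k \, p_k(t, x \mid H_k)}
$$

Under this reading, a probabilistic forecast for each series supplies $p_c$, and the normalization over classes turns those forecasts into class probabilities for each row.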

Acceptable Solutions:

If your solution uses a model based upon assumptions that may not be met by my data, then that is acceptable. I can worry about choosing an optimal model later. I just need a general idea of how this may be done with formulas. Keywords of what to Google or examples of people doing similar classifications are also welcome.

It seems that each class here is its own time-series/sequence. So for new datapoints, you wish to assign which sequence it most likely belongs to? If you were to do this with human intelligence, how would you do it? Can you describe the reasoning or aspects that make a point more likely to belong to a particular sequence? — Jon Nordby, Mar 02 '22 at 20:18
Btw, your data would be a bit easier to understand from the plot if there were markers at each datapoint (especially the irregularity/sparsity concept). — Jon Nordby, Mar 02 '22 at 20:21
One approach would be to fit one probabilistic model to each existing time series (one per "class"). It could be ARIMA, VAR, a Gaussian process, etc. Then score new points under each model and assign each point to the model under which it is most likely. — Jon Nordby, Mar 02 '22 at 20:24
@JonNordby Yes, each class here is its own time-series/sequence. Yes, for new data points, I wish to assign which sequence it most likely belongs to. If I were to do this using human intelligence, I would eyeball the mean value for each class and the frequency at which values are being reported and make a prediction from there. — Escherichia, Mar 03 '22 at 04:00
@JonNordby I appreciate the idea of a model for each time series. That gave me some ideas. But I also found this answer suggesting the use of hierarchical forecasting or the use of one big neural network. However, my classes are not necessarily hierarchical. I have grouped time series. But that is probably an extra layer of complexity that I can worry about later. — Escherichia, Mar 03 '22 at 04:33
@JonNordby What do you mean by "Then score new points under each model, and assign the one which it is considered most likely"? And how can I get a subjective probability out of those assignments? — Escherichia, Mar 03 '22 at 04:34
A probabilistic model will always be able to compute the probability of a given datapoint. Basically you just try each new datapoint with each model, and then assign the datapoint to the model which gave the highest probability. If only one model gave high probability, then the confidence is high. If several models give very similar probabilities, then confidence in the assignment is low — Jon Nordby, Mar 03 '22 at 15:00
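
The scoring step described in this comment can be turned into class probabilities with Bayes' rule. The following is only a minimal sketch, not a recommended model: it assumes each class's values are roughly normal, ignores timestamps entirely, and uses invented history values. The names `fit_gaussian`, `log_density`, and `class_posteriors` are hypothetical helpers written for this illustration.

```python
import math
from statistics import mean, stdev


def fit_gaussian(values):
    """Fit a normal distribution (mean, standard deviation) to one class's values."""
    return mean(values), stdev(values)


def log_density(x, mu, sigma):
    """Log of the normal density at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def class_posteriors(x, models, priors):
    """Bayes' rule: P(class | x) is proportional to p(x | class) * P(class)."""
    log_joint = {c: log_density(x, *models[c]) + math.log(priors[c]) for c in models}
    top = max(log_joint.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - top) for c, v in log_joint.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


# Hypothetical history for two classes (values invented for illustration).
history = {
    "F": [4.1, 5.0, 4.7, 5.3, 4.4],
    "W": [27.0, 29.5, 28.1, 30.2, 26.8],
}
models = {c: fit_gaussian(v) for c, v in history.items()}

# Assumed prior: each class's share of the historical rows.
n_rows = sum(len(v) for v in history.values())
priors = {c: len(v) / n_rows for c, v in history.items()}

# Class probabilities for a new row with value 4.72.
posterior = class_posteriors(4.72, models, priors)
```

If one class dominates the posterior, confidence in the assignment is high; if several classes receive similar probabilities, confidence is low, as the comment describes. Any model that yields a density for a new point (ARIMA, a Gaussian process, and so on) could replace the normal fit here.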
You could ignore the time component and just treat it as a 1-d clustering problem. K-means will give you hard class assignments. Gaussian mixture models (GMMs) will give you soft/probabilistic class assignments. I think GMMs do most of what you want. — ralph, Mar 18 '22 at 20:43
@ralph Thanks, but I suspect that most of the discriminatory power between the classes depends on the frequencies. I would not want to eliminate the time data. — Escherichia, Mar 18 '22 at 20:45
Perhaps if that's a good predictor, then include it in your clustering? For example, compute time in seconds/minutes etc between the latest data point and each of the previous data points. — ralph, Mar 18 '22 at 20:52
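
As a rough illustration of using the time gap as a feature, the sketch below scores a new timestamp against each class by the time elapsed since that class's latest observation. It assumes exponentially distributed waiting times (a Poisson-process assumption that the data may well violate) and uses invented timestamps; `mean_gap_seconds` and `gap_log_density` are hypothetical helper names.

```python
import math
from datetime import datetime


def mean_gap_seconds(timestamps):
    """Average spacing, in seconds, between consecutive observations of one class."""
    ts = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    return sum(gaps) / len(gaps)


def gap_log_density(gap, mean_gap):
    """Log density of an exponential waiting time with the given mean."""
    return -math.log(mean_gap) - gap / mean_gap


# Hypothetical history: class F reports daily, class W every two weeks.
history = {
    "F": [datetime(2019, 8, 1), datetime(2019, 8, 2),
          datetime(2019, 8, 3), datetime(2019, 8, 4)],
    "W": [datetime(2019, 7, 7), datetime(2019, 7, 21), datetime(2019, 8, 4)],
}

new_time = datetime.fromisoformat("2019-08-05T16:22:24")

scores = {}
for label, stamps in history.items():
    # Seconds between the new point and the latest point of this class.
    gap = (new_time - max(stamps)).total_seconds()
    scores[label] = gap_log_density(gap, mean_gap_seconds(stamps))
```

A higher score means the observed gap is more typical of that class. These log densities could be added to a value-based log-likelihood for the same class and then normalized across classes to obtain probabilities.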
Based on what you've said, I think you could also just come up with a set of 'qualitative' rules to determine where a new value belongs. An approach that includes as much domain knowledge as possible is best. Let $x$ be a new value. Then check $x$ against several criteria to determine which class it best belongs to. — ralph, Mar 18 '22 at 20:59


0 Answers