
I am currently working on demand forecasting. During my research online I came across methods used to classify demand, which help us focus on the series that can be forecast more reliably. Demand is classified mainly based on the Coefficient of Variation (CoV) and the Average Demand Interval (ADI). This leads to analyses such as ABC-XYZ segmentation and demand classes such as intermittent, lumpy, erratic, and smooth.

Do all of the above approaches only make sense with fixed thresholds? The thresholds I see online are fixed, and every article seems to use the same cut-offs (at least based on my exposure to articles online). You can refer here and here:

1. Smooth demand (ADI < 1.32 and CV² < 0.49)
2. Intermittent demand (ADI >= 1.32 and CV² < 0.49)
3. Erratic demand (ADI < 1.32 and CV² >= 0.49)
4. Lumpy demand (ADI >= 1.32 and CV² >= 0.49)
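For reference, this is roughly how I apply these definitions at the moment, as a minimal sketch assuming plain arrays of demand per period. Taking CV² over the non-zero demand sizes only is my own assumption, since articles differ on this:

```python
import numpy as np

def classify_demand(demand, adi_cut=1.32, cv2_cut=0.49):
    """Classify a demand series by ADI and CV^2 using the fixed cut-offs."""
    demand = np.asarray(demand, dtype=float)
    nonzero = demand[demand > 0]
    # ADI: average number of periods per non-zero demand occurrence
    adi = len(demand) / len(nonzero)
    # CV^2: squared coefficient of variation of the non-zero demand sizes
    cv2 = (nonzero.std() / nonzero.mean()) ** 2
    if adi < adi_cut and cv2 < cv2_cut:
        label = "smooth"
    elif adi >= adi_cut and cv2 < cv2_cut:
        label = "intermittent"
    elif adi < adi_cut:
        label = "erratic"
    else:
        label = "lumpy"
    return adi, cv2, label

# Example: frequent zeros, stable non-zero sizes -> high ADI, low CV^2 -> intermittent
print(classify_demand([0, 5, 0, 0, 6, 0, 5, 0, 0, 4]))
```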

So, my questions:

a) Should these thresholds be altered to reflect our dataset? For example, can I run 1D k-means clustering on CoV (or on ADI or CV²) and use the natural breaks in my data to come up with an XYZ segmentation or demand classification? Say I get 0.95, 2.6 and 3.31 as the CoV limits for X, Y and Z, with an average CoV of 1.67. Is this the right thing to do, or should I just stick to the fixed limits given online?

b) Should we compute average sales only over the active period (time periods with non-zero sales), or over the full time period (including periods when the customer was inactive)?

The Great

1 Answer


These limits are absolutely not cast in stone. They go back to work by Aris Syntetos and John Boylan about 15 years ago on classifying demand as intermittent, lumpy or continuous, and rely essentially on one or very few datasets. There is no particular reason why these specific limits should work well for new datasets - and what "working well" means is already something one can argue about. You do find time series that are classified as "intermittent" based on these limits, but that anyone eyeballing them would classify as "lumpy".

So by all means, go ahead and choose other thresholds. Clustering is one way. Another one is to just play around with possible limits until you get a "good" result. Keep in mind that this classification is just a means to an end, so be mindful of what a "good" result is.
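As an illustration of the clustering idea from the question, here is a minimal sketch using 1D k-means on per-SKU CoV values to find three data-driven breaks for X, Y and Z. The CoV numbers are made up, and scikit-learn's KMeans is just one possible tool:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up example: one CoV value per SKU
cov = np.array([0.2, 0.3, 0.4, 0.9, 1.1, 1.5, 2.4, 2.8, 3.3, 4.0])

# 1D k-means with three clusters, one per segment
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(cov.reshape(-1, 1))

# Relabel clusters by their centre so that X = lowest CoV, Z = highest
rank = np.argsort(np.argsort(km.cluster_centers_.ravel()))
segments = np.array(["X", "Y", "Z"])[rank][km.labels_]

for c, s in zip(cov, segments):
    print(f"CoV = {c:.2f} -> {s}")
```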

Average sales, in this classification, are calculated across all time periods. (Cutting off spurious zeros at the beginning when the product is not available, of course.) If you remove zeros, then you can't very well classify the series as intermittent or lumpy any more.
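A tiny numeric illustration (with made-up numbers) of how much the two conventions from question b) can differ for an intermittent series:

```python
import numpy as np

demand = np.array([0, 5, 0, 0, 6, 0, 5, 0, 0, 4], dtype=float)

# Average over the full horizon, zeros included (what this classification uses)
mean_all = demand.mean()                 # 2.0

# Average over active (non-zero) periods only
mean_active = demand[demand > 0].mean()  # 5.0

print(mean_all, mean_active)
```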

And that already leads us to a shortcoming of this simple classification: it will happily classify a series as intermittent or lumpy that absolutely isn't, because there are patterns to zeros, patterns which this method does not even look at. Here is an example of "intermittent" data (because it has "many zeros"), which is actually seasonal, not intermittent: Explain the croston method of R
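As a sketch of this failure mode, with made-up numbers: a series that only sells in November and December has perfectly structural zeros, yet the fixed cut-offs label it "intermittent":

```python
import numpy as np

# Three years of monthly demand that only occurs in November and December
seasonal = np.tile([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 80, 120], 3).astype(float)

nonzero = seasonal[seasonal > 0]
adi = len(seasonal) / len(nonzero)           # 6.0, well above 1.32
cv2 = (nonzero.std() / nonzero.mean()) ** 2  # 0.04, well below 0.49

# The fixed cut-offs call this "intermittent", although every zero is
# structural and the series is really just strongly seasonal
print(adi, cv2)
```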

So don't over-rely on this classification, but keep other possible features like seasonality in mind. For instance, Talagala et al. (2022) and literature cited therein give a nice intro into the entire stream of leveraging time series features.

Stephan Kolassa