Machine learning is nice, but often not applicable – this looks like a case where you can and should write “ordinary” code.
What is ML?
Machine learning is just statistics. After “learning” the relationships of some training data (actually, fitting a statistical model to the training data), the ML algorithm can predict outputs for new inputs. For supervised learning, the training set contains inputs and known outputs (“labels” in case of a classification problem). For unsupervised learning, the training set is unlabelled and the algorithm must infer relationships, which is closely related to clustering problems.
Limitations of ML
ML/computational statistics can be incredibly cool, but there are notable problems:
To obtain a good model, you need a big set of training data. Obtaining this set may be expensive and difficult.
Garbage-in, garbage-out: if your training data is bad, the model will be bad and result in bad predictions. You have to validate the model to test its quality. Suitable validation requires some amount of statistical knowledge.
Statistical models contain specific assumptions. If these assumptions don't fit your use case, the model will be bad. As a simple example, consider trying to fit a linear regression model on a periodic data set. The model's assumption of a linear relationship does not hold, so the model will be useless.
Generalization error: ML models try to generalize from the training data. That involves guessing, and guessing can go wrong. For example, if your training data is not a representative sample of the inputs that will be observed later, you might get a biased model.
Predictions will be fuzzy and inexact (have some variance). You can reduce this with larger training sets, but many real problems contain unavoidable noise. So the outputs of an ML algorithm have to be interpreted carefully.
E.g. the result of an image classification algorithm can be communicated misleadingly as “the image shows a cat”, or more clearly as “the image might show a cat (42% likelihood), toaster (41%), or computer screen (17%)”.
Similarly, for regression problems providing a credible interval might be helpful. There's a difference between a prediction “this customer is going to spend $29.21 today” and “there's a 50% chance the customer will spend between $19.39 and $64.22 today”.
Interpretability: A trained model usually has no meaningful interpretation. In simple cases a model describes correlations between input features, which can be interpreted and visualized. But simulation-based algorithms or models with latent variables are notoriously tricky to interpret and debug. It is not generally possible to explain within the problem domain why a specific prediction was made. This can have ethical and legal ramifications.
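The linear-regression example above can be sketched in a few lines. This is only an illustration, assuming a noise-free sine wave as the periodic data set; the helper names `fit_line` and `r_squared` are made up for this sketch and implement ordinary least squares by hand.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Assumed data: ten full periods of a sine wave, sampled at 1000 points.
xs = [i * 20 * math.pi / 999 for i in range(1000)]
ys = [math.sin(x) for x in xs]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r2:.4f}")
```

The fit succeeds in the sense that it returns a line, but that line is nearly flat and explains almost none of the variance. Nothing in the fitting step warns you that the linearity assumption was wrong; you only see it when you validate the model.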
When to use ML
For what kinds of problems can ML/computational statistics be appropriate?
ML can be appropriate if an exact solution is infeasible and an approximate solution is tolerable. The ML model must be allowed to be wrong. Your requirements tell you how wrong the model is allowed to be. You can then try to meet the necessary prediction performance, e.g. by better and bigger training sets, or by techniques such as boosting.
In particular, approximate solutions are tolerable if they are merely used to advise human experts, or when any actions triggered by the prediction are reversible. E.g. using ML for email spam categorization is fairly unproblematic because I can manually mark emails as spam/not spam if the categorization is wrong.
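To make “how wrong the model is allowed to be” concrete, here is a minimal sketch that compares an observed error rate against a tolerance taken from the requirements. The function names, the ten example emails, and the tolerance values are all hypothetical; a real validation would need a much larger held-out set.

```python
def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the known labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)

def meets_requirement(predictions, labels, max_error_rate):
    """True if the observed error rate stays within the tolerated bound."""
    return error_rate(predictions, labels) <= max_error_rate

# Hypothetical held-out emails: what a human decided vs. what the model said.
labels      = ["spam", "ham", "ham",  "spam", "ham", "ham", "spam", "ham", "ham", "ham"]
predictions = ["spam", "ham", "spam", "spam", "ham", "ham", "ham",  "ham", "ham", "ham"]

observed = error_rate(predictions, labels)
print(f"observed error rate: {observed:.0%}")
print("acceptable at 25% tolerance:", meets_requirement(predictions, labels, 0.25))
print("acceptable at 10% tolerance:", meets_requirement(predictions, labels, 0.10))
```

The same model passes or fails depending only on the tolerance, which is why that number has to come from the requirements and not from the model.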
So how about those if-statements?
For a rule engine or other core business logic, machine learning is probably not a good fit.
- The ML model may perform unwanted actions.
- The ML model may fail to perform wanted actions.
- The ML model is basically impossible to debug.
- The necessary training set to achieve satisfactory performance is going to be much larger than a comprehensive test suite.
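For comparison, this is roughly what the “ordinary” code alternative looks like. The shipping rule below is entirely made up for illustration; the point is only that a handful of if-statements plus a small test suite pin down the behaviour exactly.

```python
def shipping_cost(order_total, is_member, country):
    """Hypothetical business rule, written as plain if-statements."""
    if country != "US":
        return 25.0  # flat international rate
    if is_member or order_total >= 50:
        return 0.0   # free domestic shipping
    return 5.0       # standard domestic rate

def test_shipping_cost():
    """Four cases cover every branch; no training data is needed."""
    assert shipping_cost(10, False, "US") == 5.0
    assert shipping_cost(50, False, "US") == 0.0
    assert shipping_cost(10, True, "US") == 0.0
    assert shipping_cost(500, True, "DE") == 25.0

test_shipping_cost()
```

If a rule misbehaves, you can step through it in a debugger and point at the line that is wrong, which is not something a trained model offers.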
Writing software can be difficult, and requirements can be complex. Machine learning can sometimes fulfill these requirements, but it will not magically remove that complexity.
- At best, you can approximate a good-enough solution.
- At worst, you are completely ignoring your requirements.
ML is just a mathematical toolset, and no replacement for gathering requirements, writing code, and performing tests. You still have to do software engineering.