This is an extremely broad question. People write entire Ph.D. theses on topics like this.
That said, the gold standard is always to use a holdout sample. Hold out the last (say) 20% of the time series, fit your models to the first 80%, forecast into the holdout period, and pick the model that performs best. In-sample fit is a notoriously bad indicator of true predictive performance.
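For illustration, here is a minimal sketch of such a holdout evaluation in R, using the forecast package and the built-in AirPassengers data (the specific data and candidate models are just assumptions for the example; any series and any model fitting functions would do):

```r
library(forecast)

n <- length(AirPassengers)
h <- round(0.2 * n)  # hold out the last ~20% of observations
train <- window(AirPassengers, end   = time(AirPassengers)[n - h])
test  <- window(AirPassengers, start = time(AirPassengers)[n - h + 1])

# Fit candidate models to the training part only
fc_arima <- forecast(auto.arima(train), h = h)
fc_ets   <- forecast(ets(train), h = h)

# Compare out-of-sample accuracy over the holdout period;
# the "Test set" rows are the ones that matter here
accuracy(fc_arima, test)
accuracy(fc_ets, test)
```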
However, as long as you stay within one class of models, you can use in-sample fit, penalized by the complexity of the model, in an information criterion framework. For instance, the automatic ARIMA and exponential smoothing model selectors in the forecast and fable packages for R choose models by minimizing an information criterion (the AICc by default). Note, however, that you can only compare different ARIMA models, or different ETS models, among each other this way - you can't compare the AIC value of an ARIMA with that of an ETS model, because ARIMA models with differencing compute their likelihoods on differenced data, so the two classes' criteria are not on a common footing. (A holdout approach of course does not have that limitation.)
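As a sketch of what this looks like with the forecast package (fable works analogously), note that the automatic selectors already do the criterion minimization internally:

```r
library(forecast)

# Both selectors search their own model space by minimizing an
# information criterion (AICc by default; ic = "aic" overrides this)
fit_arima <- auto.arima(AirPassengers, ic = "aic")
fit_ets   <- ets(AirPassengers, ic = "aic")

fit_arima$aic  # comparable with other ARIMA models on the same data
fit_ets$aic    # comparable with other ETS models - NOT with the ARIMA above
```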
Another possibility would be not to select models, but to average them. "Ensemble forecasting" is often more accurate than picking a single "best" model. You can in principle try to optimize the weights, but a simple unweighted average is surprisingly hard to beat this way - this is known as the "forecast combination puzzle". Yet another approach, especially if you have a large number of candidate models, is to shortlist your large number to "forecasting pools", then average the forecasts from these shortlisted methods.
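A minimal sketch of an unweighted forecast average with the forecast package (the three candidate methods here are just examples):

```r
library(forecast)

h <- 12
fc_arima <- forecast(auto.arima(AirPassengers), h = h)
fc_ets   <- forecast(ets(AirPassengers), h = h)
fc_theta <- thetaf(AirPassengers, h = h)

# Simple unweighted average of the point forecasts -
# often hard to beat with optimized weights
fc_avg <- (fc_arima$mean + fc_ets$mean + fc_theta$mean) / 3
fc_avg
```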
Here are two threads that may be helpful: Strategies for time series forecasting for 2000 different products? and Ways to increase forecast accuracy.
In any case, I would be careful about the error measure. The Mean Absolute Percentage Error (MAPE) will reward you for forecasting low: the forecast that minimizes the expected MAPE is lower than the expectation forecast. See: What are the shortcomings of the Mean Absolute Percentage Error (MAPE)? This may or may not be what you want or need. I would always prefer a scaled version of the (Root) Mean Squared Error, which is minimized in expectation by the unbiased expectation forecast.
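To see the effect, here is a small simulation sketch (the lognormal "demand" distribution is just an assumption for illustration): the MAPE-minimizing forecast lands far below the mean, while the RMSE-minimizing forecast recovers it.

```r
set.seed(1)
y <- rlnorm(1e5, meanlog = 0, sdlog = 1)  # skewed "demand"; mean = exp(0.5) ~ 1.65

mape <- function(f) mean(abs(y - f) / y)
rmse <- function(f) sqrt(mean((y - f)^2))

mean(y)                                        # the expectation forecast
optimize(mape, interval = c(0.01, 5))$minimum  # ~0.37: MAPE rewards forecasting low
optimize(rmse, interval = c(0.01, 5))$minimum  # ~1.65: RMSE is minimized by the mean
```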
Finally, you might be interested in the International Institute of Forecasters, which holds a yearly conference that attracts a large number of practitioners, with workshops (e.g., on "Forecasting to Meet Demand"), and publishes the practitioner-oriented journal Foresight: The International Journal of Applied Forecasting. You will find a lot of people in this community you can learn from. Full disclosure: I'm one of the people giving that workshop, and I'm a Deputy Editor for Foresight.