In a data science project, data typically goes through preprocessing, and we build, test, and compare several different models. Each model can come with its own preprocessing requirements, which vary considerably from model to model; for example, some models require feature scaling while others are unaffected by it.
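To make the scaling example concrete, this is roughly the kind of per-model difference I mean (a toy scikit-learn sketch; the specific models are just placeholders):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Toy data standing in for a real dataset
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, 100)

# SVMs are sensitive to feature scale, so scaling lives inside this model's pipeline
svm_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", SVC()),
])

# Tree-based models don't care about scale, so no scaling step is needed here
rf_pipeline = Pipeline([
    ("model", RandomForestClassifier()),
])

svm_pipeline.fit(X, y)
rf_pipeline.fit(X, y)
```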
What is considered best practice when managing the preprocessing and transformation of a dataset (or datasets) that will feed multiple distinct models, each with its own preprocessing requirements?
I want to know how to make preprocessing flexible enough to support multiple models, while also keeping it easy to manage as requirements change.
I recently started using the cookiecutter data science project template, which advocates an interim dataset. Presumably, this interim dataset forms a base from which model-specific preprocessing is built (roughly as in the sketch below). This is one approach, but I would like to know what is considered best practice.
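Here is a minimal sketch of how I currently picture that layering; the column names, cleaning steps, and model are hypothetical placeholders, and in the real project the interim layer would be written to something like `data/interim/` on disk:

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Stand-in for the raw data layer (data/raw in the real project)
rng = np.random.default_rng(0)
raw = pd.DataFrame({
    "feature_a": rng.normal(size=200),
    "feature_b": rng.normal(size=200),
    "target": rng.integers(0, 2, size=200),
})
raw.loc[::20, "feature_a"] = np.nan  # simulate some missing values

# Model-agnostic cleaning produces the shared interim dataset once
interim = raw.dropna().drop_duplicates()
# interim.to_csv("data/interim/clean.csv", index=False)  # hypothetical interim layer

# Each model then applies its own preprocessing on top of the interim data
X, y = interim.drop(columns=["target"]), interim["target"]
scaled_model = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
scaled_model.fit(X, y)
```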