I'm training a Naive Bayes (NB) model on continuous features that need equal-frequency discretization before they can be used.
My question is whether the discretization should be performed:
1. separately for the train set and the score set, or
2. once, on the train and score sets appended together.
The second approach seems more natural to me: the train and score sets can have different distributions for each variable, so discretizing them separately would produce different decile boundaries, and the same raw value could end up in different bins in the two datasets.
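To make the concern concrete, here is a minimal sketch using only the Python standard library. The data are synthetic and purely illustrative (I am assuming a score set whose mean is shifted relative to the train set), and the helper names `decile_edges` and `to_bin` are made up for this example.

```python
import random
from bisect import bisect_right
from statistics import quantiles


def decile_edges(values):
    """Return the 9 interior cut points that split `values` into 10 equal-frequency bins."""
    return quantiles(values, n=10, method="inclusive")


def to_bin(x, edges):
    """Map a raw value to a bin index in 0..9, given sorted interior cut points."""
    return bisect_right(edges, x)


# Synthetic data (assumption): the score set is shifted relative to the train set.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(1000)]
score = [rng.gauss(1.0, 1.0) for _ in range(1000)]

# Approach 1: deciles computed separately on each set.
train_edges = decile_edges(train)
score_edges = decile_edges(score)

# Approach 2: deciles computed once on the appended data.
combined_edges = decile_edges(train + score)

# The same raw value, binned with each set of cut points.
x = 0.5
bin_train = to_bin(x, train_edges)
bin_score = to_bin(x, score_edges)
bin_combined = to_bin(x, combined_edges)

print("bin using train-only deciles:", bin_train)
print("bin using score-only deciles:", bin_score)
print("bin using combined deciles:  ", bin_combined)
```

With separate deciles, the value 0.5 lands in a higher bin in the train set than in the score set, so the NB model would see two different categories for the same measurement. With a single set of cut points, the mapping from value to bin is the same in both datasets.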
However, I haven't found much material on this topic, so I'd like to ask the community whether anyone has experience to share or links I could consult.
Best