When training a linear regression model, it is advised to standardize the input features using the mean and standard deviation of the training features:
x = (x - x_mean) / x_std_dev
However, when we run inference on new examples, should we use the mean and standard deviation computed during training, or the mean and standard deviation of the batch of test examples we feed the model? Which one is better, and why?
x = (x - x_mean_train) / x_std_dev_train
vs
x = (x - x_mean_test_batch) / x_std_dev_test_batch
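For concreteness, here is a minimal sketch of the two options using scikit-learn's StandardScaler (the toy data, variable names, and shapes are just for illustration, not part of the original question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Toy data, purely for illustration.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=100)
X_test = rng.normal(loc=5.0, scale=2.0, size=(10, 3))

# Fit the scaler on the training data and train on standardized features.
scaler = StandardScaler().fit(X_train)
model = LinearRegression().fit(scaler.transform(X_train), y_train)

# Option 1: reuse the training mean/std at inference time.
preds_train_stats = model.predict(scaler.transform(X_test))

# Option 2: re-estimate mean/std from the test batch itself.
test_scaler = StandardScaler().fit(X_test)
preds_test_stats = model.predict(test_scaler.transform(X_test))
```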