
While reading this question on another thread: Why normalize images by subtracting dataset's image mean, instead of the current image mean in deep learning?, I realized that either one point is not covered or I am missing something. In lollercoaster's answer, they discuss whether we should use the whole dataset to estimate the mean and variance for z-score normalization.

Let's say you're building a model that classifies images taken by a mobile phone. You gather as many samples as possible for training and compute the mean and variance on these for normalizing your inputs.

Now, let's consider the case where you ship that model to production and it is used on a wide variety of camera qualities. You still use the mean and variance computed on the training dataset to perform inference on the user side; how can that work properly?
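
To make the setup concrete, here is a minimal NumPy sketch of what I mean; the function names and array shapes are just illustrative:

```python
import numpy as np

def fit_normalizer(train_images: np.ndarray):
    """Per-channel mean and std over the whole training set, shape (N, H, W, C)."""
    mean = train_images.mean(axis=(0, 1, 2))
    std = train_images.std(axis=(0, 1, 2))
    return mean, std

def normalize(image: np.ndarray, mean: np.ndarray, std: np.ndarray):
    """z-score normalization using the *training-set* statistics."""
    return (image - mean) / std

# Training time: statistics come from the phone images I gathered.
# mean, std = fit_normalizer(train_images)

# Inference time: a user's photo from a different camera is still normalized
# with the same training-set mean/std, even if its own distribution differs.
# x = normalize(user_image, mean, std)
```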

  • Welcome to Cross Validated! What issues do you see with using that mean and variance? – Dave Jun 11 '22 at 14:31
  • @Dave This is what I'm trying to get at in the post: you compute the mean and variance on your dataset, but how does that hold up for out-of-distribution samples in production? Imagine you fine-tune an ImageNet model on your own dataset but then ship the image classification model to phones and webcams of varying quality. – HyperToken Jun 13 '22 at 08:11

0 Answers