
My machine learning model works very well on the COCO dataset, but poorly on frames captured from a camera. I was therefore comparing the mean and the variance of the COCO dataset with those of the camera frames, to see whether they differ significantly (and, eventually, to try to fix the problem by normalizing the camera frames).
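
To make the normalization idea concrete, this is the kind of shift-and-scale I have in mind: map the camera frame's statistics onto the dataset's statistics. The function name and the statistics passed in are placeholders, not values I have actually computed yet:

```python
import numpy as np

def match_statistics(frame, cam_mean, cam_std, ref_mean, ref_std):
    """Shift and scale a frame so its global mean/std match the reference."""
    frame = frame.astype(np.float64)
    return (frame - cam_mean) / cam_std * ref_std + ref_mean

# toy usage with made-up statistics: map toward mean 0 / std 1
frame = np.array([[8.0, 12.0], [10.0, 10.0]])
out = match_statistics(frame, cam_mean=10.0, cam_std=2.0, ref_mean=0.0, ref_std=1.0)
```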

But I am struggling a bit with computing the variance of the COCO dataset. It is 17 GB, so I can't load all the images at once; instead I tried the sum-of-squares method (the solution comes from here, with some adjustments: Fastest way to compute image dataset channel wise mean and standard deviation in Python):

import numpy as np
import cv2
from tqdm import tqdm
import os

means = []
print("Means")

PATH = "./train2017/"

N = len(os.listdir(PATH))

for Nfile in tqdm(os.listdir(PATH)):
    img = cv2.imread(PATH+Nfile)
    val = np.reshape(img, -1)
    img_mean = np.mean(val)
    means.append(img_mean)
means = np.array(means)
mean = np.mean(means)
print(mean)

global_mean = mean #112.5834416723132

# Here variance is computed
# just get a per-pixel array with the vals for (x_i - mu) ** 2 / |x|
sums = 0

for Nfile in tqdm(os.listdir(PATH)):
    img = cv2.imread(PATH+Nfile)
    print(img.shape)
    sums = sums + ((img - global_mean) ** 2) / N
    print(sums.shape)
# Get mean of all per-pixel variances, and then take sqrt to get std
dataset_std = np.sqrt(np.mean(sums))

print(dataset_std)

However, it fails because images in the COCO dataset have different shapes, so I get a broadcast error at the line:

sums = sums + ((img - global_mean) ** 2) / N

because the old sums has the shape of the first image, while img has a different shape.
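
The failure can be reproduced in isolation with two arrays of different shapes, which is exactly what happens when the running sums array (shaped like the first image) meets a differently sized image (the shapes below are made up):

```python
import numpy as np

sums = np.zeros((480, 640, 3))   # running sum, shaped like the first image
img = np.zeros((360, 500, 3))    # a differently sized image

msg = ""
try:
    sums = sums + (img - 112.58) ** 2
except ValueError as err:
    msg = str(err)
print(msg)  # operands could not be broadcast together ...
```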

Does anyone have a suggestion on how to fix this while keeping it mathematically correct?
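
In case it helps frame the question, one shape-independent variant I have been considering is to reduce each image to scalars first: accumulate the sum of squared deviations and the pixel count, so images of different shapes never have to share an array. This is only a sketch of the idea, verified on toy arrays rather than on the full dataset:

```python
import numpy as np

def dataset_std(images, global_mean):
    """Accumulate per-image scalar sums so image shapes never interact."""
    sq_sum = 0.0   # running sum of (x_i - mu)^2 over all pixels
    count = 0      # running number of pixels
    for img in images:
        diff = img.astype(np.float64) - global_mean
        sq_sum += np.sum(diff ** 2)
        count += diff.size
    return np.sqrt(sq_sum / count)

# toy check on two "images" of different shapes
a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.arange(12, dtype=np.float64).reshape(3, 4)
mu = np.concatenate([a.ravel(), b.ravel()]).mean()
std = dataset_std([a, b], mu)
# should match np.concatenate([a.ravel(), b.ravel()]).std()
```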

desertnaut