My machine learning model works very well on the COCO dataset, but poorly on frames captured from a camera. Therefore, I was trying to compare the mean and the variance of the COCO dataset with those of the camera frames, to see if they differ a lot (and eventually try to fix the problem by normalizing the camera frames).
But I am struggling a bit with computing the variance of the COCO dataset. It is 17 GB, so I can't load all the images at once, so I was trying to do it with the sum-of-squares method (the solution comes from here, with some adjustments: Fastest way to compute image dataset channel wise mean and standard deviation in Python; the formula I'm aiming for is written out after the code):

import numpy as np
import cv2
from tqdm import tqdm
import os
means = []
print("Means")
PATH = "./train2017/"
N = len(os.listdir(PATH))
for Nfile in tqdm(os.listdir(PATH)):
    img = cv2.imread(PATH + Nfile)
    val = np.reshape(img, -1)
    img_mean = np.mean(val)
    means.append(img_mean)
means = np.array(means)
mean = np.mean(means)
print(mean)
global_mean = mean  # 112.5834416723132
# Here variance is computed
# just get a per-pixel array with the vals for (x_i - mu) ** 2 / |x|
sums = 0
for Nfile in tqdm(os.listdir(PATH)):
    img = cv2.imread(PATH + Nfile)
    print(img.shape)
    sums = sums + ((img - global_mean) ** 2) / N
    print(sums.shape)
# Get mean of all per-pixel variances, and then take sqrt to get std
dataset_std = np.sqrt(np.mean(sums))
print(dataset_std)
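If I understand the sum-of-squares method correctly, the quantity I'm after is the standard deviation over every pixel value in the whole dataset, i.e. (writing $\mu$ for the global mean, $x_{i,p}$ for pixel value $p$ of image $i$, and $P_i$ for the number of pixel values in image $i$):

$$\sigma = \sqrt{\frac{1}{\sum_{i=1}^{N} P_i} \sum_{i=1}^{N} \sum_{p=1}^{P_i} \left(x_{i,p} - \mu\right)^2}$$

As far as I can tell, the per-pixel code above only matches this when all the images have the same shape.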
However, it fails because images in the COCO dataset have different shapes, so I get a broadcast error at the line:

sums = sums + ((img - global_mean) ** 2) / N

because the old sums has the shape of the first image, while img has a different shape.
Does anyone have a suggestion on how to fix this while keeping it mathematically correct?
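For what it's worth, the only workaround I could think of is to flatten each image and accumulate a scalar running sum instead of a per-pixel array (a rough sketch, assuming the global mean from the first pass and lumping all three channels together, so I'm not sure it is exactly equivalent):

import os

import cv2
import numpy as np
from tqdm import tqdm

PATH = "./train2017/"
global_mean = 112.5834416723132  # global mean from the first pass

sq_sum = 0.0     # running sum of (x_i - mu) ** 2 over all pixel values
pixel_count = 0  # total number of pixel values seen so far

for Nfile in tqdm(os.listdir(PATH)):
    img = cv2.imread(PATH + Nfile)
    if img is None:  # skip files OpenCV cannot read
        continue
    val = img.reshape(-1).astype(np.float64)  # flatten, so the image shape no longer matters
    sq_sum += np.sum((val - global_mean) ** 2)
    pixel_count += val.size

dataset_std = np.sqrt(sq_sum / pixel_count)
print(dataset_std)

The idea is to divide only once at the end, by the total pixel count, instead of dividing each per-pixel array by N.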