For simplicity's sake, let's suppose a binary classification problem in which each class has a probability of exactly 50%, and a scikit-learn SVC model.
Let's assume that the classes in the vector Y cannot be predicted from the values in the matrix X because the two are unrelated, but we don't know that. So we keep collecting observations, hoping that good predictions will become possible at some point.
As the number of observations tends to infinity, should the class probabilities (obtained from predict_proba) converge towards that 50%?
If the probabilities don't converge, does that mean there may be some form of correlation?
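To make the setup concrete, here is a minimal simulation sketch of what I have in mind. It rests on my own assumptions: X is Gaussian noise, Y is an exactly balanced label vector generated independently of X, and the helper name mean_deviation_from_half, the feature count, and the sample sizes are all made up for illustration. It fits an SVC with probability=True on growing training sets and reports how far the predicted probabilities on fresh data sit from 0.5 on average.

```python
import numpy as np
from sklearn.svm import SVC


def mean_deviation_from_half(n_samples, n_features=5, seed=0):
    """Fit an SVC on labels unrelated to X and return the mean
    absolute deviation of predict_proba from 0.5 on fresh data.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)

    # Assumption: features are pure Gaussian noise.
    X_train = rng.normal(size=(n_samples, n_features))
    # Alternating 0/1 labels: balanced and independent of X.
    y_train = np.arange(n_samples) % 2

    # Fresh noise to evaluate the predicted probabilities on.
    X_test = rng.normal(size=(1000, n_features))

    # probability=True enables predict_proba (calibrated internally).
    model = SVC(probability=True, random_state=seed)
    model.fit(X_train, y_train)

    proba_class_1 = model.predict_proba(X_test)[:, 1]
    return float(np.mean(np.abs(proba_class_1 - 0.5)))


# Hypothetical sample sizes; larger ones take longer to fit.
for n in (100, 400, 1600):
    deviation = mean_deviation_from_half(n)
    print(f"n={n:5d}  mean |p - 0.5| = {deviation:.4f}")
```

This only shows the behaviour for one seed and one noise distribution, so it is an empirical probe of the question rather than an answer to it.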
Any thoughts are much appreciated. Thanks in advance.
predict_proba would look like if it's not? – Juan Flautista De Torrepacheco Sep 07 '23 at 18:35