We want to determine whether a data point belongs to the class + or −. We train a classifier using binary logistic regression. Each data point has two features, f1 and f2, which can take the values 1 or 0. For a certain set of parameters θ, the confusion matrix looks like this:

$$
\begin{array}{c|cc}
 & \text{h(x)} = + & \text{h(x)} = - \\
\hline
\text{y} = + & 250 & 1000 \\
\text{y} = - & 750 & 8000 \\
\end{array}
$$

If we change θ, we can obtain a different confusion matrix. How many matrices are possible for this dataset where each data point is represented by the features f1 and f2? I am thinking that there would be 10 different matrices, but I am not sure.

Case 1: $\theta_0$ is so large and negative that all data points are classified as negative regardless of x.

Case 2: $\theta_0$ is so large and positive that all data points are classified as positive regardless of x.

Case 3: $\theta_1$ is larger than $\theta_0$, and $\theta_0 > 0$.

Case 4: $\theta_1$ is larger than $\theta_0$, and $\theta_0 < 0$.

Case 5: $\theta_2$ is larger than $\theta_0$, and $\theta_0 > 0$.

Case 6: $\theta_2$ is larger than $\theta_0$, and $\theta_0 < 0$.

Case 7: $\theta_2$ is larger than $|\theta_0 + \theta_1|$, and $\theta_2 > 0$.

Case 8: $\theta_2$ is larger than $|\theta_0 + \theta_1|$, and $\theta_2 < 0$.

Case 9: $\theta_1$ is larger than $|\theta_0 + \theta_2|$, and $\theta_1 > 0$.

Case 10: $\theta_1$ is larger than $|\theta_0 + \theta_2|$, and $\theta_1 < 0$.

Is this logic correct? I am having a bit of trouble working out which combinations would lead to different confusion matrices and which would lead to the same one, since it all depends on $\theta$.
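One way to check the case analysis is by brute force. The sketch below is only an illustration under stated assumptions: it assumes the prediction is + when the sigmoid output exceeds 0.5 (equivalently $\theta_0 + \theta_1 f_1 + \theta_2 f_2 > 0$), that all four feature combinations occur in the data, and that a coarse integer grid of $\theta$ values is wide enough to reach every achievable labeling. The per-cell counts `POS` and `NEG` are hypothetical, since the question does not give them; they were chosen only so that the class totals (1250 positives, 8750 negatives) agree with the matrix above.

```python
from itertools import product

# The four possible feature vectors (f1, f2).
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def labeling(theta0, theta1, theta2):
    """Predicted class (True means +) for each point in POINTS.

    Assumes a 0.5 probability cutoff, i.e. predict + when the linear score > 0.
    """
    return tuple(theta0 + theta1 * f1 + theta2 * f2 > 0 for f1, f2 in POINTS)


# Coarse integer grid of parameters (assumption: wide enough for this problem).
grid = range(-3, 4)
labelings = {labeling(t0, t1, t2) for t0, t1, t2 in product(grid, repeat=3)}

# HYPOTHETICAL counts of true + and true - points per feature combination.
# These are not given in the question; they are made up for illustration.
POS = {(0, 0): 500, (0, 1): 250, (1, 0): 250, (1, 1): 250}
NEG = {(0, 0): 5000, (0, 1): 2000, (1, 0): 1000, (1, 1): 750}


def confusion(lab):
    """Return (TP, FN, FP, TN) for one labeling of the four feature vectors."""
    tp = sum(POS[p] for p, pred in zip(POINTS, lab) if pred)
    fp = sum(NEG[p] for p, pred in zip(POINTS, lab) if pred)
    fn = sum(POS.values()) - tp
    tn = sum(NEG.values()) - fp
    return (tp, fn, fp, tn)


matrices = {confusion(lab) for lab in labelings}
print(len(labelings), "labelings,", len(matrices), "distinct confusion matrices")
```

Under these assumptions the grid reaches 14 of the 16 conceivable labelings of the four feature vectors; the two it never produces are the XOR pattern and its complement, which no linear boundary can separate. How many distinct confusion matrices this yields depends on the actual per-cell counts: different labelings can collapse to the same matrix when cells are empty or their counts coincide.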

  • 1
    What question are you trying to answer with the confusion matrix? Why are you doing this theoretical exercise? In a real analysis you have either 1 matrix if split-sample validation is used, K matrices for K-fold cross-validation, or N matrices for the bootstrap, where K is the number of folds and N is the number of bootstrap samples. – The Doctor Jan 13 '24 at 10:31
  • 1
    Logistic regressions make no classifications on their own. Do you want to adjust the threshold (as is discussed in the link) to yield different classifications and confusion matrices? – Dave Jan 13 '24 at 23:35
  • 1
    Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Jan 14 '24 at 15:53
