I have an imbalanced dataset of 3k rows with an 87:13 ratio of positive to negative classes, and I am trying to do binary classification. Since the class proportions are skewed, I have to optimize the decision threshold.
I have 3 independent features: feat_a, feat_b and feat_c. While feat_a is studied heavily in the literature and used in practice, we would like to see whether feat_b and feat_c add any value to our prediction model. So I built a model (model a) with feat_a
using logistic regression. Now my objective is to build two more models as shown below:
model b ~ feat_a, feat_b (will have two features)
model c ~ feat_a, feat_b, feat_c (will have three features)
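For concreteness, here is a minimal sketch of how I am fitting the three nested models with scikit-learn. The data below is synthetic, standing in for my real 3k-row dataset, and `class_weight="balanced"` is just one example of the hyperparameter choices I am unsure about:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for my real data: 3k rows, outcome driven mainly by feat_a.
rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 3))  # columns: feat_a, feat_b, feat_c
y = (rng.random(n) < 1 / (1 + np.exp(-2 * X[:, 0]))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Nested feature sets: model a ⊂ model b ⊂ model c.
feature_sets = {"model_a": [0], "model_b": [0, 1], "model_c": [0, 1, 2]}
for name, cols in feature_sets.items():
    clf = LogisticRegression(class_weight="balanced")
    clf.fit(X_train[:, cols], y_train)
    auc = roc_auc_score(y_test, clf.predict_proba(X_test[:, cols])[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```

This is only how I have set things up so far; whether the comparison itself should be done on AUC or something else is exactly what I am asking about below.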
Now my questions are:
a) How can I compare model a, model b and model c?
b) Since my dataset is imbalanced, I have to choose appropriate hyperparameters. Should these hyperparameters be the same across all models, given that I want to compare them?
c) Should the decision threshold be the same across all models? I know the default threshold is 0.5, but since my dataset is imbalanced, it is important to optimize it. Should I retain the same decision threshold across the different models?
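For reference, this is roughly how I am currently tuning the threshold on a validation split: sweep the candidate thresholds returned by `precision_recall_curve` and keep the one that maximizes F1 (synthetic data again, and F1 is just the criterion I happened to pick, not necessarily the right one):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve

# Synthetic stand-in; in practice this would be my held-out validation split.
rng = np.random.default_rng(1)
X = rng.normal(size=(3000, 1))
y = (rng.random(3000) < 1 / (1 + np.exp(-2 * X[:, 0]))).astype(int)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

clf = LogisticRegression().fit(X_tr, y_tr)
proba = clf.predict_proba(X_val)[:, 1]

# Sweep candidate thresholds and pick the one maximizing F1 on validation data.
precision, recall, thresholds = precision_recall_curve(y_val, proba)
f1 = 2 * precision * recall / np.clip(precision + recall, 1e-12, None)
best = np.argmax(f1[:-1])  # the last precision/recall pair has no threshold
print(f"best threshold = {thresholds[best]:.3f}, F1 = {f1[best]:.3f}")
```

My uncertainty is whether I should run this tuning separately per model (giving each model its own threshold) or fix one threshold for all three.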
Can anyone help me with this?