
I want to predict mortality, so the minority class (dead=1) is the important one for me, but my XGBoost model performs poorly on this class. In other words, the model performed the opposite of what I wanted.

The code:

import pandas as pd
from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from sklearn.metrics import plot_confusion_matrix  # removed in scikit-learn >= 1.2
from imblearn.over_sampling import RandomOverSampler
from xgboost import XGBClassifier

df = pd.read_csv(data)
X = df.drop('label', axis=1)
y = df.label
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=27)
oversample = RandomOverSampler(sampling_strategy='minority')  # oversample the minority class in the training set only
X_train, y_train = oversample.fit_resample(X_train, y_train)

xgb = XGBClassifier()
param_grid = {
    "max_depth": [6, 10, 4, 8],
    "n_estimators": [50, 200, 100, 500, 1000, 2000],
    "learning_rate": [0.1, 0.2, 0.3, 0.4],
    "booster": ['gbtree'],
}
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
grid_search = GridSearchCV(estimator=xgb, param_grid=param_grid, scoring='recall',
                           refit=True, n_jobs=-1, cv=kfold, verbose=0)

grid_result = grid_search.fit(X_train, y_train)
print(f'The best score is {grid_result.best_score_:.4f}')
print(f'The best hyperparameters are {grid_result.best_params_}')
grid_predict = grid_search.predict(X_test)  # predicted class labels for the test set

plot_confusion_matrix(grid_search, X_test, y_test)

Results:

The optimum recall value is 0.86, and as far as I know, this belongs to the majority class (alive=0).

And for more details:

[Confusion matrix image]

How can I improve the ML metrics (e.g. recall) for the minority class (dead=1)?

  • A number of threads you may find useful: https://stats.stackexchange.com/q/357466/1352 and https://stats.stackexchange.com/q/312780/1352 and https://stats.stackexchange.com/q/222179/1352. – Stephan Kolassa Jul 12 '23 at 10:13
  • While editing your question I couldn't understand "optimum recall value is 0.86 and as far as I know, this belongs to the majority class (alive=0)" ... – utobi Jul 12 '23 at 10:26
  • I mean that with grid search and the best parameters, the recall value is 0.86. – Nima Yousefi Jul 12 '23 at 11:22
  • But the recall value for the minority class is around 0.46 (same as precision and F1 score). As I mentioned, my aim is mortality prediction, so these results are not good, am I right? – Nima Yousefi Jul 12 '23 at 11:32
  • Could you tell us more about your data? "Mortality prediction" sounds like survival analysis, hence there could be censoring involved, and you may be using the wrong tool for the job. – Tim Aug 18 '23 at 05:15

1 Answer


There are a few things that could be negatively affecting your model's performance:

  • Your confusion matrix shows a 120:49 negative-to-positive split in your test data, which is a relatively mild class imbalance. It is possible that by oversampling you are introducing too much noise into your training data.
  • Instead, I would change your 'cross-validator' object (cv=kfold in your code) to a stratified K-fold, so that each fold of the cross-validation preserves the class proportions of the training data (see the first sketch after this list).
  • I would also add scale_pos_weight to your parameter search, as it controls how heavily the model weights positive versus negative labels during training. For example, you may want scale_pos_weight > 1 so that your model is more sensitive to predicting the minority class correctly (also included in the first sketch below).
  • Finally, if your code begins to take too long to run, you might be able to cut "n_estimators". [EDIT] It is possible for boosting methods to overfit with too many trees, as opposed to something like Random Forests, but this is rare. See ISL, page 347. As @seanv507 pointed out, you can also use early_stopping_rounds to avoid any possible overfitting (see the second sketch below).
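
For the stratified cross-validation and scale_pos_weight points, here is a minimal sketch, reusing X_train and y_train from your question; the grid values and the scale_pos_weight candidates are illustrative placeholders rather than tuned choices:

from sklearn.model_selection import StratifiedKFold, GridSearchCV
from xgboost import XGBClassifier

# StratifiedKFold keeps the class proportions of y_train in every fold
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# scale_pos_weight > 1 makes errors on the positive (dead=1) class more costly;
# a common starting point is n_negative / n_positive in the training data
param_grid = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 500],
    "scale_pos_weight": [1, 2, 5],  # illustrative candidates
}

grid_search = GridSearchCV(
    estimator=XGBClassifier(),
    param_grid=param_grid,
    scoring="recall",   # recall of the positive class (dead=1) by default
    cv=skf,
    n_jobs=-1,
    refit=True,
)
grid_search.fit(X_train, y_train)

Note that random oversampling and scale_pos_weight both up-weight the minority class, so you may want to tune one of them rather than stacking both.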
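
And a minimal early-stopping sketch, assuming xgboost >= 1.6 (where early_stopping_rounds and eval_metric are constructor arguments) and again reusing X_train and y_train from your question; the hyperparameter values are placeholders:

from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Hold out part of the training data as a validation set for early stopping,
# so the test set stays untouched
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0, stratify=y_train
)

model = XGBClassifier(
    n_estimators=2000,         # upper bound; early stopping picks the actual number of trees
    learning_rate=0.1,
    max_depth=6,
    eval_metric="aucpr",       # PR AUC is informative for an imbalanced target
    early_stopping_rounds=50,  # stop when the validation metric has not improved for 50 rounds
)
model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
print(model.best_iteration)    # number of boosting rounds actually used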
Kevin
  • boosting will definitely overfit with more trees, as opposed to random forest – that's why xgboost has an early-stopping facility to stop adding trees when the validation error stops decreasing – seanv507 Aug 18 '23 at 06:02
  • My apologies, @seanv507 is correct. I will edit the response. – Kevin Aug 18 '23 at 23:58