I have a dataframe with several numeric predictive features and one binary target. 63% of the data belongs to class 0 and the rest to class 1, so the data is imbalanced, but not severely.
I aim to build a classifier using random forests or XGBoost.
I'm using cross-validation with GridSearchCV. I tried with and without feature selection (for both algorithms), and I tuned the hyperparameters as much as I could, trying many different values for each parameter (max_depth, min_samples_leaf, and more) via the param_grid argument.
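For reference, here is a minimal sketch of my setup. The data below is a synthetic stand-in generated with make_classification to match my 63/37 split, and the grid values are just examples; my real grids are larger:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for my dataframe: numeric features, 63/37 binary target
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.63], random_state=42
)

param_grid = {
    "n_estimators": [200, 500],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 10],
}

grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",  # select hyperparameters by cross-validated AUC
    cv=5,
    n_jobs=-1,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```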
I'm using roc_auc_score to evaluate the models, and the best outcome so far was AUC = 0.61 for both algorithms. In your opinion, what could be the reason for such poor results?
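This is roughly how I evaluate on a held-out split, continuing from the snippet above (note that roc_auc_score takes the positive-class probabilities, not hard 0/1 predictions):

```python
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hold out a test set, stratified to preserve the 63/37 class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

grid.fit(X_train, y_train)
# roc_auc_score needs probabilities (or scores), not 0/1 class labels
proba = grid.predict_proba(X_test)[:, 1]
print("test AUC:", roc_auc_score(y_test, proba))
```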
EDIT:
I also tried kernel SVM and logistic regression. They did worse, with AUC = 0.6 and 0.55 respectively. Random forests and gradient boosting machines (including XGBoost) were the best among those, although they still produce poor results. I still can't figure out what improvements I can make.
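For completeness, this is roughly how I compared the extra models, reusing X and y from the first snippet (the model names and the mostly-default settings here are simplifications of what I actually ran):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

models = {
    # SVM and logistic regression are scale-sensitive, so I standardize first
    "kernel_svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "xgboost": XGBClassifier(eval_metric="logloss"),
}

for name, model in models.items():
    # The "roc_auc" scorer uses decision_function or predict_proba as available
    scores = cross_val_score(model, X, y, scoring="roc_auc", cv=5)
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```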