I have a dataframe with several numeric predictive features and one binary target. 63% of the data belongs to class 0 and the rest to class 1, so the data is imbalanced, but not severely.
I aim to build a classifier using random forests or XGBoost.
I'm using cross-validation with GridSearchCV. I tried with and without feature selection (for both algorithms), and I tuned the hyperparameters as much as I could, trying many different values for each parameter (max_depth, min_samples_leaf, and more) via the param_grid argument.
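For reference, here is a minimal sketch of my setup. The data below is a synthetic stand-in generated with make_classification to match my 63/37 split, and the grid values are just examples; my real grids are larger:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for my dataframe: numeric features, 63/37 binary target
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.63], random_state=42
)

param_grid = {
    "n_estimators": [200, 500],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 10],
}

grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",  # select hyperparameters by cross-validated AUC
    cv=5,
    n_jobs=-1,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```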
I'm using roc_auc_score to evaluate the models, and the best outcome so far was AUC = 0.61 for both algorithms. In your opinion, what could be the reason for such poor results?
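This is roughly how I evaluate on a held-out split, continuing from the snippet above (note that roc_auc_score takes the positive-class probabilities, not hard 0/1 predictions):

```python
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hold out a test set, stratified to preserve the 63/37 class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

grid.fit(X_train, y_train)
# roc_auc_score needs probabilities (or scores), not 0/1 class labels
proba = grid.predict_proba(X_test)[:, 1]
print("test AUC:", roc_auc_score(y_test, proba))
```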
EDIT:
I also tried kernel SVM and logistic regression. They did worse, with AUC = 0.6 and 0.55 respectively. Random forests and gradient boosting machines (including XGBoost) were the best among those, although they still produce poor results. I still can't figure out what improvements I can make.
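For completeness, this is roughly how I compared the extra models, reusing X and y from the first snippet (the model names and the mostly-default settings here are simplifications of what I actually ran):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

models = {
    # SVM and logistic regression are scale-sensitive, so I standardize first
    "kernel_svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "xgboost": XGBClassifier(eval_metric="logloss"),
}

for name, model in models.items():
    # The "roc_auc" scorer uses decision_function or predict_proba as available
    scores = cross_val_score(model, X, y, scoring="roc_auc", cv=5)
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```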