I'm applying LogisticRegression to the breast cancer dataset.
Steps:
1- A correlation matrix showed that only four features have a positive (>0) correlation with the target.
2- I used these four features and got very low test and train accuracies (0.55-0.63) with LogisticRegression and some other models.
3- Thinking the model couldn't learn properly because I wasn't using enough features, I added 4 more features at random. These features have negative correlations with the target, in the range [-0.7, -0.3].
4- With 8 features in total, the test and train accuracies shot up to >0.9.
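
The steps above can be sketched roughly as follows. This is only a minimal sketch, not the exact code: it assumes the dataset is scikit-learn's `load_breast_cancer`, picks the first four features in the [-0.7, -0.3] range instead of four random ones so the run is repeatable, and adds a `StandardScaler` step and a fixed train/test split that are not part of the steps listed. The helper name `fit_and_score` is made up for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the scikit-learn copy of the dataset (target: 0 = malignant, 1 = benign).
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# Step 1: Pearson correlation of every feature with the target.
corr = X.corrwith(y)
positive = corr[corr > 0].index.tolist()

# Step 3: features with correlation in [-0.7, -0.3].
# Assumption: take the first four in column order rather than four at random.
in_range = corr[(corr >= -0.7) & (corr <= -0.3)].index.tolist()
negative = in_range[:4]


def fit_and_score(columns):
    """Fit LogisticRegression on the given columns; return (train_acc, test_acc)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X[columns], y, test_size=0.25, random_state=0, stratify=y
    )
    # Assumption: scaling is added here only to help the solver converge.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


# Steps 2 and 4: positive-correlation features only, then all eight together.
acc_pos = fit_and_score(positive)
acc_all = fit_and_score(positive + negative)

print("positively correlated features:", positive)
print("train/test accuracy:", acc_pos)
print("added negatively correlated features:", negative)
print("train/test accuracy:", acc_all)
```

Printing `corr[positive]` and `corr[negative]` next to the accuracies shows how large each correlation is in absolute value, not just its sign.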
How can features that are negatively correlated with the target improve the model?