I am trying to predict whether households use a certain service (TRUE or FALSE) from various variables, using LASSO-penalized logistic regression.
Among many others, I have the variables percentage man and percentage woman, which have a Pearson correlation coefficient of -0.85 with each other. However, when I run the logistic regression they have beta coefficients of 3.34 and 3.16, respectively, which puts both of them among the 40 most predictive of the 150 variables I use.
How can they both be positive predictors of the label when they are so negatively correlated with each other?
EDIT: Some extra info that might be of interest: percentage man has a Pearson correlation coefficient of 0.041 with the outcome variable, and percentage woman has one of -0.045.
x1 and x2 have a correlation coefficient of $-0.78$. That example is constructed to make the coefficients $\beta$ equal to $5$ and $-1$, but by changing them both to positive numbers (try $5$ for both and generate 40 points) you can create a dataset with the properties you describe. Ergo, the explanations of this phenomenon in that thread will answer your question. (There is no essential difference between logistic regression and OLS for this purpose.) – whuber Jun 03 '19 at 15:27
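The construction described in the comment can be sketched with a small simulation. This is only an illustration under assumed settings, not the example from the linked thread. The sample size (2000), the correlation (-0.85), the true coefficients (3 for both), and the helper names `simulate` and `fit_logistic` are all made up here. The fit is a plain unpenalized logistic regression via Newton's method rather than LASSO.

```python
import numpy as np

def simulate(n=2000, rho=-0.85, beta=(3.0, 3.0), seed=0):
    """Draw two negatively correlated predictors and a binary outcome.

    All defaults are illustrative assumptions, not values from the question.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    eta = X @ np.asarray(beta)          # linear predictor, no intercept
    p = 1.0 / (1.0 + np.exp(-eta))      # logistic link
    y = (rng.random(n) < p).astype(float)
    return X, y

def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression fitted by Newton's method."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        grad = A.T @ (y - p)                            # score vector
        hess = (A * (p * (1.0 - p))[:, None]).T @ A     # observed information
        w = w + np.linalg.solve(hess, grad)
    return w

X, y = simulate()
r = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
intercept, b1, b2 = fit_logistic(X, y)

print(f"correlation between predictors: {r:.2f}")
print(f"fitted coefficients: {b1:.2f}, {b2:.2f}")
```

Each coefficient measures the effect of one predictor with the other held fixed, so the correlation between the predictors does not constrain the signs of the coefficients. In this setup both fitted coefficients come out positive even though the predictors are strongly negatively correlated.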