Suppose I am building a logistic regression model where the dependent variable $y$ is binary and takes the values $0$ or $1$. Let the $m$ independent variables be $x_1, x_2, \ldots, x_m$. For the $k$th independent variable, the bivariate analysis shows a U-shaped trend: if I group $x_k$ into $20$ bins, each containing a roughly equal number of observations, and calculate the 'bad rate' for each bin (the number of observations where $y = 0$ divided by the total number of observations in that bin), I get a U-shaped curve.
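To make the binning step concrete, here is a minimal sketch of the bivariate analysis described above. The data are synthetic and purely an assumption for illustration (a probability of $y = 0$ that grows with the distance of $x_k$ from zero), and the helper name `bad_rate_by_bin` is hypothetical, not from any library.

```python
import numpy as np

def bad_rate_by_bin(x, y, n_bins=20):
    """Return the bad rate (share of y == 0) in each equal-frequency bin of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Sort observations by x, then cut the sorted index into n_bins groups
    # whose sizes differ by at most one observation.
    order = np.argsort(x, kind="stable")
    groups = np.array_split(order, n_bins)
    return np.array([np.mean(y[idx] == 0) for idx in groups])

# Synthetic data (assumption): P(y = 0) increases with x_k squared.
rng = np.random.default_rng(0)
x_k = rng.uniform(-3.0, 3.0, size=20_000)
p_bad = 1.0 / (1.0 + np.exp(-(x_k**2 - 2.0)))
y = (rng.uniform(size=x_k.size) >= p_bad).astype(int)  # y = 0 marks a "bad"

rates = bad_rate_by_bin(x_k, y)
print(np.round(rates, 2))  # high at both ends, low in the middle bins
```

Plotting `rates` against the bin index gives the kind of U-shaped curve I am describing.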
My questions are:
- Can I use $x_k$ directly, without any transformation, as an input when estimating the beta parameters? Would doing so violate any statistical assumption in a way that could cause significant error in the parameter estimates?
- Is it necessary to 'linearize' this variable through a transformation (e.g., log, square, or some other function of $x_k$)?
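For concreteness, the two options in the questions above can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: the data-generating process (log-odds of $y = 1$ equal to $2 - x_k^2$) is made up, `fit_logit` is a hypothetical helper that fits the model by plain Newton-Raphson rather than a library routine, and adding a squared term is just one possible transformation.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Fit P(y = 1) = sigmoid(b0 + X b) by Newton-Raphson.

    Returns the coefficient vector (intercept first) and the log-likelihood.
    """
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                         # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)               # guard against log(0)
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, loglik

# Synthetic data (assumption): P(y = 0) = sigmoid(x_k^2 - 2), a U shape.
rng = np.random.default_rng(0)
x_k = rng.uniform(-3.0, 3.0, size=20_000)
p_bad = 1.0 / (1.0 + np.exp(-(x_k**2 - 2.0)))
y = (rng.uniform(size=x_k.size) >= p_bad).astype(float)

# Option 1: x_k entered directly.
beta_lin, ll_lin = fit_logit(x_k, y)
# Option 2: x_k together with its square.
beta_quad, ll_quad = fit_logit(np.column_stack([x_k, x_k**2]), y)

print("direct:   beta =", np.round(beta_lin, 3), " log-lik =", round(ll_lin, 1))
print("squared:  beta =", np.round(beta_quad, 3), " log-lik =", round(ll_quad, 1))
```

On data like this, the coefficient on $x_k$ alone comes out close to zero, which is what prompts my questions about whether the untransformed variable can be used as is.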