I am an MBA Student taking courses in Statistics.
In our courses, we are learning about Regression Models and how to interpret the meaning of the parameters/coefficients in a Regression Model.
For example, when studying "vanilla" Regression Models (i.e. Simple Linear Regression / Multiple Linear Regression), we are told that each regression coefficient represents the change in the response variable associated with a one-unit change in the corresponding independent variable, holding the other variables fixed. For example, if you build a Regression Model where the independent variables are "age" and "weight", and the dependent variable is "salary", the regression coefficients (provided they are "Statistically Significant") could answer questions such as "what is the effect of every additional kilogram on salary?". It would appear that these effects are always "linear": every additional kilogram changes the predicted salary by the same amount, no matter how much the individual already weighs.
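To pin down what I mean by "linear", this is how I understand the model from class, written out. The symbols \beta_0, \beta_1, \beta_2 and \varepsilon are my own notation and are not taken from our course notes:

```latex
\text{salary} = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{weight} + \varepsilon,
\qquad
\frac{\partial\, \mathbb{E}[\text{salary} \mid \text{age}, \text{weight}]}{\partial\, \text{weight}} = \beta_2
```

So, if I understand correctly, the effect of one more kilogram is the single number \beta_2 at every weight.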
Given this, I had the following question. Suppose in some imaginary universe, very skinny people and very fat people earn a lot of money, but people with average weights do not. Conceptually, this would appear to be a "non-linear" effect (e.g. imagine a graph of salary against weight: it would look like a "U shape"). If I were to fit a Regression Model to data from this universe, the coefficient for weight would tell me that as the weight of an individual increases by one unit, their salary changes by 0.83 on average (for example).
But this is not the case - we know that in this example, there is a non-linear relationship between the dependent variable and the independent variable. But it seems to me that the Regression Model would still "insist" that there is a linear relationship.
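To make the problem concrete, here is a minimal sketch in Python. The data is made up (a perfectly symmetric "U shape" with the lowest salary at 80 kg), and `fit_line` is my own illustrative helper, not a library function. It fits an ordinary straight-line regression and prints the slope:

```python
# Hypothetical data: salary is U-shaped in weight, lowest at 80 kg.
weights = [float(w) for w in range(40, 121, 5)]              # 40, 45, ..., 120 kg
salaries = [30.0 + 0.05 * (w - 80.0) ** 2 for w in weights]  # made-up units


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(weights, salaries)
print(f"intercept = {intercept:.2f}, slope = {slope:.4f}")
```

On this symmetric toy data the slope comes out at essentially zero: the single coefficient averages the falling and rising halves of the "U" together, so it describes neither of them.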
Are there any types of Regression Models that can address this issue? For example, could something like a Polynomial Regression (e.g. https://en.wikipedia.org/wiki/Polynomial_regression) address this issue? What kind of Regression Models could help me specifically recover the non-linear effect of an independent variable (e.g. weight) on the dependent variable (e.g. salary)? It seems to me that a Polynomial Regression Model might fit such a dataset better, but I am not sure if it would address my question on the interpretation of regression coefficients. Is there some type of Regression Model in which the coefficient for weight would tell me that:
- between 0 and 100 lbs, the effect of weight on salary is 0.91
- between 101 and 200 lbs, the effect of weight on salary is 0.34
- between 201 and 250 lbs, the effect of weight on salary is 0.86
I think I could split the data into 3 groups of people based on arbitrary weight ranges and fit a separate Regression Model to each group, "hoping" for a linear relationship between weight and salary within each group. But this would result in a "discrete and chunky" interpretation (how many groups should I make: 3? 4? 5? 6?), whereas I was hoping for a "continuous and smooth" interpretation (e.g. a mathematical function that shows how the effect of a unit increase in the independent variable on the dependent variable changes smoothly across the range of the independent variable). Is this possible to do within a single model?
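As a rough sketch of the kind of "single model" answer I have in mind, here is a quadratic (degree-2 polynomial) regression on simulated data, where the effect of one more kilogram is read off as the slope of the fitted curve at a given weight. Everything here is an assumption for illustration: the data is simulated, the helpers `solve`, `fit_quadratic` and `marginal_effect` are my own names, and a quadratic is only one possible choice of curve.

```python
import random

random.seed(0)

# Hypothetical data: U-shaped salary (lowest near 80 kg) plus a little noise.
weights = [random.uniform(40.0, 120.0) for _ in range(200)]
salaries = [30.0 + 0.05 * (w - 80.0) ** 2 + random.gauss(0.0, 2.0) for w in weights]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_quadratic(x, y):
    """Least squares for y = b0 + b1*x + b2*x**2 via the normal equations."""
    cols = [[1.0] * len(x), list(x), [xi ** 2 for xi in x]]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * yi for p, yi in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Centring weight at its mean keeps the normal equations well conditioned.
center = sum(weights) / len(weights)
centered = [w - center for w in weights]
b0, b1, b2 = fit_quadratic(centered, salaries)


def marginal_effect(w):
    """Slope of the fitted curve at weight w: b1 + 2*b2*(w - center)."""
    return b1 + 2.0 * b2 * (w - center)


for w in (50, 80, 110):
    print(f"weight {w} kg: estimated effect of one more kg = {marginal_effect(w):+.2f}")
```

On this simulated data the estimated effect is negative at 50 kg, close to zero at 80 kg and positive at 110 kg, so the "effect of weight" becomes a function of weight rather than a single number. I am not sure whether this is the right way to interpret the coefficients, or whether a quadratic would be the right curve for real data.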