If I understand correctly, it seems you are trying to determine
- The best linear regression model from a set of several variables
- Which variables are statistically significant (and which are not)
One word of caution, however:
> We want to do some inference, in particular we're interested in understanding if race and gender are factors that determine the salary. The underlying hypothesis is that the company is racist/sexist.
Unless your data comes from a carefully designed experiment with controlled variables, statistical significance does not necessarily imply causality. With that out of the way, I don't see why you couldn't combine the two approaches. I see two main steps:
Decide on the comparison criteria
The following criteria will help you compare different models with one another:
- Mallows' $C_p$
- Akaike's Information Criterion (AIC)
- Bayesian Information Criterion (BIC)
- Adjusted $R^2$
- The Fisher (F) test for overall model significance
- The Fisher (F) statistic for the test of significance of nested models
You can also evaluate the significance of individual variables using:
- The Student t statistic for the test of significance of each parameter (see the sketch just below)
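For concreteness, here is a minimal sketch of where these quantities come from in practice, assuming Python with `statsmodels` and some toy salary data (the column names `experience`, `gender`, `race` are placeholders, not your actual variables):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data standing in for your salary data (placeholder column names)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "experience": rng.uniform(0, 20, n),
    "gender": rng.integers(0, 2, n),
    "race": rng.integers(0, 2, n),
})
df["salary"] = 30000 + 2000 * df["experience"] + rng.normal(0, 5000, n)

full = smf.ols("salary ~ experience + gender + race", data=df).fit()

print(full.aic, full.bic, full.rsquared_adj)  # information criteria and adjusted R^2
print(full.fvalue, full.f_pvalue)             # F test for overall model significance
print(full.summary())                         # t statistics / p-values per coefficient

# F test for a nested (restricted) model against the full model
restricted = smf.ols("salary ~ experience", data=df).fit()
f_stat, p_value, df_diff = full.compare_f_test(restricted)
print(f_stat, p_value)
```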
Theoretical justification
$C_p$, AIC and BIC all have rigorous theoretical justifications that rely on asymptotic arguments, i.e. they hold when the sample size $m$ grows very large, whereas the adjusted $R^2$, although quite intuitive, is not as well motivated by statistical theory.
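For reference, the standard definitions are roughly as follows (writing $m$ for the number of observations, $p$ for the number of predictors, $\hat{L}$ for the maximised likelihood, RSS for the residual sum of squares and $\hat{\sigma}^2$ for an estimate of the error variance; conventions differ slightly on whether the intercept is counted in $p$):

$$\mathrm{AIC} = 2p - 2\ln\hat{L}, \qquad \mathrm{BIC} = p\ln(m) - 2\ln\hat{L},$$
$$C_p = \frac{1}{m}\left(\mathrm{RSS} + 2p\,\hat{\sigma}^2\right), \qquad R^2_{\text{adj}} = 1 - \frac{(1 - R^2)(m - 1)}{m - p - 1}.$$

The first three explicitly trade goodness of fit against the number of parameters, while the adjusted $R^2$ simply corrects the ordinary $R^2$ for the degrees of freedom used.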
Decide on the iterative procedure
Best subset selection
Fit a separate model for each possible combination of the $K$ predictors and then select the best subset (a code sketch follows the list below). That is, we fit:
- All models that contain exactly one predictor: $\binom{K}{1} = K$ of them
- All models that contain exactly two predictors at the second step: $\binom{K}{2}$ of them
- And so on, until the final step where all $K$ predictors are included in the model
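As an illustration, a brute-force best subset search could look like the sketch below (assuming Python with `statsmodels`, a pandas DataFrame `X` of predictors and a response `y`; BIC is used as the criterion here, but any of the measures above would do):

```python
from itertools import combinations

import statsmodels.api as sm

def best_subset(X, y):
    """Fit an OLS model for every non-empty subset of the columns of X
    and return the fit with the lowest BIC (2^K - 1 fits in total)."""
    best_fit, best_vars = None, None
    cols = list(X.columns)
    for k in range(1, len(cols) + 1):
        for subset in combinations(cols, k):
            design = sm.add_constant(X[list(subset)])  # add the intercept
            fit = sm.OLS(y, design).fit()
            if best_fit is None or fit.bic < best_fit.bic:
                best_fit, best_vars = fit, subset
    return best_fit, best_vars

# e.g. with the toy data from the earlier sketch:
# fit, vars_ = best_subset(df[["experience", "gender", "race"]], df["salary"])
```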
Given $K$ explanatory variables, you will have $2^K$ different models to compare, which quickly becomes large: for example, $2^{11} = 2048$. For computational reasons, best subset selection cannot be applied when $K$ is large. You can use instead:
Forward stepwise selection
Forward stepwise selection begins with a model containing no predictors and then adds predictors one at a time. At each step, the variable that gives the greatest additional improvement to the fit is added to the model.
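A forward stepwise sketch under the same assumptions (Python, `statsmodels`, predictor DataFrame `X`, response `y`, BIC as the selection and stopping criterion) could be:

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y):
    """Greedily add, one at a time, the predictor that most improves BIC;
    stop when no remaining predictor improves the criterion."""
    remaining, selected = list(X.columns), []
    best_bic = sm.OLS(y, np.ones(len(y))).fit().bic   # intercept-only model
    while remaining:
        # score each candidate addition by the BIC of the enlarged model
        scores = [(sm.OLS(y, sm.add_constant(X[selected + [col]])).fit().bic, col)
                  for col in remaining]
        bic, col = min(scores)
        if bic >= best_bic:
            break                                     # no further improvement
        best_bic = bic
        selected.append(col)
        remaining.remove(col)
    return selected
```

Rather than $2^K$ fits, this requires at most $K(K+1)/2$ fits, which is why it scales to larger numbers of predictors.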
Backward stepwise, hybrids and many more...
In practice
Since you have a small number of explanatory variables, I would perform a best subset procedure, compare all the models using the various criteria mentioned above, and then see whether your preferred variables are indeed statistically significant and/or lead to the best model.
Don't forget to check for multicollinearity, report confidence intervals, and validate the linear regression assumptions, such as normality of the errors and homoscedasticity (constant error variance).
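A quick diagnostic pass on the chosen model could look like this sketch (again Python with `statsmodels`/`scipy`, reusing the toy `df` and placeholder column names from the first sketch):

```python
import scipy.stats as stats
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Design matrix (with intercept) and fit of the chosen model
design = sm.add_constant(df[["experience", "gender", "race"]])
fit = sm.OLS(df["salary"], design).fit()

# Multicollinearity: variance inflation factor for each predictor
vifs = {col: variance_inflation_factor(design.values, i)
        for i, col in enumerate(design.columns) if col != "const"}
print(vifs)

print(fit.conf_int(alpha=0.05))             # 95% confidence intervals for coefficients
print(stats.shapiro(fit.resid))             # normality of residuals (Shapiro-Wilk)
print(het_breuschpagan(fit.resid, design))  # homoscedasticity (Breusch-Pagan)
```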
Here is a graph of what best subset selection could look like using the Residual Sum of Squares (RSS) and $R^2$ criteria:

And here is what a forward stepwise selection would look like on the same data:

Code can be found here