Why can't we use something like this instead when we're doing a regression?
You can, but you have to be aware of how to use it in your model.
In your example you assign integers to each distinct value of a categorical variable, like class or location. A major issue with this is that it implies an ordering of the data that does not necessarily exist.
For class an ordering like
- child
- adult
- senior
might work in the sense of an age order. If we regard age, the expression child < adult < senior (1 < 2 < 3) is true.
What about location? Consider these example expressions:
- London > Glasgow
- Birmingham < Coventry
While there are causal setups in which one could frame such orders, their use should be explained very explicitly and applied with caution.
I want to elaborate on this in the context of regression.
Consider a linear regression model of the above data of the form
$$\mathrm{income} = \beta_0 + \beta_1 \mathrm{class} + \beta_2 \mathrm{location}$$
Let's say that after fitting this model you inspect its parameters and find that the estimate of $\beta_1$ is 13000. All other things being equal, an increase of class by 1 unit therefore corresponds to an increase in income of 13000 units. With the ordering suggested above, this is interpretable. It would be much less so with an ordering like the one you listed, 1=adult, 2=senior, 3=child.
Some categorical variables might not even have an identifiable ordering like this, in which case incorporating the data in your model this way is confusing at best. This might well be the case with location.
However, such a procedure should not be discarded altogether; it just has to be used differently from the standard approach shown above for numeric variables.
You can use an integer mapping of these categorical variables, but in a particular way. Let's revisit the above regression model, but in a form that uses the categorical variables differently:
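To make the effect of the integer assignment concrete, here is a minimal sketch with made-up data. It assumes income rises by exactly 13000 per step from child to adult to senior, leaves out location for brevity, and fits the slope with NumPy's least-squares solver. The names `classes`, `step` and `fit_slope` are hypothetical and only serve the illustration.

```python
import numpy as np

# Made-up toy data: income rises by 13000 per step child -> adult -> senior.
classes = np.array(["child", "adult", "senior"] * 4)
step = {"child": 1, "adult": 2, "senior": 3}
income = np.array([10000 + 13000 * step[c] for c in classes], dtype=float)

def fit_slope(mapping):
    """Fit income = b0 + b1 * code by least squares.

    Returns (b0, b1, residual sum of squares).
    """
    codes = np.array([mapping[c] for c in classes], dtype=float)
    X = np.column_stack([np.ones_like(codes), codes])
    beta, *_ = np.linalg.lstsq(X, income, rcond=None)
    rss = float(np.sum((income - X @ beta) ** 2))
    return float(beta[0]), float(beta[1]), rss

# Coding that follows the age order.
ordered = fit_slope({"child": 1, "adult": 2, "senior": 3})
# Coding in an arbitrary order.
scrambled = fit_slope({"adult": 1, "senior": 2, "child": 3})

print("ordered:  ", ordered)
print("scrambled:", scrambled)
```

With the age-ordered coding the slope recovers the 13000 per step and the fit is exact. With the scrambled coding the same data yield a different slope and a non-zero residual, because a single straight line through the codes can no longer describe the three group means.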
$$\mathrm{income} = \beta_0 + \beta_{1,class} + \beta_{2,location}$$
Here, $class$ and $location$ are the integer versions of your categorical features, and $\beta_1$ and $\beta_2$ are vectors that contain a separate parameter value for each distinct value of the respective variable. The interpretation shifts from the slope coefficient above to more of a category-specific intercept: a change in the income if e.g. class=adult. In this form, class and location act as indices that select which parameter applies, so the numeric order of the integers carries no meaning.
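A minimal sketch of this use of the integers, assuming hypothetical codings and made-up parameter values (none of the numbers come from a fitted model):

```python
import numpy as np

# Integer codes are used only as positions in the parameter vectors.
class_idx = np.array([0, 1, 2, 1])     # hypothetical: 0=child, 1=adult, 2=senior
location_idx = np.array([0, 0, 1, 2])  # hypothetical: 0=London, 1=Glasgow, 2=Birmingham

beta0 = 20000.0
beta1 = np.array([0.0, 13000.0, 9000.0])   # one made-up value per class
beta2 = np.array([0.0, -2000.0, -1000.0])  # one made-up value per location

# income = beta0 + beta1[class] + beta2[location]
predicted = beta0 + beta1[class_idx] + beta2[location_idx]

# Relabel the classes (child->2, adult->0, senior->1) and move the
# parameters along with them: the predictions do not change.
relabel = np.array([2, 0, 1])
beta1_relabelled = np.empty_like(beta1)
beta1_relabelled[relabel] = beta1
predicted_relabelled = beta0 + beta1_relabelled[relabel[class_idx]] + beta2[location_idx]

print(predicted)
print(np.array_equal(predicted, predicted_relabelled))
```

Because the integers are only looked up and never multiplied by a coefficient, any relabelling of the categories gives the same model as long as the parameters are moved accordingly.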
Note that this is in principle the same as it would be with a dummy encoding.
In the model
$$\mathrm{income} = \beta_0 + \beta_1 adult + \beta_2 senior + ...$$
where every distinct value of the categorical variables gets its own term in the model with its own coefficient (leaving out one of them), the coefficients take on the meaning of a change relative to the omitted category if e.g. adult=1.
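As an illustration of the dummy-encoded form, the following sketch fits the class part of the model on made-up incomes with NumPy's least-squares solver. It assumes child is the omitted category and ignores location; the data and variable names are hypothetical.

```python
import numpy as np

# Made-up toy data, two observations per class.
classes = np.array(["child", "adult", "senior", "child", "adult", "senior"])
income = np.array([9000.0, 31000.0, 27000.0, 11000.0, 35000.0, 29000.0])

# One 0/1 column per category, leaving out "child" as the reference.
adult = (classes == "adult").astype(float)
senior = (classes == "senior").astype(float)
X = np.column_stack([np.ones(len(classes)), adult, senior])

beta, *_ = np.linalg.lstsq(X, income, rcond=None)
b0, b_adult, b_senior = (float(b) for b in beta)

# Group means for comparison with the fitted coefficients.
mean_child = income[classes == "child"].mean()
mean_adult = income[classes == "adult"].mean()
mean_senior = income[classes == "senior"].mean()

print(b0, b_adult, b_senior)
print(mean_child, mean_adult - mean_child, mean_senior - mean_child)
```

In this setup the intercept equals the mean income of the omitted category, and each dummy coefficient equals the difference between that category's mean and the reference mean, which is the "relative change" reading described above.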