Some Plots and A Sprinkle of Math
Your question seems to be specifically about why it matters. I finish this answer with the most practical part, but I think this is also where data and theory tend to merge. Let's say you have a theory that drinking lots of alcohol makes you sleep more on average. Your theory stipulates that alcohol should be the predictor. It should be apparent that sleeping doesn't make you drink more, but a regression may make it seem so if there is a strong enough correlation between the two variables.
If we think of the simplest algebraic reason why, we have to remember that the estimated outcome is a function of x, or $y=f(x)$. As such, there is a relationship we are trying our best to capture in a regression that predicts y, and we can only do that by specifying a relationship that makes sense. Let's say we have a variable x that has a negative parabolic relationship with y (as x increases, y first increases, then levels off, then decreases). Using R, we can simulate this below:
#### Load Libraries ####
library(tidyverse)

#### Create Estimated Y Dependent on X ####
y.hat <- function(x){
  y <- -x^2
  return(y)
}

#### Make Random 1000 Values of X ####
x <- rnorm(n = 1000)

#### Convert X Values to Their Y Value ####
y <- y.hat(x)

#### Combine Into a Data Frame for Plotting ####
df <- tibble(x, y)

#### Plot ####
df %>%
  ggplot(aes(x, y)) +
  geom_point() +
  labs(x = "Age",
       y = "Memory",
       title = "Conditional Relationship of Age and Memory",
       subtitle = "y = -x^2") +
  geom_smooth(method = "loess") +
  theme_bw() +
  theme(axis.text = element_blank(),
        plot.subtitle = element_text(face = "italic"))
Ignoring the x and y axis values as well as the actual raw data because I'm being lazy here, we know that theoretically this relationship should exist between age and short-term memory...when we are very young we have terrible memory and when we are old we have terrible memory, but memory improves and stabilizes as we approach a certain age in between. Fitting a loess regression line here emulates this relationship almost perfectly:
[Plot: Age (x) vs. Memory (y) with a loess fit tracing the inverted parabola]
Flipping it would make no sense...while memory is likely predictive of age, you can see here that plotting it makes this relationship visually confusing:
[Plot: the same data with the axes flipped, Memory (x) vs. Age (y), with a loess fit]
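In case it helps to reproduce that flipped plot, a minimal sketch (assuming the same simulated df from above, with purely illustrative labels) just swaps the aesthetics:

#### Flipped Plot: Memory Predicting Age ####
df %>%
  ggplot(aes(y, x)) +           # y (Memory) on the horizontal axis, x (Age) on the vertical
  geom_point() +
  labs(x = "Memory",
       y = "Age",
       title = "Flipped Relationship of Memory and Age") +
  geom_smooth(method = "loess") +
  theme_bw() +
  theme(axis.text = element_blank())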
This is additionally why how you fit this relationship matters. If we fit a vanilla straight-line regression to it, the result would be similarly erroneous because the relationship between the two is parabolic, not linear like a typical regression line assumes:
[Plot: Age (x) vs. Memory (y) with a straight regression line that misses the curvature]
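A quick sketch of that straight-line fit, again assuming the same df, just swaps the smoother to method = "lm":

#### Vanilla Straight-Line Fit to a Parabolic Relationship ####
df %>%
  ggplot(aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm") +  # fits a straight line, which cannot capture the parabola
  theme_bw() +
  theme(axis.text = element_blank())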
The Practical Side
Additionally, if we run a regression where x is modeled as a parabolic function of y, the result would almost certainly be wrong if we keep the data the same.
#### Incorrect Fit to Data ####
false.fit <- lm(x ~ poly(-y^2),
                data = df)
summary(false.fit)

#### Correct Fit to Data ####
true.fit <- lm(y ~ poly(-x^2),
               data = df)
summary(true.fit)
Check out the comparison between regressing on a squared y versus a squared x...the first is barely predictive at all, as the R-squared is nearly zero:
Call:
lm(formula = x ~ poly(-y^2), data = df)
Residuals:
Min 1Q Median 3Q Max
-5.1529 -0.6162 0.0059 0.7233 1.8945
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.06526 0.03101 2.104 0.035613 *
poly(-y^2) -3.76537 0.98074 -3.839 0.000131 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9807 on 998 degrees of freedom
Multiple R-squared: 0.01455, Adjusted R-squared: 0.01357
F-statistic: 14.74 on 1 and 998 DF, p-value: 0.0001311
Versus the true fit here, where the R-squared is exactly 1. R even gives a warning that the fit is too perfect because...it is:
Call:
lm(formula = y ~ poly(-x^2), data = df)
Residuals:
Min 1Q Median 3Q Max
-1.018e-15 -1.190e-16 -6.500e-17 -1.400e-17 3.391e-14
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.784e-01 4.400e-17 -2.224e+16 <2e-16 ***
poly(-x^2) 4.214e+01 1.391e-15 3.029e+16 <2e-16 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.391e-15 on 998 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 9.172e+32 on 1 and 998 DF, p-value: < 2.2e-16
Warning message:
In summary.lm(true.fit) :
essentially perfect fit: summary may be unreliable
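If you'd rather not eyeball the two summaries, one small sketch (using the fits defined above) is to extract the R-squared values directly and compare them side by side:

#### Compare R-Squared of Both Fits ####
summary(false.fit)$r.squared  # roughly 0.015 when x is regressed on a function of y
summary(true.fit)$r.squared   # exactly 1 when y is regressed on a function of x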
You may now be asking yourself why any of this matters in a real-world scenario. Going back to the alcohol and sleep example...let's say we found a strong correlation between the two, and a regression is shown to be significant when alcohol is the dependent variable. Therapists around the country might then incorrectly conclude that they should decrease alcohol consumption by reducing people's sleep rather than reducing alcohol consumption in other ways. This would likely have the opposite of the intended effect...as people sleep less they become more stressed and thus drink more. This could potentially lead to even less sleep on average as well, because chaotic sleep mixed with more alcohol would probably have compounding effects on sleep quality. That could have catastrophic outcomes that lead to injury, spousal abuse, and a litany of other alcohol-related social harms. So from a practical perspective, this actually matters a lot.
Summary
In summary, the direction of this relationship matters for three reasons: first, your predictor should be theoretically valid. Second, your regression results are contingent on how the two variables are actually related. Third, there are real-world outcomes at stake that depend on specifying this predictive relationship sensibly.