I fit a linear regression in RStudio to predict the price of a car from its year of manufacture. The data set is called "audi", and the model looks like this:
library(tidyverse)
library(modelr)
...
model_price_Year <- lm(data = audi, price ~ year)
summary(model_price_Year)
The result of the summary is this:
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.437e+06  8.503e+04  -75.71   <2e-16
year         3.203e+03  4.215e+01   75.98   <2e-16
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9437 on 10666 degrees of freedom
Multiple R-squared: 0.3512, Adjusted R-squared: 0.3511
F-statistic: 5772 on 1 and 10666 DF, p-value: < 2.2e-16
Then I made a grid and added predictions for 100 evenly spaced values of year. It looks like this:
grid_year <- audi %>%
  data_grid(year = seq_range(year, 100)) %>%
  add_predictions(model_price_Year, "price")
When I print the results, the first rows look like this:
    year   price
   <dbl>   <dbl>
 1 1997  -41481.
 2 1997. -40737.
 3 1997. -39993.
 4 1998. -39249.
 5 1998. -38505.
 6 1998. -37761.
 7 1998. -37017.
 8 1999. -36273.
 9 1999. -35529.
10 1999. -34785.
The predicted prices in these rows are all negative, and because we are talking about price, that doesn't really make sense. Why are they negative? How do I interpret this?
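
As a sanity check, here is a minimal sketch that recomputes the first prediction by hand from the coefficients printed in the summary. It assumes the rounded values shown there (four significant figures), so it only approximately matches the grid output. The names `intercept`, `slope`, `pred_1997`, and `zero_year` are illustrative and do not appear in my original code.

```r
# Coefficients copied from the summary output (rounded, so results are approximate)
intercept <- -6.437e6
slope     <-  3.203e3

# Fitted value of the line price = intercept + slope * year at year = 1997
pred_1997 <- intercept + slope * 1997
print(pred_1997)  # about -40609, close to the -41481 in the grid (gap is from rounding)

# Year at which the fitted straight line crosses price = 0
zero_year <- -intercept / slope
print(zero_year)  # a little before 2010 with these rounded coefficients
```

So the negative values come straight from the fitted straight line: with a slope of roughly 3,200 per year and this intercept, the line sits below zero for every year before roughly 2010.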


R! Grab pencil and paper or chalk and chalkboard if you must. – whuber May 09 '22 at 15:45