There is no error, but there is a subtlety. Note: in the second edition of ISLR, model selection is discussed on pages 232-235 [1].
Let's start by deriving the log-likelihood for linear regression, since it is at the heart of this question.
The likelihood is a product of Normal densities. Evaluated at the MLE:
$$
\hat{L} = \prod_{i=1}^n\frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\exp\left\{-\frac{(y_i - \hat{y}_i)^2}{2\hat{\sigma}^2}\right\}
$$
where $n$ is the number of data points and $\hat{y}_i$ is the prediction, so $y_i - \hat{y}_i$ is the residual.
We take the log and keep track of constants as they are important later on.
$$
\log(\hat{L}) = -\frac{n}{2}\log(2\pi\hat{\sigma}^2) - \sum_{i=1}^n \frac{(y_i - \hat{y}_i)^2}{2\hat{\sigma}^2} = -\frac{n}{2}\log(2\pi\hat{\sigma}^2) - \frac{RSS}{2\hat{\sigma}^2}
$$
where $RSS = \sum_{i=1}^n (y_i - \hat{y}_i)^2$ is the residual sum of squares.
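As a quick numerical sanity check (this is my own toy example in Python/numpy, not code from ISLR), we can fit a small OLS model and confirm that summing the log Normal densities at the residuals gives exactly $-\frac{n}{2}\log(2\pi\hat{\sigma}^2) - \frac{RSS}{2\hat{\sigma}^2}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Toy data: intercept plus two predictors (purely illustrative).
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.7, size=n)

# OLS fit via least squares; residuals and RSS.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
rss = np.sum(resid**2)

sigma2_hat = rss / n  # MLE of the error variance (derived just below)

# Closed form of the maximized log-likelihood ...
loglik_closed = -n / 2 * np.log(2 * np.pi * sigma2_hat) - rss / (2 * sigma2_hat)
# ... versus summing the log Normal density over the residuals.
loglik_direct = np.sum(
    -0.5 * np.log(2 * np.pi * sigma2_hat) - resid**2 / (2 * sigma2_hat)
)

print(np.isclose(loglik_closed, loglik_direct))  # True
```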
What about the MLE $\hat{\sigma}^2$ of the error variance $\sigma^2$? It's also a function of the RSS.
$$
\hat{\sigma}^2 = \frac{RSS}{n}
$$
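If you want to convince yourself that $RSS/n$ really is the maximizer, a crude grid search over candidate values of $\sigma^2$ (again just an illustrative sketch with made-up data) peaks right at $RSS/n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Same kind of toy OLS setup as above.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.7, size=n)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)

# Profile log-likelihood in sigma^2, with beta fixed at the OLS estimate.
def loglik(sigma2):
    return -n / 2 * np.log(2 * np.pi * sigma2) - rss / (2 * sigma2)

grid = np.linspace(0.1, 2.0, 10_000)
print(grid[np.argmax(loglik(grid))])  # grid point where the log-likelihood peaks ...
print(rss / n)                        # ... is (essentially) the MLE RSS/n
```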
And here is the subtle point: for model selection with AIC and BIC, ISLR uses the $\hat{\sigma}^2$ from the full model to compare all nested models. Let's call this error variance estimate $\hat{\sigma}^2_{full}$ for clarity.
Finally, we write down the Bayesian information criterion (BIC), where $d$ is the number of fixed effects.
$$
\begin{aligned}
BIC &= -2 \log(\hat{L}) + \log(n)d = n\log(2\pi\hat{\sigma}^2_{full}) + \frac{RSS}{\hat{\sigma}^2_{full}} + \log(n)d \\
&= c_0 + c_1\left(RSS + \log(n)\,d\,\hat{\sigma}^2_{full}\right)
\end{aligned}
$$
This is Equation (6.3) in ISLR up to two constants, $c_0 = n\log(2\pi\hat{\sigma}^2_{full})$ and $c_1 = \hat{\sigma}^{-2}_{full}$, that are the same for all models under consideration. ISLR also divides the BIC by the sample size $n$.
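Here is a minimal sketch of this version in Python (my own invented toy data and nested models, not ISLR's code; I take $d$ to be the number of fitted coefficients including the intercept, so adapt it to your convention). It checks that $-2\log(\hat{L}) + \log(n)d$ with $\hat{\sigma}^2_{full}$ plugged in equals $c_0 + c_1\left(RSS + \log(n)\,d\,\hat{\sigma}^2_{full}\right)$ for every nested model:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Full design: intercept plus three predictors (toy data).
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X_full @ np.array([1.0, 2.0, -0.5, 0.0]) + rng.normal(scale=0.7, size=n)

def rss_of(X):
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta_hat) ** 2)

# Error variance estimated once, from the full model.
sigma2_full = rss_of(X_full) / n

# Constants shared by all models under consideration.
c0 = n * np.log(2 * np.pi * sigma2_full)
c1 = 1 / sigma2_full

for d in range(1, X_full.shape[1] + 1):       # nested models: first d columns
    X = X_full[:, :d]
    rss = rss_of(X)
    bic      = n * np.log(2 * np.pi * sigma2_full) + rss / sigma2_full + np.log(n) * d
    bic_islr = c0 + c1 * (rss + np.log(n) * d * sigma2_full)
    print(d, np.isclose(bic, bic_islr))       # True for every d
```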
What if we want to estimate $\sigma^2$ separately for each model? Then we plug in the MLE $\hat{\sigma}^2 = RSS/n$ and obtain the "more popular" formulation. We add 1 to the number of parameters because we now estimate the error variance in addition to the $d$ fixed effects.
$$
\begin{aligned}
BIC &= n\log(2\pi\hat{\sigma}^2) + \frac{RSS}{\hat{\sigma}^2} + \log(n)(d+1) \\
&= n\log(2\pi\, RSS/n) + n + \log(n)(d+1) \\
&= c^*_0 + n\log(RSS) + \log(n)(d+1)
\end{aligned}
$$
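And the matching sketch for the per-model version (same caveats as above: invented toy data, my own Python): each model now plugs in its own $\hat{\sigma}^2 = RSS/n$, and the result agrees with $n\log(RSS) + \log(n)(d+1)$ up to the constant $c^*_0 = n\log(2\pi/n) + n$, which is the same for every model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Same toy setup: intercept plus three predictors.
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X_full @ np.array([1.0, 2.0, -0.5, 0.0]) + rng.normal(scale=0.7, size=n)

c0_star = n * np.log(2 * np.pi / n) + n   # model-independent constant

for d in range(1, X_full.shape[1] + 1):   # nested models: first d columns
    X = X_full[:, :d]
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_hat) ** 2)

    sigma2 = rss / n                      # per-model MLE of the error variance
    bic = n * np.log(2 * np.pi * sigma2) + rss / sigma2 + np.log(n) * (d + 1)
    bic_popular = c0_star + n * np.log(rss) + np.log(n) * (d + 1)
    print(d, np.isclose(bic, bic_popular))  # True for every d
```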
The residual sum of squares $RSS$ is the same in both versions of the BIC, because the coefficient estimates $\hat{\beta} = (X'X)^{-1}X'Y$, and hence the predictions $X\hat{\beta} = X(X'X)^{-1}X'Y$, do not depend on $\sigma^2$.
[1] G. James, D. Witten, T. Hastie, and R. Tibshirani. An Introduction to Statistical Learning with Applications in R. Springer, 2nd edition, 2021. Available online.