Since there's a good amount of discussion in the comments, I will demonstrate why the normality of the residuals is a poor measure. I will use the Shapiro-Wilk test instead of the KS test, since the KS test needs to be supplied with estimates of $\mu$ and $\sigma$.
Let's consider $Y = aX + b + \epsilon$ with $a = 1$, $b = 0$ and $\epsilon \sim \mathcal N(0,1)$, as well as $X \sim \mathcal N(0,1)$, and simulate with $n = 300$:
set.seed(112)
x <- rnorm(300)        # normal predictor
y <- x + rnorm(300)    # true model: a = 1, b = 0, standard normal errors
First, whuber's point that the mean of the residuals does not matter for normality: let us test three values of $b$ ($0$, $1$ and $99999$) while using the correct value for $a$. It doesn't make a difference.
> shapiro.test(y - x)
Shapiro-Wilk normality test
data: y - x
W = 0.99404, p-value = 0.2873
> shapiro.test(y - x - 1)
Shapiro-Wilk normality test
data: y - x - 1
W = 0.99404, p-value = 0.2873
> shapiro.test(y - x - 99999) #if you use too many 9s things get rounded
Shapiro-Wilk normality test
data: y - x - 99999
W = 0.99404, p-value = 0.2873
Now for my point: if $X$ is normal, then in a finite sample there will be some value of $a$ that maximizes the normality of the residuals, but that value is random. Indeed, since $y - ax = (1-a)X + \epsilon$ here, the residuals are exactly normal for every $a$, so the p-value curve reflects nothing but sampling noise. I'll test values from $-10$ to $10$ and plot the p-values (bigger is better), with the true value as a solid black line and the fitted regression slope as a dashed red line:
a_s <- seq(-10, 10, by = 0.1)
# Shapiro-Wilk p-value of the residuals y - a*x for each candidate slope a
res <- sapply(a_s, FUN = function(a) shapiro.test(y - a * x)$p.value)
plot(a_s, res, ylim = c(0, 1))
abline(v = 1)                                      # true slope
abline(v = coef(lm(y ~ x))[2], lty = 2, col = 2)   # OLS slope

The highest p-value lies around $-1$! If you repeat the simulation you will see a wide variety of shapes, some of them peaking around 1 but others dipping, which is the opposite of what you want.
Just run this code snippet over and over again and you will see how ill-defined this approach is:
x <- rnorm(300)
y <- x + rnorm(300)
res <- sapply(a_s, FUN = function(a) shapiro.test(y - a * x)$p.value)
plot(a_s, res, ylim = c(0, 1))
abline(v = 1)
abline(v = coef(lm(y ~ x))[2], lty = 2, col = 2)
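If you prefer not to eyeball it, here is a small sketch (my own addition, reusing the a_s grid from above; the name best_a is mine) that repeats the simulation 200 times and records which slope wins the Shapiro-Wilk contest each time. Because the residuals are normal for every $a$, you should see the winning slope scattered widely rather than concentrated at the true value $1$:

# Monte Carlo sketch: where does the Shapiro-Wilk p-value peak across replications?
best_a <- replicate(200, {
  x <- rnorm(300)
  y <- x + rnorm(300)
  p <- sapply(a_s, FUN = function(a) shapiro.test(y - a * x)$p.value)
  a_s[which.max(p)]   # slope whose residuals look "most normal" in this replication
})
hist(best_a, breaks = 30)
summary(best_a)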
If you replace x <- rnorm(300) with x <- runif(300) you will get somewhat better-behaved curves, but they will still look incredibly weird. Using a KS test only makes things worse.
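For completeness, here is a rough sketch of those two variations (my own code, again reusing a_s from above): a uniform predictor, and the KS test with $\mu$ and $\sigma$ estimated from the residuals, which is exactly the plug-in step mentioned at the top and makes the KS p-values invalid on top of everything else.

x <- runif(300)
y <- x + rnorm(300)
# Shapiro-Wilk p-values across candidate slopes
res_sw <- sapply(a_s, FUN = function(a) shapiro.test(y - a * x)$p.value)
# KS p-values with plug-in estimates of mu and sigma
res_ks <- sapply(a_s, FUN = function(a) {
  r <- y - a * x
  ks.test(r, "pnorm", mean(r), sd(r))$p.value
})
plot(a_s, res_sw, ylim = c(0, 1))
points(a_s, res_ks, col = 4)   # KS p-values in blue
abline(v = 1)
abline(v = coef(lm(y ~ x))[2], lty = 2, col = 2)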