The second answer is one of the assumptions made in linear regression. This assumption is typically written as
$\epsilon \mid X \sim N(0,\sigma^2 I)$
where $\epsilon \mid X$ denotes the error given the values of the X variables. See e.g. William Greene, Econometric Analysis, Fourth Edition, page 222. A normal distribution of all error terms considered together does not guarantee that normality also holds for each separate X value! To give an example, suppose there is only one X variable with two values, 0 and 1.
For $X=0$ the error terms come from a truncated standard normal with lower bound -1 and upper bound +1, which could be written as "truncnorm(-1,+1)". So the mean of these error terms is zero.
For $X=1$, half of the error terms come from truncnorm(-Infinity, -1) and the other half come from truncnorm(+1, +Infinity). By symmetry, the mean of these error terms is also zero.
So the error terms do not have the same distribution, or the same variance, at the two X values, but that is not the issue in this question.
The point is that the entire set of errors does have a normal distribution, provided that about 68% of the cases have $X=0$ (the probability that a standard normal variable lies between -1 and +1), whereas for each separate X value this is NOT true.
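The claim about the pooled errors can be checked with a short calculation. This is a sketch under the assumption that the share of cases with $X=0$ is exactly $p = \Phi(1)-\Phi(-1) \approx 0.6827$; the symbols $p$, $\phi$ and $\Phi$ (standard normal density and distribution function) are notation introduced here for illustration.

$$
\begin{aligned}
f(e \mid X=0) &= \frac{\phi(e)}{p}\,\mathbf{1}\{|e|<1\}, \\
f(e \mid X=1) &= \tfrac12\,\frac{\phi(e)}{\Phi(-1)}\,\mathbf{1}\{e<-1\} + \tfrac12\,\frac{\phi(e)}{1-\Phi(1)}\,\mathbf{1}\{e>1\} = \frac{\phi(e)}{1-p}\,\mathbf{1}\{|e|>1\}, \\
f(e) &= p\,f(e \mid X=0) + (1-p)\,f(e \mid X=1) = \phi(e).
\end{aligned}
$$

The second line uses $\Phi(-1) = 1-\Phi(1) = (1-p)/2$. So the marginal density is the standard normal density, while neither conditional density is normal. With rounded shares such as 0.68 and 0.32 the result holds only approximately.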
Here is an R script which generates data as in this example.
library(truncnorm)
library(ggplot2)
set.seed(12345)
n <- 10000
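# X = 0: errors from the central part of the standard normal
e0 <- rtruncnorm(0.68*n, a=-1, b=1)
# X = 1: errors from the two tails, half from each
e1 <- c(rtruncnorm(0.16*n, a=-Inf, b=-1), rtruncnorm(0.16*n, a=1, b=Inf))
e <- c(e0, e1)
x <- c(rep(0,0.68*n), rep(1,0.32*n))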
y <- 1 + x + e
model <- lm(y ~ x)
resid <- residuals(model)
hist(resid, breaks=50)
The histogram of the estimated errors (residuals) is:

For $X=0$ we get:
hist(resid[x==0], breaks=20)

For $X=1$ we get:
hist(resid[x==1], breaks=20)
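
The visual impression from the histograms could also be backed up with a formal test. The following is only a sketch, not part of the original script: it regenerates the errors in base R with a hypothetical inverse-CDF helper `rtnorm` (standing in for `rtruncnorm`), uses the exact share `pnorm(1) - pnorm(-1)` instead of 0.68, and applies Kolmogorov-Smirnov tests. Plugging the sample standard deviation into the per-group tests is an assumption made for simplicity and makes those p-values approximate.

```r
# Sketch in base R only; rtnorm is a hypothetical helper, not a package function.
set.seed(12345)
n  <- 10000
p0 <- pnorm(1) - pnorm(-1)   # P(|Z| < 1), about 0.6827
n0 <- round(p0 * n)          # number of cases with X = 0
n1 <- n - n0                 # number of cases with X = 1

# Draw m values from a standard normal truncated to (a, b) via the inverse CDF
rtnorm <- function(m, a, b) qnorm(runif(m, pnorm(a), pnorm(b)))

e0 <- rtnorm(n0, -1, 1)                      # errors for X = 0
n_low <- n1 %/% 2                            # half of the X = 1 cases in the lower tail
e1 <- c(rtnorm(n_low, -Inf, -1),
        rtnorm(n1 - n_low, 1, Inf))          # errors for X = 1
e  <- c(e0, e1)
x  <- rep(c(0, 1), times = c(n0, n1))

# Pooled errors against N(0, 1); each group against a normal with its own sd
p_all <- ks.test(e,  "pnorm", 0, 1)$p.value
p_x0  <- ks.test(e0, "pnorm", 0, sd(e0))$p.value
p_x1  <- ks.test(e1, "pnorm", 0, sd(e1))$p.value
print(c(pooled = p_all, x0 = p_x0, x1 = p_x1))
```

Under these assumptions the two per-group p-values should be very small, whereas the pooled test has no systematic reason to reject normality.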

The idea that the errors for each separate X value should be normally distributed is sometimes graphically shown as follows:

For each separate (combination of) X value(s) there is one and the same normal distribution from which the data are randomly and independently drawn. That is the idea behind "ordinary" regression. So, it is not enough to formulate the condition as: "across all cases the errors should be normally distributed".