This is a great question because it explores the possibility of alternative procedures and asks us to think about why and how one procedure might be superior to another.
The short answer is that there are infinitely many ways we might devise a procedure to obtain a lower confidence limit for the mean, but some of these are better and some are worse (in a sense that is meaningful and well-defined). Option 2 is an excellent procedure, because a person using it would need to collect less than half as much data as a person using Option 1 in order to obtain results of comparable quality. Half as much data typically means half the budget and half the time, so we're talking about a substantial and economically important difference. This supplies a concrete demonstration of the value of statistical theory.
Rather than rehash the theory, of which many excellent textbook accounts exist, let's quickly explore three lower confidence limit (LCL) procedures for $n$ independent normal variates of known standard deviation. I chose three natural and promising ones suggested by the question. Each of them is determined by a desired confidence level $1-\alpha$:
- Option 1a, the "min" procedure. The lower confidence limit is set equal to $t_{\min} = \min(X_1, X_2, \ldots, X_n) - k^{\min}_{\alpha, n, \sigma} \sigma$. The value of the number $k^{\min}_{\alpha, n, \sigma}$ is determined so that the chance that $t_{\min}$ will exceed the true mean $\mu$ is just $\alpha$; that is, $\Pr(t_{\min} \gt \mu) = \alpha$.

- Option 1b, the "max" procedure. The lower confidence limit is set equal to $t_{\max} = \max(X_1, X_2, \ldots, X_n) - k^{\max}_{\alpha, n, \sigma} \sigma$. The value of the number $k^{\max}_{\alpha, n, \sigma}$ is determined so that the chance that $t_{\max}$ will exceed the true mean $\mu$ is just $\alpha$; that is, $\Pr(t_{\max} \gt \mu) = \alpha$.

- Option 2, the "mean" procedure. The lower confidence limit is set equal to $t_\text{mean} = \text{mean}(X_1, X_2, \ldots, X_n) - k^\text{mean}_{\alpha, n, \sigma} \sigma$. The value of the number $k^\text{mean}_{\alpha, n, \sigma}$ is determined so that the chance that $t_\text{mean}$ will exceed the true mean $\mu$ is just $\alpha$; that is, $\Pr(t_\text{mean} \gt \mu) = \alpha$.
As is well known, $k^\text{mean}_{\alpha, n, \sigma} = z_\alpha/\sqrt{n}$ where $\Phi(z_\alpha) = 1-\alpha$; $\Phi$ is the cumulative probability function of the standard Normal distribution. This is the formula cited in the question. A mathematical shorthand is
$$k^\text{mean}_{\alpha, n, \sigma} = \Phi^{-1}(1-\alpha)/\sqrt{n}.$$
The formulas for the min and max procedures are less well known but easy to determine:
$$k^\text{min}_{\alpha,n,\sigma} = \Phi^{-1}(1-\alpha^{1/n}), \qquad k^\text{max}_{\alpha, n, \sigma} = \Phi^{-1}\bigl((1-\alpha)^{1/n}\bigr).$$
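To see where these come from, take $\mu=0$ and $\sigma=1$ (which suffices, as explained below) and write $k^{\min}$ for $k^\text{min}_{\alpha,n,\sigma}$. For the min procedure,

$$\Pr(t_{\min} \gt 0) = \Pr\bigl(\min(X_1, \ldots, X_n) \gt k^{\min}\bigr) = \prod_{i=1}^n \Pr(X_i \gt k^{\min}) = \bigl(1 - \Phi(k^{\min})\bigr)^n,$$

and setting this equal to $\alpha$ gives $k^{\min} = \Phi^{-1}(1-\alpha^{1/n})$. The max formula follows in the same way from $\Pr\bigl(\max(X_1, \ldots, X_n) \gt k^{\max}\bigr) = 1 - \Phi(k^{\max})^n = \alpha$.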
By means of a simulation, we can see that all three formulas work. The following R code conducts the experiment n.trials separate times and reports all three LCLs for each trial:
simulate <- function(n.trials=100, alpha=.05, n=5) {
  # Critical values for the three procedures (computed for mu = 0, sigma = 1)
  z.min <- qnorm(1 - alpha^(1/n))
  z.mean <- qnorm(1 - alpha) / sqrt(n)
  z.max <- qnorm((1 - alpha)^(1/n))
  # One trial: draw a sample of size n and compute all three LCLs
  f <- function() {
    x <- rnorm(n)
    c(max=max(x) - z.max, min=min(x) - z.min, mean=mean(x) - z.mean)
  }
  replicate(n.trials, f())  # one row per procedure, one column per trial
}
(The code does not bother to work with general normal distributions: because we are free to choose the units of measurement and the zero of the measurement scale, it suffices to study the case $\mu=0$, $\sigma=1$. That is why none of the formulas for the various $k^*_{\alpha,n,\sigma}$ actually depend on $\sigma$.)
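On the original measurement scale, each procedure is applied by multiplying $k^*_{\alpha,n,\sigma}$ by $\sigma$ and subtracting it from the relevant statistic. As a minimal sketch for the mean procedure (lcl.mean and the example data are invented here for illustration, not part of the simulation):

lcl.mean <- function(x, sigma, alpha=.05) {
  # Lower confidence limit via the "mean" procedure for data x with known sigma
  mean(x) - qnorm(1 - alpha) * sigma / sqrt(length(x))
}
lcl.mean(rnorm(5, mean=10, sd=2), sigma=2)  # example: n = 5 draws from Normal(10, 2)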
10,000 trials will provide sufficient accuracy. Let's run the simulation and calculate the frequency with which each procedure fails to produce a confidence limit less than the true mean:
set.seed(17)
sim <- simulate(10000, alpha=.05, n=5)
apply(sim > 0, 1, mean)  # frequency with which each LCL exceeds the true mean of 0
The output is
   max    min   mean 
0.0515 0.0527 0.0520
These frequencies are close enough to the stipulated value of $\alpha=.05$ that we can be satisfied all three procedures work as advertised: each of them produces a 95% lower confidence limit for the mean.
(If you're concerned that these frequencies differ slightly from $.05$, you can run more trials. With a million trials, they come even closer to $.05$: $(0.050547, 0.049877, 0.050274)$.)
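That larger check is just a bigger call to the same simulate function; here is a sketch (sim.big is only an illustrative name, and the exact frequencies will vary with the random seed):

# Re-run the experiment with a million trials; this takes longer but
# pins down the coverage frequencies more precisely
sim.big <- simulate(1e6, alpha=.05, n=5)
apply(sim.big > 0, 1, mean)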
However, one thing we would like about any LCL procedure is that not only should it be correct the intended proportion of time, but it should tend to be close to correct. For instance, imagine a (hypothetical) statistician who, by virtue of a deep religious sensibility, can consult the Delphic oracle (of Apollo) instead of collecting the data $X_1, X_2, \ldots, X_n$ and doing an LCL computation. When she asks the god for a 95% LCL, the god will just divine the true mean and tell that to her--after all, he's perfect. But, because the god does not wish to share his abilities fully with mankind (which must remain fallible), 5% of the time he will give an LCL that is $100\sigma$ too high. This Delphic procedure is also a 95% LCL--but it would be a scary one to use in practice due to the risk of it producing a truly horrible bound.
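To make that point concrete, here is a quick sketch of the imaginary Delphic procedure in the same standardized setting ($\mu=0$, $\sigma=1$); delphic is just an illustrative name:

# The hypothetical Delphic "LCL": the true mean (0) in 95% of consultations,
# 100*sigma too high in the remaining 5%
delphic <- ifelse(runif(10000) < 0.95, 0, 100)
mean(delphic > 0)  # fails to underestimate the mean about 5% of the time
sd(delphic)        # yet its spread is enormous, around 22

It has the advertised 95% coverage, yet its spread dwarfs that of any procedure based on the data.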
We can assess how accurate our three LCL procedures tend to be. A good way is to look at their sampling distributions; histograms of many simulated values serve just as well. Here they are. First, though, the code to produce them:
dx <- -min(sim)/12
breaks <- seq(from=min(sim), to=max(sim)+dx, by=dx)  # common bins for all three panels
par(mfcol=c(1,3))  # three panels side by side
tmp <- sapply(c("min", "max", "mean"), function(s) {
  # Full sampling distribution of the LCL for this procedure...
  hist(sim[s,], breaks=breaks, col="#70C0E0",
       main=paste("Histogram of", s, "procedure"),
       yaxt="n", ylab="", xlab="LCL")
  # ...with the failures (LCL > true mean of 0) overplotted in red
  hist(sim[s, sim[s,] > 0], breaks=breaks, col="Red", add=TRUE)
})

They are shown on identical x axes (but slightly different vertical axes). What we are interested in are:

- The red portions to the right of $0$--whose areas represent the frequency with which the procedures fail to underestimate the mean--are all about equal to the desired amount, $\alpha=.05$. (We had already confirmed that numerically.)

- The spreads of the simulation results. Evidently, the rightmost histogram is narrower than the other two: it describes a procedure that indeed underestimates the mean (equal to $0$) fully $95$% of the time, and when it does, that underestimate is almost always within $2\sigma$ of the true mean. The other two histograms have a propensity to underestimate the true mean by a little more, out to about $3\sigma$ too low. Also, when they overestimate the true mean, they tend to overestimate it by more than the rightmost procedure does. These qualities make the min and max procedures inferior.
The rightmost histogram describes Option 2, the conventional LCL procedure.
One measure of these spreads is the standard deviation of the simulation results:
> apply(sim, 1, sd)
     max      min     mean 
0.673834 0.677219 0.453829
These numbers tell us that the max and min procedures have equal spreads (of about $0.68$) and the usual, mean, procedure has only about two-thirds their spread (of about $0.45$). This confirms the evidence of our eyes.
The squares of the standard deviations are the variances, equal to $0.45$, $0.46$, and $0.21$, respectively. The variances can be related to the amount of data: if one analyst recommends the max (or min) procedure, then in order to achieve the narrow spread exhibited by the usual procedure, their client would have to obtain $0.45/0.21$ times as much data--over twice as much. In other words, by using Option 1 you would be paying more than twice as much for your information as you would by using Option 2.
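That data requirement can be read directly off the simulation output; here is a short calculation of the variance ratios, reusing the sim object from above (sds is just a convenience name):

# Ratio of variances: how much more data the min and max procedures
# need in order to match the spread of the mean procedure
sds <- apply(sim, 1, sd)
(sds / sds["mean"])^2

Both ratios come out a little above $2$, which is the "more than twice as much data" figure quoted above. (As a check, the spread of the mean procedure itself agrees with the theoretical value $\sigma/\sqrt{n} = 1/\sqrt{5} \approx 0.447$.)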