I'm reading the first chapter of Understanding Machine Learning: From Theory to Algorithms (Shalev-Shwartz and Ben-David), where the authors write:
Let $H_B$ be the set of "bad" hypotheses, that is
$H_B=\left\{h \in H : L_{(D,f)}(h) \gt \epsilon\right\}$ (where $\epsilon$ is the accuracy parameter)
Let
$M=\left\{S|_x : \exists\, h \in H_B, L_S(h) = 0 \right\}$
be the set of misleading samples: namely, for every $S|_x \in M$, there is a "bad" hypothesis, $h \in H_B$, that looks like a "good" hypothesis on $S|_x$. Now, recall that we would like to bound the probability of the event $L_{(D,f)}(h_S) \gt \epsilon$. But, since the realizability assumption implies that $L_S(h_S)=0$, it follows that the event $L_{(D,f)}(h_S) \gt \epsilon$ can only happen if for some $h \in H_B$ we have $L_S(h) = 0$. In other words, this event will only happen if our sample is in the set of misleading samples $M$. Formally, we have shown that
$\left\{S|_x : L_{(D,f)}(h_S) \gt \epsilon \right\} \subseteq M$
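Here is my own attempt to unpack the inclusion, assuming $h_S$ denotes the hypothesis that the ERM rule returns on the sample $S$ (please correct me if this reading is wrong):

$$L_{(D,f)}(h_S) \gt \epsilon \;\Rightarrow\; h_S \in H_B \quad \text{(by the definition of } H_B\text{)}$$
$$L_S(h_S) = 0 \quad \text{(by realizability, ERM achieves zero empirical error)}$$
$$\Rightarrow\; \exists\, h \in H_B \text{ with } L_S(h) = 0 \quad \text{(take } h = h_S\text{)}$$
$$\Rightarrow\; S|_x \in M \quad \text{(by the definition of } M\text{)}$$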
Is that the right way to read it? I'm still confused about this conclusion, so if someone could confirm the reasoning or explain where it goes wrong, I'd really appreciate it. Thanks for your time!