There have been various posts on this topic, but they don't really discuss the intuition behind the benefits of stochastic local volatility (SLV) models over pure stochastic volatility (SV) models.
In other words:
Why does including the leverage function $L(S_t,t)$ in the SV model capture the whole volatility surface? Why can't $\rho$ in the SV model be calibrated to reproduce the effect of $L(S_t,t)$?
Why does omitting $L(S_t,t)$ (that is, using a pure SV model) lead to mispricing of deep ITM/OTM options and exotics?
If SLV models have these benefits over standard SV models, why use SV models at all?
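
For concreteness, here is a sketch of the setup I have in mind, assuming a Heston-type variance process. The symbols $v_t$, $\kappa$, $\theta$, $\xi$, $r$ and $q$ are my own notation for illustration, not taken from a particular reference:

$$
\begin{aligned}
dS_t &= (r - q)\, S_t\, dt + L(S_t,t)\, \sqrt{v_t}\, S_t\, dW_t^S, \\
dv_t &= \kappa(\theta - v_t)\, dt + \xi \sqrt{v_t}\, dW_t^v, \\
d\langle W^S, W^v \rangle_t &= \rho\, dt.
\end{aligned}
$$

Setting $L \equiv 1$ recovers the pure SV model. As I understand it, $L$ is calibrated so that the model reproduces the Dupire local volatility $\sigma_{LV}$ implied by the market surface:

$$
L^2(K,t)\; \mathbb{E}\!\left[ v_t \,\middle|\, S_t = K \right] = \sigma_{LV}^2(K,t).
$$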