The histogram completely misses the pairing between predicted and observed values, so it tells you nothing about the all-important residuals used to calculate $R^2$. For instance, if you permute the predicted values, you do not change the histogram, yet the prediction quality certainly changes.
In fact, you could predict exactly the right set of values and still predict terribly, because the predictions are not paired with the observations to which they should correspond. For instance, consider the true values to be $(1, 2, 3, 4, 5)$ and the predictions to be $(5, 1, 4, 2, 3)$. These distributions are identical, yet the predictions are far from perfect: the squared residuals sum to $26$ against a total sum of squares of $10$, so $R^2 = 1 - 26/10 = -1.6$. Contrast that with predictions of $(2, 2.5, 3, 3.5, 4)$, which have a different distribution but are much better predictions, with $R^2 = 1 - 2.5/10 = 0.75$.
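To make the example concrete, here is a minimal sketch in plain Python. It assumes the usual definition $R^2 = 1 - SS_{res}/SS_{tot}$, and the helper name `r_squared` and the variable names are mine, chosen for illustration.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, computed from the paired residuals."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1, 2, 3, 4, 5]
permuted = [5, 1, 4, 2, 3]      # same set of values as y_true, wrong pairing
shrunk = [2, 2.5, 3, 3.5, 4]    # different set of values, right ordering

# Sorting discards the pairing, which is all a histogram ever sees.
same_histogram = sorted(permuted) == sorted(y_true)

r2_permuted = r_squared(y_true, permuted)
r2_shrunk = r_squared(y_true, shrunk)

print("identical marginal distribution:", same_histogram)
print("R^2, permuted predictions:", round(r2_permuted, 4))
print("R^2, shrunk predictions:  ", round(r2_shrunk, 4))
```

The permuted predictions would produce a histogram indistinguishable from that of the true values, yet their $R^2$ is negative, meaning they do worse than always predicting the mean. The shrunk predictions have a visibly narrower histogram and a far better $R^2$.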
While there is some value to having predictions that vary the way the true observations do, you will find it much more informative to graph a scatterplot of the predicted and true values.