I have a simple scatter plot with blue dots for the true values and crosses for the predicted values. Please ignore the second, empty plot. (image: scatter)

What I want are vertical lines between each blue dot and its corresponding cross. A simple loop with the ax.axvline() method should do it, but this is what I get: (image: scatter2)

The true and predicted values are in the thousands. Since ax.axvline() only accepts ymin and ymax between 0 and 1, my approach was to scale each y_true and y_pred value into that range by dividing it by 10000. But the result, shown above, is a messy plot.

The code I've used is simple:

plt.style.use("default")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 8))

ax1.scatter(np.arange(0, 50), y_train[:50], marker="o", label="True")
ax1.scatter(np.arange(0, 50), preds_train_lm[:50], marker="x", color="brown", label="Predicted")
for ix, v1, v2 in zip(np.arange(0, 50), y_train.values[:50], preds_train_lm[:50]):
    ax1.axvline(x=ix, ymin=v1/10000, ymax=v2/10000)
ax1.legend()

Is there another approach that draws exact lines between each pair of markers?

Thanks in advance.

shiv_90

1 Answer

Switch to Axes.vlines for this use case:

  1. vlines takes ymin and ymax in data coordinates, whereas axvline takes them as fractions of the axes height (0 to 1)
  2. vlines accepts arrays

So you can avoid any scaling/looping by passing the raw arrays directly into ax1.vlines:

ax1.vlines(np.arange(50), y_train.values[:50], preds_train_lm[:50])
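
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The arrays y_true and y_pred are synthetic stand-ins for y_train and preds_train_lm (an assumption, not the asker's data), and the gray color and line width are arbitrary styling choices.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for y_train and preds_train_lm (assumed values in the thousands)
rng = np.random.default_rng(0)
y_true = rng.uniform(1000, 9000, size=50)
y_pred = y_true + rng.normal(0, 500, size=50)
x = np.arange(50)

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(x, y_true, marker="o", label="True")
ax.scatter(x, y_pred, marker="x", color="brown", label="Predicted")

# One call draws all 50 connectors; ymin and ymax are in data coordinates,
# so no rescaling to the 0-1 range is needed
lines = ax.vlines(x, y_true, y_pred, colors="gray", linewidth=1)
ax.legend()
```

vlines returns a LineCollection, so the connectors can be restyled afterwards (for example with lines.set_linestyle("--")) without redrawing the scatter points.
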
tdy