I have data for a single variable, CPU usage, recorded as minimum, maximum, and average values every 5 minutes (data granularity = 5 min), as in the following dataframe:

| | timestamp | min cpu | max cpu | avg cpu |
|---:|:--------------------|----------:|------------:|------------:|
| 0 | 2017-01-01 00:00:00 | 715147 | 2.2233e+06 | 1.22957e+06 |
| 1 | 2017-01-01 00:05:00 | 700474 | 2.21239e+06 | 1.21132e+06 |
| 2 | 2017-01-01 00:10:00 | 705954 | 2.21306e+06 | 1.20663e+06 |
| 3 | 2017-01-01 00:15:00 | 688383 | 2.18757e+06 | 1.19037e+06 |
| 4 | 2017-01-01 00:20:00 | 688277 | 2.18368e+06 | 1.18099e+06 |

I sliced the dataframe and treated it as a univariate time-series problem:

| | timestamp | avg cpu |
|---:|:--------------------|------------:|
| 0 | 2017-01-01 00:00:00 | 1.22957e+06 |
| 1 | 2017-01-01 00:05:00 | 1.21132e+06 |
| 2 | 2017-01-01 00:10:00 | 1.20663e+06 |
| 3 | 2017-01-01 00:15:00 | 1.19037e+06 |
| 4 | 2017-01-01 00:20:00 | 1.18099e+06 |
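
A minimal sketch of this slicing step, assuming the full dataframe is called `data` (a hypothetical name; only the five rows shown above are reproduced) and that a `DatetimeIndex` with an explicit 5-minute frequency is wanted:

```python
import pandas as pd

# Hypothetical stand-in for the full dataframe, built from the rows shown above.
data = pd.DataFrame({
    "timestamp": ["2017-01-01 00:00:00", "2017-01-01 00:05:00",
                  "2017-01-01 00:10:00", "2017-01-01 00:15:00",
                  "2017-01-01 00:20:00"],
    "min cpu": [715147, 700474, 705954, 688383, 688277],
    "max cpu": [2.2233e+06, 2.21239e+06, 2.21306e+06, 2.18757e+06, 2.18368e+06],
    "avg cpu": [1.22957e+06, 1.21132e+06, 1.20663e+06, 1.19037e+06, 1.18099e+06],
})

# Keep only the timestamp and the single variable of interest.
univariate = data[["timestamp", "avg cpu"]].copy()

# Parse the timestamps and use them as an index with a 5-minute frequency.
univariate["timestamp"] = pd.to_datetime(univariate["timestamp"])
univariate = univariate.set_index("timestamp").asfreq("5min")

print(univariate)
```
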

I split the data into training and test sets and computed prediction intervals (PI) with a regression-based forecaster:

| | pred | lower_bound | upper_bound |
|:--------------------|------------:|--------------:|--------------:|
| 2017-01-25 00:00:00 | 1.15232e+06 | 1.12482e+06 | 1.1874e+06 |
| 2017-01-25 00:05:00 | 1.14453e+06 | 1.10052e+06 | 1.18994e+06 |
| 2017-01-25 00:10:00 | 1.14033e+06 | 1.08739e+06 | 1.20795e+06 |
| 2017-01-25 00:15:00 | 1.13669e+06 | 1.0843e+06 | 1.20252e+06 |
| 2017-01-25 00:20:00 | 1.1271e+06 | 1.06837e+06 | 1.19865e+06 |
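
The split itself could look like the sketch below. It assumes the test set starts at 2017-01-25 00:00:00 (the first timestamp in the prediction table) and uses a synthetic series running to the end of January as a stand-in for the real data; the end date and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real series (values are made up for illustration).
index = pd.date_range("2017-01-01 00:00:00", "2017-01-31 23:55:00", freq="5min")
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {"avg cpu": 1.2e6 + 1e4 * rng.standard_normal(len(index))},
    index=index,
)

# Assumption: the test set starts at the first timestamp of the prediction table.
split_time = pd.Timestamp("2017-01-25 00:00:00")
data_train = data.loc[data.index < split_time]
data_test = data.loc[data.index >= split_time]

# Forecast horizon = number of 5-minute observations in the test set.
steps = len(data_test)

print(f"train: {len(data_train)} rows, test: {len(data_test)} rows, steps: {steps}")
```
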

We know:

"Coherence: It is used for measuring the correlation between two signals. ... Coherence is the normalized cross-spectral density:"

$$C_{xy}=\frac{|P_{xy}|^2}{P_{xx} P_{yy}}$$

ref.

Question:

Is it meaningful to evaluate the forecast by computing the coherence of `predictions['upper_bound']` or `predictions['pred']` with the actual test data `data_test['avg cpu']`?
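
To make the coherence formula quoted above concrete, here is a minimal NumPy sketch of how I understand it. The helper `coherence`, its `nperseg` parameter, and the synthetic signals are my own assumptions for illustration; this is not the implementation behind `plt.cohere`. The spectra are averaged over overlapping, mean-removed, Hann-windowed segments.

```python
import numpy as np


def coherence(x, y, nperseg=64):
    """Estimate |Pxy|^2 / (Pxx * Pyy) by averaging over overlapping segments."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    window = np.hanning(nperseg)
    step = max(nperseg // 2, 1)  # 50% overlap between segments
    pxx = pyy = pxy = 0.0
    for start in range(0, len(x) - nperseg + 1, step):
        xs = x[start:start + nperseg]
        ys = y[start:start + nperseg]
        # Remove each segment's mean, taper it, and take the one-sided FFT.
        fx = np.fft.rfft((xs - xs.mean()) * window)
        fy = np.fft.rfft((ys - ys.mean()) * window)
        pxx = pxx + np.abs(fx) ** 2   # accumulated auto-spectrum of x
        pyy = pyy + np.abs(fy) ** 2   # accumulated auto-spectrum of y
        pxy = pxy + np.conj(fx) * fy  # accumulated cross-spectrum of x and y
    return np.abs(pxy) ** 2 / (pxx * pyy)


# Synthetic signals (made up for illustration): a daily cycle sampled every
# 5 minutes (288 samples per day) plus independent noise.
rng = np.random.default_rng(0)
t = np.arange(2016)  # 7 days of 5-minute samples
actual = np.sin(2 * np.pi * t / 288) + 0.3 * rng.standard_normal(t.size)
pred = np.sin(2 * np.pi * t / 288) + 0.3 * rng.standard_normal(t.size)
biased = 1.5 * pred + 10.0  # rescaled and offset copy of pred

c_pred = coherence(actual, pred)
c_biased = coherence(actual, biased)
c_single = coherence(actual, pred, nperseg=t.size)  # only one segment

print("values within [0, 1]:", bool(c_pred.min() >= 0 and c_pred.max() <= 1 + 1e-12))
print("unchanged by scale and offset:", bool(np.allclose(c_pred, c_biased)))
print("single segment is identically 1:", bool(np.allclose(c_single, 1.0)))
```

In this sketch, rescaling and offsetting one of the signals leaves the estimate unchanged, and an estimate from a single segment equals 1 at every frequency, so the averaging over segments matters.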

Code:

```python
# !pip install skforecast
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import Ridge  # Lasso, LinearRegression
from skforecast.ForecasterAutoreg import ForecasterAutoreg

# Create and train forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = Ridge(alpha=0.1, random_state=765),
                 lags      = 288  # 288 lags of 5 minutes = one day
             )
forecaster.fit(y=data_train['avg cpu'])

# Prediction intervals
# ==============================================================================
# `steps` is the forecast horizon (number of 5-minute steps to predict)
predictions = forecaster.predict_interval(
                  steps    = steps,
                  interval = [1, 99],
                  n_boot   = 500
              )

# Prediction error (MSE between the test data and the upper bound)
# ==============================================================================
error_mse2 = mean_squared_error(
                 y_true = data_test['avg cpu'],
                 y_pred = predictions['upper_bound']
             )
print(f"Test error (MSE): {error_mse2}")

# Plot forecasts with prediction intervals and coherence of signals
# ==============================================================================
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 3))
plt.ylabel('cpu', fontsize=15)
plt.ticklabel_format(style='plain')
plt.xlabel('timestamp', fontsize=15, color='darkred')
data_test['avg cpu'].plot(ax=ax, label='Test-set', color='orange', linestyle='-.', marker="p")
predictions['upper_bound'].plot(ax=ax, label="predictions['upper_bound']", color='darkred')
predictions['pred'].plot(ax=ax, label="predictions['pred']")
plt.title("Signals")
# Place legend in top right corner
plt.legend(bbox_to_anchor=(1.6, .9), loc="upper right")
plt.show()

# Coherence of the test data with the upper bound.
# plt.cohere returns the coherence values and their frequencies; store them in 'cor'.
fig, ax = plt.subplots(figsize=(6, 3))
cor = plt.cohere(data_test['avg cpu'], predictions['upper_bound'], c='g')
plt.title("Coherence of Signals: predictions['upper_bound'] and data_test['avg cpu']")
# Plot the coherence graph
ax.legend(['Coherence'])
plt.show()

# Coherence of the test data with the point predictions, again stored in 'cor'.
fig, ax = plt.subplots(figsize=(6, 3))
cor = plt.cohere(data_test['avg cpu'], predictions['pred'], c='g')
plt.title("Coherence of Signals: predictions['pred'] and data_test['avg cpu']")
# Plot the coherence graph
ax.legend(['Coherence'])
plt.show()
```

`skforecast` Python package for time-series analytics. Within this package, regardless of the `regressor` one chooses (in my case `Ridge`) within the `ForecasterAutoreg()` class, prediction interval (PI) results are returned by using `predict_interval()`. Maybe using an inappropriate regressor like `Ridge` outputs this biased low prediction, or maybe the PI approach within it causes it. I included the Python code for better understanding. – Mario Mar 12 '24 at 11:16