I am using statsmodels.tsa.holtwinters.ExponentialSmoothing to run the Holt-Winters additive method, first on a training dataset and then on the whole dataset. After training and testing, I take the fitted parameters from the training model and pass them to a second model that is fit on the entire dataset, but the resulting forecasts, levels, trends, and seasonal values all come out as NaN.
Training Data:
Date
2016-07-31 3349
2016-08-31 401
2016-09-30 314
2016-10-31 473
2016-11-30 1415
2016-12-31 2351
2017-01-31 1834
2017-02-28 1924
2017-03-31 1291
2017-04-30 2737
2017-05-31 2919
2017-06-30 1098
2017-07-31 3032
2017-08-31 1973
2017-09-30 1196
2017-10-31 1611
2017-11-30 832
2017-12-31 768
2018-01-31 3051
2018-02-28 1100
2018-03-31 1606
2018-04-30 526
2018-05-31 808
2018-06-30 788
2018-07-31 5040
2018-08-31 304
2018-09-30 1709
2018-10-31 479
2018-11-30 1884
2018-12-31 681
2019-01-31 806
2019-02-28 1083
2019-03-31 1338
2019-04-30 1293
2019-05-31 1926
2019-06-30 700
2019-07-31 322
2019-08-31 298
2019-09-30 932
2019-10-31 2211
2019-11-30 1611
2019-12-31 892
2020-01-31 1189
2020-02-29 7015
2020-03-31 2609
2020-04-30 6072
2020-05-31 9651
2020-06-30 13114
Testing Data:
Date
2020-07-31 16693
2020-08-31 14797
2020-09-30 7066
2020-10-31 11157
2020-11-30 5737
2020-12-31 11147
2021-01-31 14031
2021-02-28 1847
2021-03-31 6549
2021-04-30 14614
2021-05-31 8315
2021-06-30 4372
The entire dataset is essentially both of the above combined.
My code for the entire modeling process, along with the data, is below:
# Importing required packages
import pandas as pd
from pandas import Timestamp
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing
# Getting data from dictionary and turning it into a pandas DataFrame
train_data_dict = {'Count of Participants': {Timestamp('2016-07-31 00:00:00'): 3349, Timestamp('2016-08-31 00:00:00'): 401, Timestamp('2016-09-30 00:00:00'): 314, Timestamp('2016-10-31 00:00:00'): 473, Timestamp('2016-11-30 00:00:00'): 1415, Timestamp('2016-12-31 00:00:00'): 2351, Timestamp('2017-01-31 00:00:00'): 1834, Timestamp('2017-02-28 00:00:00'): 1924, Timestamp('2017-03-31 00:00:00'): 1291, Timestamp('2017-04-30 00:00:00'): 2737, Timestamp('2017-05-31 00:00:00'): 2919, Timestamp('2017-06-30 00:00:00'): 1098, Timestamp('2017-07-31 00:00:00'): 3032, Timestamp('2017-08-31 00:00:00'): 1973, Timestamp('2017-09-30 00:00:00'): 1196, Timestamp('2017-10-31 00:00:00'): 1611, Timestamp('2017-11-30 00:00:00'): 832, Timestamp('2017-12-31 00:00:00'): 768, Timestamp('2018-01-31 00:00:00'): 3051, Timestamp('2018-02-28 00:00:00'): 1100, Timestamp('2018-03-31 00:00:00'): 1606, Timestamp('2018-04-30 00:00:00'): 526, Timestamp('2018-05-31 00:00:00'): 808, Timestamp('2018-06-30 00:00:00'): 788, Timestamp('2018-07-31 00:00:00'): 5040, Timestamp('2018-08-31 00:00:00'): 304, Timestamp('2018-09-30 00:00:00'): 1709, Timestamp('2018-10-31 00:00:00'): 479, Timestamp('2018-11-30 00:00:00'): 1884, Timestamp('2018-12-31 00:00:00'): 681, Timestamp('2019-01-31 00:00:00'): 806, Timestamp('2019-02-28 00:00:00'): 1083, Timestamp('2019-03-31 00:00:00'): 1338, Timestamp('2019-04-30 00:00:00'): 1293, Timestamp('2019-05-31 00:00:00'): 1926, Timestamp('2019-06-30 00:00:00'): 700, Timestamp('2019-07-31 00:00:00'): 322, Timestamp('2019-08-31 00:00:00'): 298, Timestamp('2019-09-30 00:00:00'): 932, Timestamp('2019-10-31 00:00:00'): 2211, Timestamp('2019-11-30 00:00:00'): 1611, Timestamp('2019-12-31 00:00:00'): 892, Timestamp('2020-01-31 00:00:00'): 1189, Timestamp('2020-02-29 00:00:00'): 7015, Timestamp('2020-03-31 00:00:00'): 2609, Timestamp('2020-04-30 00:00:00'): 6072, Timestamp('2020-05-31 00:00:00'): 9651, Timestamp('2020-06-30 00:00:00'): 13114}}
train_data = pd.DataFrame.from_dict(train_data_dict)
test_data_dict = {'Count of Participants': {Timestamp('2020-07-31 00:00:00'): 16693, Timestamp('2020-08-31 00:00:00'): 14797, Timestamp('2020-09-30 00:00:00'): 7066, Timestamp('2020-10-31 00:00:00'): 11157, Timestamp('2020-11-30 00:00:00'): 5737, Timestamp('2020-12-31 00:00:00'): 11147, Timestamp('2021-01-31 00:00:00'): 14031, Timestamp('2021-02-28 00:00:00'): 1847, Timestamp('2021-03-31 00:00:00'): 6549, Timestamp('2021-04-30 00:00:00'): 14614, Timestamp('2021-05-31 00:00:00'): 8315, Timestamp('2021-06-30 00:00:00'): 4372}}
test_data = pd.DataFrame.from_dict(test_data_dict)
full_data_dict = {'Count of Community Participants': {Timestamp('2016-07-31 00:00:00'): 3349, Timestamp('2016-08-31 00:00:00'): 401, Timestamp('2016-09-30 00:00:00'): 314, Timestamp('2016-10-31 00:00:00'): 473, Timestamp('2016-11-30 00:00:00'): 1415, Timestamp('2016-12-31 00:00:00'): 2351, Timestamp('2017-01-31 00:00:00'): 1834, Timestamp('2017-02-28 00:00:00'): 1924, Timestamp('2017-03-31 00:00:00'): 1291, Timestamp('2017-04-30 00:00:00'): 2737, Timestamp('2017-05-31 00:00:00'): 2919, Timestamp('2017-06-30 00:00:00'): 1098, Timestamp('2017-07-31 00:00:00'): 3032, Timestamp('2017-08-31 00:00:00'): 1973, Timestamp('2017-09-30 00:00:00'): 1196, Timestamp('2017-10-31 00:00:00'): 1611, Timestamp('2017-11-30 00:00:00'): 832, Timestamp('2017-12-31 00:00:00'): 768, Timestamp('2018-01-31 00:00:00'): 3051, Timestamp('2018-02-28 00:00:00'): 1100, Timestamp('2018-03-31 00:00:00'): 1606, Timestamp('2018-04-30 00:00:00'): 526, Timestamp('2018-05-31 00:00:00'): 808, Timestamp('2018-06-30 00:00:00'): 788, Timestamp('2018-07-31 00:00:00'): 5040, Timestamp('2018-08-31 00:00:00'): 304, Timestamp('2018-09-30 00:00:00'): 1709, Timestamp('2018-10-31 00:00:00'): 479, Timestamp('2018-11-30 00:00:00'): 1884, Timestamp('2018-12-31 00:00:00'): 681, Timestamp('2019-01-31 00:00:00'): 806, Timestamp('2019-02-28 00:00:00'): 1083, Timestamp('2019-03-31 00:00:00'): 1338, Timestamp('2019-04-30 00:00:00'): 1293, Timestamp('2019-05-31 00:00:00'): 1926, Timestamp('2019-06-30 00:00:00'): 700, Timestamp('2019-07-31 00:00:00'): 322, Timestamp('2019-08-31 00:00:00'): 298, Timestamp('2019-09-30 00:00:00'): 932, Timestamp('2019-10-31 00:00:00'): 2211, Timestamp('2019-11-30 00:00:00'): 1611, Timestamp('2019-12-31 00:00:00'): 892, Timestamp('2020-01-31 00:00:00'): 1189, Timestamp('2020-02-29 00:00:00'): 7015, Timestamp('2020-03-31 00:00:00'): 2609, Timestamp('2020-04-30 00:00:00'): 6072, Timestamp('2020-05-31 00:00:00'): 9651, Timestamp('2020-06-30 00:00:00'): 13114, Timestamp('2020-07-31 00:00:00'): 16693, 
Timestamp('2020-08-31 00:00:00'): 14797, Timestamp('2020-09-30 00:00:00'): 7066, Timestamp('2020-10-31 00:00:00'): 11157, Timestamp('2020-11-30 00:00:00'): 5737, Timestamp('2020-12-31 00:00:00'): 11147, Timestamp('2021-01-31 00:00:00'): 14031, Timestamp('2021-02-28 00:00:00'): 1847, Timestamp('2021-03-31 00:00:00'): 6549, Timestamp('2021-04-30 00:00:00'): 14614, Timestamp('2021-05-31 00:00:00'): 8315, Timestamp('2021-06-30 00:00:00'): 4372}}
full_data = pd.DataFrame.from_dict(full_data_dict)
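One thing I was unsure about while building the DataFrames: statsmodels infers `seasonal_periods` from the index frequency when it is not passed explicitly, so I make sure the DatetimeIndex actually carries a month-end frequency. A small self-contained sketch (the three sample dates are just the first rows of my training data):

```python
import pandas as pd
from pandas import Timestamp

# Minimal sketch: attach an explicit frequency to the DatetimeIndex so
# statsmodels can infer seasonal_periods from it instead of guessing.
df = pd.DataFrame(
    {'Count of Participants': [3349, 401, 314]},
    index=[Timestamp('2016-07-31'), Timestamp('2016-08-31'), Timestamp('2016-09-30')],
)
# infer_freq recovers the month-end frequency from the dates themselves,
# which avoids hardcoding a frequency alias.
df = df.asfreq(pd.infer_freq(df.index))
print(df.index.freqstr)
```

After this, `df.index.freq` is set, and `pd.DataFrame.from_dict` output can be treated the same way.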
# Training model - Exponential Smoothing, Holt-Winters' additive method
train_test_model = ExponentialSmoothing(train_data, trend='add', damped_trend=False, seasonal='add').fit(smoothing_level=None, smoothing_trend=None, smoothing_seasonal=None)
# Print the train model summary
print("TRAIN MODEL SUMMARY")
print(train_test_model.summary())
# Retrieving the train model's parameters
trend = train_test_model.model.trend # 'add' or 'mul'
seasonal = train_test_model.model.seasonal # None if there is no seasonal component, otherwise 'add' or 'mul'
smoothing_level = train_test_model.params['smoothing_level']
smoothing_trend = train_test_model.params['smoothing_trend']
damped_trend = False
damping_trend = train_test_model.params['damping_trend'] # None since damped_trend is set to False
smoothing_seasonal = train_test_model.params['smoothing_seasonal']
initial_level = train_test_model.params['initial_level']
initial_trend = train_test_model.params['initial_trend']
initial_seasons = train_test_model.params['initial_seasons']
if damping_trend:
    damped_trend = True
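For what it's worth, I am not certain whether `params['damping_trend']` comes back as `None` or as a float NaN when damping is off; the truthiness check above assumes a missing value is falsy, but a float NaN is actually truthy in Python. A quick self-contained check:

```python
import math

# NaN is a nonzero float, so bool() on it is True:
# `if damping_trend:` would take the branch even for a NaN parameter.
damping_trend = float('nan')
print(bool(damping_trend))        # prints True
print(damping_trend is None)      # prints False: NaN is not None
print(math.isnan(damping_trend))  # prints True: a reliable missing-value check
```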
# Fitting the model on the entire dataset using the train model's parameters
model = ExponentialSmoothing(full_data, trend=trend, damped_trend=damped_trend, seasonal=seasonal, initialization_method='known', initial_level=initial_level, initial_trend=initial_trend, initial_seasonal=initial_seasons, seasonal_periods=12).fit(smoothing_level=smoothing_level, smoothing_trend=smoothing_trend, damping_trend=damping_trend, smoothing_seasonal=smoothing_seasonal, optimized=False, method=None)
forecast = model.forecast(60)
forecast = pd.DataFrame({'Count of Participants': forecast.copy()})
# Print the forecasting model summary
print("MODEL FOR FORECASTING SUMMARY")
print(model.summary())
print('Forecasts:\n', forecast)
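To rule out a bad hand-off of the initial values, I also sanity-check them for NaN before the second fit. This is a plain-Python sketch with hypothetical stand-in values; in the real code, `initial_seasons` comes from `train_test_model.params['initial_seasons']`:

```python
import numpy as np

seasonal_periods = 12
# Hypothetical stand-in for train_test_model.params['initial_seasons'];
# the real array should have one value per seasonal period.
initial_seasons = np.arange(12, dtype=float)

# The seasonal initial values must match seasonal_periods in length,
# and none of the handed-over values should be NaN.
assert len(initial_seasons) == seasonal_periods
assert not np.isnan(initial_seasons).any()
print("initial seasonal values look consistent")
```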

Timestamp is undefined. If you ensure your code is working on its own in a new Python console, you may get more replies out of people who rarely use Python and don't want to figure out what to import to make the code work: a Minimal Working Example. – Stephan Kolassa Nov 15 '22 at 15:41