I trained a model on some initial data and got good scores. Now, after receiving more data, I want to load the pre-trained model and continue training it.
Here is a snippet of what I did:
(1) I followed this post, which says to save the model in the 'tf' format:
# saving initial model
model.save(path2initial, save_format='tf')
# load pre-trained model
clf = tf.keras.models.load_model(path2initial)
# create new data generators
train_gen = generators.create(generator_config, 'train')
val_gen = generators.create(generator_config, 'val')
# create metrics, loss, optimizer and callbacks
loss = losses.create(loss_config)
callback_list = callbacks.create(callback_config)
optimizer = optimizers.create(optimizer_config)
metrics = metrics.create(metrics_config)
# compile model
clf.compile(optimizer=optimizer, loss=loss, metrics=metrics)
# train
clf.fit(x=train_gen,
        epochs=NB_EPOCHS,
        validation_data=val_gen,
        steps_per_epoch=math.ceil(len(train_steps) / BATCH_SIZE),
        validation_steps=math.ceil(len(val_steps) / BATCH_SIZE),
        callbacks=callback_list,
        use_multiprocessing=True,
        workers=16,
        max_queue_size=8,
        verbose=1
        )
I should note that two of my callbacks are
EarlyStopping(monitor='val_loss', restore_best_weights=True,
              min_delta=0.001, patience=10, mode='min', verbose=1)
ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True,
                mode='min', save_freq='epoch')
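Assuming my callbacks.create factory does nothing beyond instantiating them, the callback_list passed to fit() is equivalent to something like this (filepath is just a placeholder for my checkpoint path):

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callback_list = [
    EarlyStopping(monitor='val_loss', restore_best_weights=True,
                  min_delta=0.001, patience=10, mode='min', verbose=1),
    ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                    save_best_only=True, mode='min', save_freq='epoch'),
]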
I should also note that train_gen consists of both initial_data and new_data.
This method trained for only 4 epochs, and the val-loss didn't change for the remaining 10 'patience' epochs. Moreover, the results were far worse than the initial model's.
(2) The second method I tried was to save the model in the default format (that's the only change):
model.save(path2initial)
.
.
.
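For clarity, the only line that differs between the two methods is the save call; the loading, compiling, and fitting code is the same as above:

# method (1): explicit TensorFlow SavedModel format
model.save(path2initial, save_format='tf')

# method (2): no save_format argument; as far as I understand, which on-disk
# format the default maps to depends on the TF/Keras version and on whether
# path2initial has an .h5/.keras extension
model.save(path2initial)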
This model trained for 71/200 epochs, but it seems to have ignored my EarlyStopping() callback. In some epochs the val-loss changed by 1e-4 or even less and training still continued (which is weird), and then EarlyStopping() stopped it at epoch 71 even though the val-loss had changed! Moreover, the results barely changed.
For comparison, I trained a model from scratch on all the data (both the initial and the new data) and got much better results:
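For reference, my understanding of how EarlyStopping is supposed to behave with these settings is roughly this simplified sketch (not the actual Keras implementation; the val_losses values are made up):

# what I understand EarlyStopping(monitor='val_loss', mode='min',
# min_delta=0.001, patience=10) to do, in pseudo-form
val_losses = [0.50, 0.40, 0.39, 0.3899, 0.3898, 0.3897]  # made-up example values
best = float('inf')
wait = 0
for epoch, val_loss in enumerate(val_losses):
    if val_loss < best - 0.001:   # only a drop larger than min_delta counts as improvement
        best = val_loss
        wait = 0                  # patience counter resets on a real improvement
    else:
        wait += 1                 # a change of 1e-4 is still "no improvement"
    if wait >= 10:                # patience epochs in a row without improvement
        print(f'would stop at epoch {epoch} and restore the best weights')
        break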
                 Initial data   Method (1)   Method (2)   New model on all data
mean F1 score    0.735          0.422        0.74         0.803
Is there a proven way to continue training a Keras model?
When loading the model, does the optimizer state reset?
When loading the model, do I need to define the callbacks, loss, optimizer, and metrics all over again? Do I need to compile it again?