Loss of Keras CNN model is not decreasing

Question

I am working on the Street View House Numbers (SVHN) dataset using a CNN in Keras with the TensorFlow backend. I would like to understand why the network's loss is not decreasing, and I am unsure whether I am using the correct loss function. After preprocessing, my training, validation, and test set shapes are:

('Training set and labels', (26721,32,32,1), (26721, 6) )
('Validation set and labels ', (6680,32,32,1), (6680,6) )
('Test set and labels', (13068,32,32,1), (13068,6) )

The labels array has 6 columns because the maximum number of digits in one image is 6. For example, if an image contains the 2 digits "1" and "2", its label row is [1,2,10,10,10,10], where 10 represents "no digit".
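For illustration, the label encoding described above can be sketched as follows. The helper name `encode_digits` and the constants are hypothetical and are not taken from my actual preprocessing code:

```python
NO_DIGIT = 10   # class index used for "no digit", as described above
MAX_DIGITS = 6  # maximum number of digits in one image


def encode_digits(digits, max_digits=MAX_DIGITS, pad=NO_DIGIT):
    """Pad a list of digits (0-9) to a fixed length with the "no digit" class."""
    if len(digits) > max_digits:
        raise ValueError("too many digits for one image")
    return list(digits) + [pad] * (max_digits - len(digits))


print(encode_digits([1, 2]))  # [1, 2, 10, 10, 10, 10]
```

Each row therefore holds 6 integer class indices in the range 0 to 10, i.e. 11 possible classes per position.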

Now this is my model definition:

def modelCNN(input_shape, num_classes):

  model = Sequential()

  model.add(Convolution2D(16, 7, 4, border_mode='same',
                    input_shape=input_shape))
  model.add(PReLU())
  model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2), border_mode='same'))
  model.add(BatchNormalization())

  model.add(Convolution2D(32, 5, 3, border_mode='same'))
  model.add(PReLU())
  model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2), border_mode='same'))
  model.add(BatchNormalization())

  model.add(Convolution2D(64, 3, 3, border_mode='same'))
  model.add(PReLU())
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), border_mode='same'))

  model.add(Flatten())
  model.add(Dense(512))
  model.add(PReLU())
  model.add(Dropout(0.5))
  model.add(Dense(num_classes))
  model.add(Activation('softmax'))

  return model

I tried different custom models, but nothing seems to work. I compile the model with its loss and optimizer and train it like this:

model.compile(optimizer='adam', loss='categorical_crossentropy',
                                    metrics=['accuracy'])

print(model.summary())
csv_logger = CSVLogger('training.log')
early_stop = EarlyStopping('val_acc', patience=200, verbose=1)
model_checkpoint = ModelCheckpoint(model_save_path,
                                    'val_acc', verbose=0,
                                    save_best_only=True)

model_callbacks = [early_stop, model_checkpoint, csv_logger]
K.get_session().run(tf.global_variables_initializer())

model.fit_generator(train,
          samples_per_epoch=np.ceil(len(train_dataset)/batch_size),
          epochs=num_epochs,
          verbose=1,
          validation_data=valid,
          validation_steps=batch_size,
          callbacks=model_callbacks)

My understanding is that categorical_crossentropy needs a one-hot vector representation, but I am not sure how I should convert my labels array, because unlike the MNIST dataset, I have multiple labels for each image.
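One way such a conversion could look is sketched below, assuming each of the 6 positions is treated as its own 11-class problem (digits 0-9 plus class 10 for "no digit"). The helper name `to_one_hot` is hypothetical. Note that the model above ends in a single softmax vector, so it would have to produce an output of shape (6, 11) per image for targets like these to match:

```python
import numpy as np

NUM_CLASSES = 11  # digits 0-9 plus class 10 for "no digit"


def to_one_hot(labels, num_classes=NUM_CLASSES):
    """Convert integer labels of shape (N, 6) to one-hot of shape (N, 6, num_classes)."""
    labels = np.asarray(labels, dtype=np.int64)
    # Indexing the identity matrix with the labels picks one one-hot row per entry.
    return np.eye(num_classes, dtype=np.float32)[labels]


labels = np.array([[1, 2, 10, 10, 10, 10],
                   [7, 10, 10, 10, 10, 10]])
one_hot = to_one_hot(labels)
print(one_hot.shape)  # (2, 6, 11)
```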

Is there a loss function that I can use, or any suggestion to make this work?

Note: I am using the CNN for classification only, not for detection.

Thanks very much in advance

score 2 · Answer 1 · answered Dec 25 '17 at 15:54

You can try removing some of the irrelevant features, which may be the cause, or you could try different solvers and activation functions. I previously answered a similar question at the following link: link

Sampath Madala

This does not help in my case. Do you think categorical_crossentropy is fine to use in my case? – Ramesh Kumar Dec 26 '17 at 00:01
Check out the link; it is explained in detail there: link – Sampath Madala Dec 26 '17 at 01:45
Try tuning the hyperparameters – Sampath Madala Dec 26 '17 at 01:49

score 1 · Answer 2 · answered Nov 26 '19 at 18:38

Did you try changing the loss function to sparse_categorical_crossentropy? Since your label vector consists of integers and is not one-hot encoded, using categorical_crossentropy is a problem, as it expects the labels to be one-hot (categorical) encoded.
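To illustrate the difference, here is a minimal NumPy sketch (not Keras's own implementation) of the two losses. It assumes the network output has been arranged as 6 positions with 11 classes each; the predictions are random placeholders and the function names are hypothetical:

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_ce(one_hot, probs):
    """Cross-entropy with one-hot targets, averaged over all positions."""
    return float(np.mean(-np.sum(one_hot * np.log(probs), axis=-1)))


def sparse_categorical_ce(labels, probs):
    """Cross-entropy with integer targets: pick the probability of the true class."""
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(np.mean(-np.log(picked)))


rng = np.random.default_rng(0)
labels = np.array([[1, 2, 10, 10, 10, 10]])   # integer labels, shape (1, 6)
probs = softmax(rng.normal(size=(1, 6, 11)))  # placeholder predictions, shape (1, 6, 11)
one_hot = np.eye(11)[labels]                  # one-hot labels, shape (1, 6, 11)

loss_one_hot = categorical_ce(one_hot, probs)
loss_sparse = sparse_categorical_ce(labels, probs)
print(np.isclose(loss_one_hot, loss_sparse))  # True
```

Both functions compute the same quantity; the sparse variant simply takes the integer class indices directly, so the labels do not have to be expanded to one-hot vectors first.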

J-H
