So, to my understanding, Dense is pretty much Keras's way of saying "matrix multiplication".
SUMMARY:
Whenever we say Dense(512, activation='relu', input_shape=(32, 32, 3)), what we are really saying is: perform a matrix multiplication so that the output has 512 as its last dimension.
What gets lost in translation is that the 512 is just ONE part of the desired output shape, not the whole picture. Keras sees the input shape and the 512 and automagically figures out the other matrix it needs to perform the matrix multiplication.
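If you want to watch Keras do that automagic itself, here is a minimal sketch (assuming TensorFlow's bundled tf.keras; the model is just the two layers discussed below) that prints the shapes and Param # values worked out in the examples:

    from tensorflow import keras
    from tensorflow.keras.layers import Dense

    # Stack the two Dense layers from the examples into a tiny model.
    model = keras.Sequential([
        Dense(512, activation='relu', input_shape=(32, 32, 3)),
        Dense(10, activation='softmax'),
    ])

    # The summary should report output shapes (None, 32, 32, 512) and
    # (None, 32, 32, 10) with Param # 2048 and 5130 respectively.
    model.summary()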
EXAMPLE 1:
Let's look at Dense(512, activation='relu', input_shape=(32, 32, 3)).
Matrix multiplication:
(None, 32, 32, 3) * (3, 512)
EXPLANATION:
None is the batch size (the number of pictures), which is only determined at training time, so it doesn't matter right now.
(..., 32, 32, 3) is the input_shape specified in the Dense(...)
(3, 512) comes from Keras seeing that the last dimension of your input_shape is 3, as in (..., ..., ..., 3). So Keras takes that last 3 and combines it with the 512 to build a weight matrix of shape (3, 512). Taa-daa, automagic explained.
Results in:
(None, 32, 32, 512)
This is because the two 3s (the inner dimensions) cancel each other out in the matrix multiplication.
The Param # comes from (3 * 512) + 512 = 2048, as pointed out by grovina's answer. It follows from this equation:
input * weights + bias
input would be the 3 (the number of inputs, so 3 weights per neuron)
weights would be the 512 (the number of neurons)
bias would be the 512 (one bias per neuron)
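If you'd rather check the shape arithmetic and the 2048 by hand, here's a small NumPy sketch (the random kernel and bias are just stand-ins for whatever Keras actually initializes):

    import numpy as np

    x = np.random.rand(1, 32, 32, 3)   # pretend None is a batch of 1 picture
    kernel = np.random.rand(3, 512)    # the (3, 512) matrix Keras builds for you
    bias = np.zeros(512)               # one bias per neuron

    out = x @ kernel + bias            # matmul over the last axis, broadcast over the rest
    print(out.shape)                   # (1, 32, 32, 512)
    print(kernel.size + bias.size)     # 1536 + 512 = 2048, the Param #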
EXAMPLE 2:
Let's do the same with Dense(10, activation='softmax').
Matrix multiplication:
(None, 32, 32, 512) * (512, 10)
EXPLANATION:
None doesn't matter right now.
(..., 32, 32, 512) is the input here, because it's the output shape of the first Dense(...)
(512, 10) comes from the last dimension of the input_shape and the 10 specified in the second Dense(...)
Results in:
(None, 32, 32, 10)
The two 512s cancel out.
Param # is (512 * 10) + 10 = 5130
input would be the 512 (the number of inputs, so 512 weights per neuron)
weights would be the 10 (the number of neurons)
bias would be the 10 (aka one bias per neuron)
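And the same NumPy sketch for this second layer confirms the 5130:

    import numpy as np

    h = np.random.rand(1, 32, 32, 512)  # stand-in for the first Dense's output
    kernel = np.random.rand(512, 10)    # the (512, 10) matrix Keras builds
    bias = np.zeros(10)                 # one bias per neuron

    out = h @ kernel + bias
    print(out.shape)                    # (1, 32, 32, 10)
    print(kernel.size + bias.size)      # 5120 + 10 = 5130, the Param #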