8 x 1 x 8

Let's say there are 8 classes. The network has one hidden layer with only one neuron in it. Can the network learn the identity function between the 8 inputs and the 8 outputs? Will gradient descent find the weights for the identity? What is the difference between this network and a softmax classifier?

Char
  • c.f. https://stats.stackexchange.com/questions/500948/is-logistic-regression-a-specific-case-of-a-neural-network – patagonicus Apr 13 '22 at 12:13

1 Answer


If you didn't have the hidden layer, the mapping from input to output would be a single-layer neural network, which is exactly a softmax classifier over the 8 classes. With the hidden layer, the network is still a softmax classifier at the output, but it sees only the single scalar produced by the hidden neuron rather than the 8 inputs directly. Learning the identity between inputs and outputs is the autoencoder setting, and with a bottleneck of only one neuron you should not expect the network to learn it reliably. It is also not a network of complex non-linear computations: the only non-linearity comes from the single hidden neuron.
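To make the comparison concrete, here is a minimal NumPy sketch of the forward pass of both models on the 8 one-hot inputs. The names `W1`, `b1`, `W2`, `b2`, `W`, the `tanh` activation, and the random weights are assumptions for illustration, not something from the question. It only shows the structural difference: in the 8 x 1 x 8 network the logits (before the output bias) form a rank-1 matrix, while a plain softmax classifier can use a full-rank weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 8

# 8 x 1 x 8 network (hypothetical parameter names, random untrained weights)
W1 = rng.normal(size=(n_classes, 1))   # 8 inputs -> 1 hidden neuron
b1 = rng.normal(size=(1,))
W2 = rng.normal(size=(1, n_classes))   # 1 hidden neuron -> 8 output logits
b2 = rng.normal(size=(n_classes,))


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def bottleneck_forward(X):
    # Every input is squeezed into one scalar before the softmax layer.
    h = np.tanh(X @ W1 + b1)            # shape (batch, 1); tanh is an assumed choice
    return softmax(h @ W2 + b2), h


X = np.eye(n_classes)                   # the 8 one-hot inputs
probs, h = bottleneck_forward(X)

# Logits without the output bias are an outer product, hence rank 1.
rank_bottleneck = np.linalg.matrix_rank(h @ W2)

# Plain softmax classifier: one weight matrix, no hidden layer.
W = rng.normal(size=(n_classes, n_classes))
rank_softmax = np.linalg.matrix_rank(X @ W)

print("rank with one hidden neuron:", rank_bottleneck)
print("rank of plain softmax logits:", rank_softmax)
```

Each output logit in the bottleneck model is an affine function of the same scalar `h`, so the 8 classes can only be separated by where their hidden values fall along a single line.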