I am building a convolutional neural network to classify certain types of greyscale images (chess puzzles).
I can see obvious advantages to using ReLU as my activation function in the hidden layers, e.g. the feed from the input layer into the first hidden layer can be treated in the same way as the input into any other hidden layer, in terms of the domain of the input values.
But I have a question about how to use ReLU activations at the other end - i.e. as the output of the fully connected output layer.
I am looking to classify images of chess boards, and it seemed obvious that I could classify each part of the board as being a white square or a black square, and then add some further (optional, as it were) classifications, e.g. whether the square holds a pawn, which itself could be black or white.
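In case it helps, here is roughly the kind of per-square target I have in mind with that factored scheme (the names are made up purely for illustration, not taken from any library):

```python
# Illustrative per-square target for the "factored" scheme described above.
# All names are hypothetical, just to show the structure I am imagining.
square_is_white = 1   # 1 = white square, 0 = black square
has_pawn        = 1   # 1 = a pawn is on the square, 0 = empty (the "optional" classification)
pawn_is_white   = 0   # 1 = white pawn, 0 = black pawn (only meaningful when has_pawn == 1)

target = [square_is_white, has_pawn, pawn_is_white]
```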
I cannot really see how to produce that kind of layered output with ReLU, and so that suggests that what I really have to do is implement every possible combined outcome as a separate output neuron, e.g. empty black square, empty white square, white square with white pawn, white square with black pawn, black square with white pawn ... etc.
Is that correct?
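In code, the "one neuron per combined outcome" alternative would look something like the following sketch (PyTorch used purely for illustration; the feature size of 128 is a placeholder for whatever the preceding layer actually produces):

```python
import torch.nn as nn

# Enumerate every square/piece combination as its own class
# (only pawns shown here; a full enumeration would be much longer).
CLASSES = [
    "empty_black_square",
    "empty_white_square",
    "white_square_white_pawn",
    "white_square_black_pawn",
    "black_square_white_pawn",
    "black_square_black_pawn",
]

# Hypothetical final fully connected layer: one output neuron per enumerated class.
output_layer = nn.Linear(128, len(CLASSES))
```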
(I originally asked this on Stack Overflow and got no answers, only a recommendation to ask here, so I have deleted the original and posted it here instead. Since first asking I have built a network and, using random weights (learning is about to begin), I can see that the overall outputs, after several layers of leaky-ReLU-activated filters and a leaky-ReLU-activated fully connected layer, do not stray too far from the 0–1 range, though this may be more luck than anything else. I am still intrigued by this problem, so I would welcome answers.)
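For reference, here is a minimal sketch of the sort of network I describe above (PyTorch assumed; the channel counts, kernel sizes and the 8x8 input-patch size are placeholders rather than my actual values):

```python
import torch.nn as nn

# Rough sketch only: conv layers with leaky ReLU, then a leaky-ReLU-activated
# fully connected layer whose outputs are the class scores in question.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # greyscale input => 1 channel
    nn.LeakyReLU(0.01),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # "several layers of leaky ReLU activated filters"
    nn.LeakyReLU(0.01),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 6),                     # fully connected layer, one output per class
    nn.LeakyReLU(0.01),                           # leaky-ReLU-activated outputs, as described above
)
```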