I've been looking into the history of artificial neural networks, and only recently learned that the original Mark 1 Perceptron was a single-layer network. It would iteratively modify its connection weights using the signals from the image sensors directly (i.e. the raw pixel values).
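To pin down the algorithm I mean, here's a minimal NumPy sketch of the perceptron update rule as I understand it (the function name and the {-1, +1} label encoding are my own, not anything from the actual Mark 1 hardware):

```python
import numpy as np

# A minimal sketch (my own notation, not the actual Mark 1 hardware) of the
# perceptron update rule I'm describing: on each misclassified example, the
# raw pixel vector itself is added to (or subtracted from) the weights.
def perceptron_train(X, y, epochs=10):
    """X: (n_samples, n_pixels) raw pixel values; y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            pred = 1 if w @ x_i > 0 else -1   # hard threshold on the weighted sum
            if pred != y_i:
                w += y_i * x_i                # weights accumulate copies of training images
    return w
```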
My question is: Wouldn't this algorithm really just produce connection weights that resemble the "average image" for each class?
For example, say you used the Mark 1 Perceptron to differentiate between three classes of image (A, B and C). Wouldn't the connection weights for class A end up resembling an "average image" of class A, and likewise for classes B and C? In that case, each group of connection weights would resemble the "average image" of the class it's attempting to detect, and the output class would really just be the class whose weights are most similar (in the dot-product sense) to the input image. I've found this intuition a useful way of understanding how the simple perceptron works, but I want to make sure it's a valid interpretation.
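Here's a rough numerical experiment that captures the intuition (toy random "templates" stand in for real images, and everything here, from the data generation to the one-vs-rest setup, is my own construction, not the historical device): train one perceptron per class and measure how closely each weight vector aligns with that class's mean image:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images": 3 classes, each a noisy copy of a distinct random template.
n_per_class, n_pixels = 200, 64
templates = rng.normal(size=(3, n_pixels))
X = np.vstack([t + 0.5 * rng.normal(size=(n_per_class, n_pixels)) for t in templates])
labels = np.repeat(np.arange(3), n_per_class)

# One-vs-rest perceptron weights, one vector per class (bias omitted for brevity).
weights = np.zeros((3, n_pixels))
for c in range(3):
    y = np.where(labels == c, 1, -1)
    for _ in range(20):                       # training epochs
        for x_i, y_i in zip(X, y):
            pred = 1 if weights[c] @ x_i > 0 else -1
            if pred != y_i:                   # misclassified: standard perceptron update
                weights[c] += y_i * x_i

def cosine(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# How closely does each weight vector align with that class's "average image"?
for c in range(3):
    avg_image = X[labels == c].mean(axis=0)
    print(f"class {c}: cos(weights, average image) = {cosine(weights[c], avg_image):.2f}")
```

If my interpretation is right, I'd expect the printed cosine similarities to be high, though presumably not exactly 1, since the updates only use misclassified examples rather than all of them.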