I have just started learning about neural networks (NN) and deep learning (DL), and I wanted to know whether there is a theoretical reason we use a weighted sum of the inputs in an artificial neuron. For example, if we have a neuron with two inputs with weights $w_1$ and $w_2$, why don't we use a function like $x_1^2 \times w_1 + x_2 \times w_2^2$?
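For concreteness, here is the difference as a minimal NumPy sketch (the numbers are just made-up example values):

```python
import numpy as np

def weighted_sum_neuron(x, w):
    # The usual artificial neuron (bias omitted): w1*x1 + w2*x2
    return np.dot(w, x)

def variant_neuron(x, w):
    # The variant I am asking about: x1^2 * w1 + x2 * w2^2
    return x[0] ** 2 * w[0] + x[1] * w[1] ** 2

x = np.array([0.5, -1.0])
w = np.array([0.2, 0.7])
print(weighted_sum_neuron(x, w))  # 0.5*0.2 + (-1.0)*0.7 = -0.6
print(variant_neuron(x, w))       # 0.25*0.2 + (-1.0)*0.49 = -0.44
```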
- We do; its name is a polynomial equation. – CuCaRot Mar 22 '22 at 10:38
- Please check the explanation given in this link: https://stats.stackexchange.com/questions/291680/can-any-one-explain-why-dot-product-is-used-in-neural-network-and-what-is-the-in – le4m Mar 22 '22 at 12:06
- Can you please edit your post to format the math symbols and equations with MathJax, to improve the readability? – nbro Mar 22 '22 at 12:11
1 Answer
There are tons of neuron types, as @CuCaRot pointed out, and each type has its own operation for producing the neuron's output: convolutional, recurrent, polynomial, Transformer-style, and so on.
But to answer your specific question: we usually use a multiply-and-add function because it is highly optimized in dedicated hardware. That is why all sorts of more complex neurons tend to be split up into multiply and add modules.
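As a rough sketch of what that split looks like (a toy NumPy example of my own, not any framework's actual implementation), the standard weighted sum is literally a chain of multiply-accumulate steps:

```python
import numpy as np

x = np.array([0.5, -1.0])   # inputs
w = np.array([0.2, 0.7])    # weights
b = 0.1                     # bias

# The weighted sum unrolled: one multiply-accumulate (MAC) per input.
acc = b
for xi, wi in zip(x, w):
    acc += xi * wi

# The vectorized form computes exactly the same thing.
assert np.isclose(acc, np.dot(w, x) + b)
```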
When you start worrying about network efficiency, you will come across the terms FLOPs (floating-point operations) and MACs (multiply-accumulate operations), because on some hardware a MAC executes fast enough to count as a single operation.
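As a back-of-the-envelope illustration (my own arithmetic, not taken from any specific profiler): a dense layer with $n$ inputs and $m$ outputs needs $n \times m$ MACs, which is commonly reported as roughly $2nm$ FLOPs, one multiply and one add each:

```python
def dense_layer_cost(n_in, n_out):
    # One MAC per (input, output) pair; the common convention
    # counts each MAC as 2 FLOPs (a multiply plus an add).
    macs = n_in * n_out
    flops = 2 * macs
    return macs, flops

print(dense_layer_cost(1024, 512))  # (524288, 1048576)
```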
JVGD
- Also worth mentioning the universal approximation theorem, which in this context says that combining simple linear operations with a non-linear activation across at least two layers is already complex enough to get the job done. Further complexity needs to be justified, perhaps because it directly models something known to be useful. – Neil Slater Apr 21 '22 at 12:14
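To illustrate the comment above, here is a minimal sketch of such a two-layer network (NumPy, hypothetical layer sizes, untrained random weights), where linear maps plus a non-linearity are structurally all it takes:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)  # hidden layer: 16 units
W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)   # output layer: 1 unit

def two_layer_net(x):
    h = np.maximum(0.0, W1 @ x + b1)  # linear map + ReLU non-linearity
    return W2 @ h + b2                # second linear map

print(two_layer_net(np.array([0.5, -1.0])))
```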