Are there any commonly used discontinuous activation functions (e.g. ones that take values in $(0,.5)\cup (.5,1)$)? Preferably for classification?
Why? I was searching Google for commonly used activation functions, and I noticed that all of the ones I found are continuous. However, I believe continuity is not actually required in Hornik's paper.
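If I recall correctly, Hornik, Stinchcombe & White (1989) only require the activation to be a "squashing function", i.e. a nondecreasing

$$\psi:\mathbb{R}\to[0,1], \qquad \lim_{x\to-\infty}\psi(x)=0, \qquad \lim_{x\to+\infty}\psi(x)=1,$$

which permits jump discontinuities (the Heaviside step is the standard example).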
When I did a bit of testing myself with a discontinuous activation function on the MNIST dataset, the results were good, so I am curious whether anyone else has used this kind of activation function.
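For concreteness, here is a minimal sketch of the kind of function I mean (in PyTorch; the name `jump_sigmoid` and the $\pm 1$ shift are just illustrative choices, not necessarily what I tested):

```python
import torch

def jump_sigmoid(x: torch.Tensor) -> torch.Tensor:
    """Sigmoid with a jump at 0.

    Values lie in (0, sigmoid(-1)) ⊂ (0, 0.5) for x < 0 and in
    [sigmoid(1), 1) ⊂ (0.5, 1) for x >= 0, so 0.5 is never attained.
    """
    # Shift the sigmoid left/right depending on the sign of x,
    # creating a jump discontinuity at x = 0.
    return torch.where(x < 0, torch.sigmoid(x - 1.0), torch.sigmoid(x + 1.0))

if __name__ == "__main__":
    x = torch.linspace(-3, 3, 7)
    print(jump_sigmoid(x))  # output jumps from ~0.27 to ~0.73 across x = 0
```

Gradients flow through whichever branch `torch.where` selects, so a network using this activation still trains with ordinary backpropagation despite the jump at $0$.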