Output, Gated, and Special Activations: Softmax, GLU, SIREN, and More
Not every activation is a hidden-layer curve. Some produce probabilities, some implement learned gates, some shrink values toward zero, and some are designed for very specialized settings such as implicit neural representations.
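To make these four roles concrete, here is a minimal sketch in plain Python, not a reference implementation. The function names, the choice of soft shrinkage as the "shrink toward zero" example, and the default values `lam=0.5` and `w0=30.0` are all assumptions made for illustration. In a real network, the `values` and `gates` passed to `glu` would come from learned linear projections of the input, and the sine would be applied to a layer's pre-activation rather than to a raw scalar.

```python
import math


def softmax(scores):
    """Turn a list of real scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def glu(values, gates):
    """Gating: scale each value by sigmoid(gate), a factor in (0, 1)."""
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def soft_shrink(x, lam=0.5):
    """Shrinkage: move x toward zero by lam; [-lam, lam] maps to 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def sine_activation(x, w0=30.0):
    """Periodic activation of the kind used in SIREN-style layers."""
    return math.sin(w0 * x)


probs = softmax([1.0, 2.0, 3.0])      # three probabilities summing to 1
gated = glu([2.0, -1.0], [0.0, 0.0])  # a gate of 0 halves each value
shrunk = [soft_shrink(v) for v in (1.0, 0.2, -1.0)]
```

Note how each function differs in shape as well as purpose: `softmax` acts on a whole vector at once, `glu` needs two inputs per output, and the other two are ordinary elementwise maps.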
