Usually a better choice than the sigmoid function for hidden layers. It is a rescaled, shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1.
Advantages:
- Outputs a value between -1 and 1
- Zero-centered, which helps parameter optimization in the following layers
- Output is likely to be close to normalized (mean near 0, bounded magnitude) when the input is itself centered
Data Normalization + Centering
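
The points above can be checked numerically. This is a minimal sketch using only the standard library; the sample size, the seed, and the choice of standard-normal inputs are assumptions made for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic sigmoid, used here only for comparison."""
    return 1.0 / (1.0 + math.exp(-x))


# Assumption: inputs are zero-mean, unit-variance (e.g. already normalized data).
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]

tanh_out = [math.tanh(x) for x in xs]
sig_out = [sigmoid(x) for x in xs]

tanh_mean = sum(tanh_out) / len(tanh_out)
sig_mean = sum(sig_out) / len(sig_out)

# tanh stays inside [-1, 1] and its mean sits near 0;
# sigmoid stays inside [0, 1] and its mean sits near 0.5.
print(f"tanh:    mean={tanh_mean:+.3f}  min={min(tanh_out):+.3f}  max={max(tanh_out):+.3f}")
print(f"sigmoid: mean={sig_mean:+.3f}  min={min(sig_out):+.3f}  max={max(sig_out):+.3f}")

# tanh is a rescaled, shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1.
max_identity_error = max(abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)) for x in xs)
print(f"max |tanh(x) - (2*sigmoid(2x) - 1)| = {max_identity_error:.2e}")
```

With centered inputs, the tanh outputs stay centered as well, while the sigmoid outputs are shifted toward 0.5, so the next layer receives inputs that are all positive.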
{python} torch.nn.Tanh()
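
A short usage sketch of `torch.nn.Tanh()` as a hidden-layer activation. The layer sizes, batch size, and the name `model` are made up for illustration and are not part of the notes above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny network: 4 input features -> 8 hidden units -> 1 output.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.Tanh(),        # squashes each hidden pre-activation into [-1, 1]
    nn.Linear(8, 1),
)

x = torch.randn(16, 4)            # batch of 16 random samples
hidden = model[1](model[0](x))    # activations right after the Tanh layer
y = model(x)                      # full forward pass

print(hidden.shape, y.shape)
print(hidden.min().item(), hidden.max().item())  # both lie within [-1, 1]

# The module form and the functional form compute the same values.
same = torch.allclose(nn.Tanh()(x), torch.tanh(x))
print(same)
```

`nn.Tanh()` has no learnable parameters, so it can be shared between layers or swapped for `torch.tanh` inside a custom `forward` method.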