torch.nn.functional.gelu

torch.nn.functional.gelu(input, approximate='none') → Tensor

When the approximate argument is 'none', it applies the function element-wise:

$\text{GELU}(x) = x * \Phi(x)$

where $\Phi(x)$ is the cumulative distribution function of the Gaussian distribution.

When the approximate argument is 'tanh', GELU is estimated with:

$\text{GELU}(x) = 0.5 * x * \left(1 + \text{Tanh}\left(\sqrt{2 / \pi} * (x + 0.044715 * x^3)\right)\right)$

See Gaussian Error Linear Units (GELUs).
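A minimal usage sketch comparing the two variants (assumes a recent PyTorch; printed values are rounded to four decimals and may differ slightly across builds):

>>> import math
>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.tensor([-1.0, 0.0, 1.0])
>>> F.gelu(x)  # exact: x * Phi(x); approximate='none' is the default
tensor([-0.1587,  0.0000,  0.8413])
>>> F.gelu(x, approximate='tanh')  # tanh estimate: cheaper, slightly off
tensor([-0.1588,  0.0000,  0.8412])
>>> 0.5 * x * (1 + torch.erf(x / math.sqrt(2)))  # Phi written via erf; matches the 'none' variant
tensor([-0.1587,  0.0000,  0.8413])

The last line works because the Gaussian CDF can be expressed as $\Phi(x) = 0.5 * (1 + \text{erf}(x / \sqrt{2}))$.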