ttnn.experimental.gelu_bw

ttnn.experimental.gelu_bw(grad_tensor: ttnn.Tensor, input_tensor: ttnn.Tensor, *, approximate: str = "none", memory_config: ttnn.MemoryConfig = None, input_grad: ttnn.Tensor = None) → ttnn.Tensor

Performs the backward pass of the GELU activation function on input_tensor with the given upstream grad_tensor, using ttnn experimental kernels.
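
For reference, the computed gradient can be checked against PyTorch autograd. A minimal sketch of the default approximate="none" mode, using only standard torch APIs (float32 here for a tighter comparison; the ttnn op itself runs in bfloat16):

>>> import torch
>>> x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
>>> upstream = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
>>> y = torch.nn.functional.gelu(x, approximate="none")
>>> y.backward(upstream)
>>> x.grad  # expected input gradient, up to bfloat16 rounding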

Parameters:
  • grad_tensor (ttnn.Tensor) – The input gradient tensor.

  • input_tensor (ttnn.Tensor) – The input tensor.

Keyword Arguments:
  • approximate (str, optional) – The GELU approximation algorithm to use: "none" (default, exact erf formulation) or "tanh".

  • memory_config (ttnn.MemoryConfig, optional) – Memory configuration for this operation. Defaults to None.

  • input_grad (ttnn.Tensor, optional) – Preallocated output tensor. Defaults to None.

Returns:

ttnn.Tensor – The output tensor containing the gradient with respect to input_tensor.

Note

Supported dtypes, layouts, and ranks:

Dtypes     Layouts   Ranks
BFLOAT16   TILE      2, 3, 4

Example

>>> import torch
>>> import ttnn
>>> device = ttnn.open_device(device_id=0)
>>> grad_tensor = ttnn.from_torch(
...     torch.tensor([[1, 2], [3, 4]], dtype=torch.bfloat16),
...     layout=ttnn.TILE_LAYOUT, device=device
... )
>>> input_tensor = ttnn.from_torch(
...     torch.tensor([[1, 2], [3, 4]], dtype=torch.bfloat16, requires_grad=True),
...     layout=ttnn.TILE_LAYOUT, device=device
... )
>>> output = ttnn.experimental.gelu_bw(grad_tensor, input_tensor)
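
The keyword arguments compose in the same call. A sketch of preallocating the output and selecting the tanh approximation, assuming ttnn.zeros_like and ttnn.DRAM_MEMORY_CONFIG are available in your build:

>>> preallocated = ttnn.zeros_like(input_tensor)
>>> # writes into preallocated and also returns it (assumed semantics for input_grad)
>>> output = ttnn.experimental.gelu_bw(
...     grad_tensor, input_tensor,
...     approximate="tanh",
...     memory_config=ttnn.DRAM_MEMORY_CONFIG,
...     input_grad=preallocated,
... )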