ttnn.ema

ttnn.ema(input: ttnn.Tensor, alpha: float, *, out: ttnn.Tensor | None = None, core_grid: ttnn.CoreGrid | None = None, memory_config: ttnn.MemoryConfig | None = None, compute_kernel_config: ttnn.ComputeKernelConfig | None = None) → ttnn.Tensor

Returns the exponential moving average of input along the last dimension.

For an input with T elements along the last dimension, the output also contains T elements, where:

\[\mathrm{output}_t = \alpha \times \mathrm{output}_{t-1} + (1 - \alpha) \times \mathrm{input}_t\]

with \(\mathrm{output}_0 = \mathrm{input}_0\).
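The recurrence above can be sketched on the host with NumPy (a reference implementation for checking results, not the ttnn device kernel):

```python
import numpy as np

def ema_reference(x: np.ndarray, alpha: float) -> np.ndarray:
    """EMA over the last dimension, matching the recurrence above:
    output_t = alpha * output_{t-1} + (1 - alpha) * input_t, output_0 = input_0."""
    out = np.empty(x.shape, dtype=np.float64)
    out[..., 0] = x[..., 0]  # seed the recurrence with the first input element
    for t in range(1, x.shape[-1]):
        out[..., t] = alpha * out[..., t - 1] + (1 - alpha) * x[..., t]
    return out

x = np.array([[1.0, 2.0, 3.0, 4.0]])
print(ema_reference(x, alpha=0.5))  # [[1. 1.5 2.25 3.125]]
```

Note that with this convention a larger alpha gives more weight to the running average and less to the current input.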

Parameters:
  • input (ttnn.Tensor) – input tensor of shape [1, B, C, T]. Must be on the device.

  • alpha (float) – the smoothing factor, typically between 0 and 1.

Keyword Arguments:
  • out (ttnn.Tensor, optional) – preallocated output tensor. If specified, out must have the same shape as input and must be on the same device.

  • core_grid (ttnn.CoreGrid, optional) – core grid for the operation. If not provided, an optimal core grid will be selected based on the input tensor shape.

  • memory_config (ttnn.MemoryConfig, optional) – memory configuration for the operation. Defaults to input tensor memory config.

  • compute_kernel_config (ttnn.ComputeKernelConfig, optional) – compute kernel configuration for the operation. Defaults to None, in which case the default kernel configuration is used.

Returns:

ttnn.Tensor – the output tensor.

Note

Supported dtypes, layouts, and ranks:

  Dtypes      Layouts    Ranks
  BFLOAT16    TILE       4

The output tensor will be in TILE layout and BFLOAT16.

Memory Support:
  • Interleaved: DRAM and L1

Limitations:
  • All tensors must be on the device.

  • A preallocated output must have the same shape as the input.

Example

import ttnn
from loguru import logger

# Open the device and create an input tensor
device = ttnn.open_device(device_id=0)
tensor_input = ttnn.rand((1, 2, 64, 128), device=device, layout=ttnn.TILE_LAYOUT)

# Apply ttnn.ema() with alpha=0.99
tensor_output = ttnn.ema(tensor_input, 0.99)
logger.info(f"EMA result: {tensor_output}")

# With a preallocated output tensor
preallocated_output = ttnn.rand((1, 2, 64, 128), dtype=ttnn.bfloat16, device=device, layout=ttnn.TILE_LAYOUT)
tensor_output = ttnn.ema(tensor_input, 0.99, out=preallocated_output)
logger.info(f"EMA with preallocated output result: {tensor_output}")

ttnn.close_device(device)