ttnn.max

ttnn.max(input_a: ttnn.Tensor, dim: number = None, keepdim: bool = False, *, memory_config: ttnn.MemoryConfig = None, compute_kernel_config: ttnn.ComputeKernelConfig = None, scalar: float = 1.0, correction: bool = True, sub_core_grids: ttnn.CoreRangeSet = None) → ttnn.Tensor

Computes the max of the input tensor input_a along the specified dimension dim. If no dimension is provided, the max is computed over all dimensions, yielding a single value.

Parameters:
  • input_a (ttnn.Tensor) – the input tensor. Must be on the device.

  • dim (number, optional) – dimension to reduce over. If omitted, the reduction is applied over all dimensions. Defaults to None.

  • keepdim (bool, optional) – keep original dimension size. Defaults to False.

Keyword Arguments:
  • memory_config (ttnn.MemoryConfig, optional) – Memory configuration for the operation. Defaults to None.

  • compute_kernel_config (ttnn.ComputeKernelConfig, optional) – Compute kernel configuration for the operation. Defaults to None.

  • scalar (float, optional) – A scaling factor to be applied to the input tensor. Defaults to 1.0.

  • correction (bool, optional) – whether to apply Bessel’s correction (i.e. normalize by N-1). Applies only to ttnn.std() and is ignored by ttnn.max(). Defaults to True.

  • sub_core_grids (ttnn.CoreRangeSet, optional) – Subcore grids to use for the operation. Defaults to None, which will use all cores.

Returns:

ttnn.Tensor – the output tensor.

Note

The input tensor supports the following data types and layouts:

Input Tensor

  dtype        layout
  FLOAT32      ROW_MAJOR, TILE
  BFLOAT16     ROW_MAJOR, TILE
  BFLOAT8_B    ROW_MAJOR, TILE

The output tensor will be in TILE layout and have the same dtype as input_a.

Memory Support:
  • Interleaved: DRAM and L1

  • Sharded (L1): Width, Height, and ND sharding

  • Output sharding will mirror the input

Example

import ttnn
from loguru import logger

# Open a device and create a random input tensor on it
device = ttnn.open_device(device_id=0)
tensor_input = ttnn.rand((2, 3, 4), device=device)

# Apply ttnn.max() on dim=1; with keepdim=False the reduced axis is dropped
tensor_output = ttnn.max(tensor_input, dim=1)
logger.info(f"Max result: {tensor_output}")
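The dim and keepdim semantics described above follow the usual NumPy/PyTorch convention. As a host-side sketch (not a device computation, and not the ttnn API itself), the equivalent reduction can be illustrated with NumPy:

```python
import numpy as np

# Host-side illustration of ttnn.max's dim/keepdim semantics using NumPy.
x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

# Reduce over dim=1: shape (2, 3, 4) -> (2, 4)
out = x.max(axis=1)                      # keepdim=False drops the reduced axis
out_kept = x.max(axis=1, keepdims=True)  # keepdim=True keeps it as size 1: (2, 1, 4)

# No dim: reduce over all dimensions, yielding a single value
total = x.max()
print(out.shape, out_kept.shape, total)
```

With keepdim=True the output stays broadcastable against the input, which is convenient for follow-up elementwise ops such as max-normalization.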