ttnn.sum
ttnn.sum(input_tensor: ttnn.Tensor, dim: number or tuple, keepdim: bool = False, *, memory_config: ttnn.MemoryConfig = None, compute_kernel_config: ttnn.ComputeKernelConfig = None, scalar: float = 1.0, correction: bool | None, sub_core_grids: ttnn.CoreRangeSet = None) → ttnn.Tensor
Computes the sum of the input tensor input_tensor along the specified dimension(s) dim. If no dimension is provided, the sum is computed over all dimensions, yielding a single value.
Parameters:
input_tensor (ttnn.Tensor) – the input tensor. Must be on the device.
dim (number or tuple) – dimension value(s) to reduce over.
keepdim (bool, optional) – whether to retain the reduced dimension(s) in the output with size 1. Defaults to False.
Keyword Arguments:
memory_config (ttnn.MemoryConfig, optional) – Memory configuration for the operation. Defaults to None.
compute_kernel_config (ttnn.ComputeKernelConfig, optional) – Compute kernel configuration for the operation. Defaults to None.
scalar (float, optional) – A scaling factor to be applied to the input tensor. Defaults to 1.0.
correction (bool, optional) – Deprecated; this parameter will be removed in a future release. It has no effect on the result.
sub_core_grids (ttnn.CoreRangeSet, optional) – Subcore grids to use for the operation. Defaults to None, which will use all cores.
Returns:
ttnn.Tensor – the output tensor.
Note
The input tensor supports the following data types and layouts:
Input Tensor dtype    Layout
FLOAT32               ROW_MAJOR, TILE
BFLOAT16              ROW_MAJOR, TILE
BFLOAT8_B             TILE
The output tensor will be in TILE layout and have the same dtype as the input_tensor.
Memory Support:
Interleaved: DRAM and L1
Sharded (L1): Width, Height, and ND sharding
Output sharding will mirror the input
Example
# Assumes `device` is an already opened ttnn device and `logger` is a configured logger
# Create tensor
tensor_input = ttnn.rand((2, 3, 4), device=device)
# Apply ttnn.sum() on dim=2
tensor_output = ttnn.sum(tensor_input, dim=2)
logger.info(f"Sum result: {tensor_output}")
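The effect of dim and keepdim on the output shape can be illustrated with a small host-side reference in plain Python. This is only a sketch for reasoning about shapes: it makes no ttnn calls, the helpers reference_sum, shape_of, and _add are hypothetical names that are not part of the ttnn API, and it handles a single integer dim rather than a tuple.

```python
# Host-side reference only: plain Python on nested lists, no ttnn calls.
# `reference_sum`, `shape_of`, and `_add` are hypothetical helpers for illustration.

def shape_of(t):
    """Return the shape of a rectangular, non-empty nested list."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0]
    return tuple(shape)


def _add(a, b):
    """Element-wise addition of two equally shaped nested lists (or two scalars)."""
    if isinstance(a, list):
        return [_add(x, y) for x, y in zip(a, b)]
    return a + b


def reference_sum(t, dim, keepdim=False):
    """Sum a nested list along one dimension; negative dims count from the end."""
    rank = len(shape_of(t))
    dim = dim % rank  # normalize a negative dim
    if dim == 0:
        # Reduce the outermost axis by adding its sub-tensors element-wise.
        total = t[0]
        for sub in t[1:]:
            total = _add(total, sub)
        # With keepdim, the reduced axis is kept with size 1.
        return [total] if keepdim else total
    # Otherwise recurse into each sub-tensor with the axis index shifted by one.
    return [reference_sum(sub, dim - 1, keepdim) for sub in t]


# A 2 x 3 x 4 input holding the values 0..23, mirroring the shape in the example.
x = [[[i * 12 + j * 4 + k for k in range(4)] for j in range(3)] for i in range(2)]

out = reference_sum(x, dim=2)
print(shape_of(out))   # (2, 3)

kept = reference_sum(x, dim=2, keepdim=True)
print(shape_of(kept))  # (2, 3, 1)
```

Reducing dim=2 of a (2, 3, 4) input removes that axis by default and keeps it with size 1 when keepdim=True. The sketch says nothing about device layout, dtype handling, or numerical precision of the actual operation.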