ttnn.cumsum
- ttnn.cumsum(input: ttnn.Tensor, dim: int, *, dtype: ttnn.DataType | None = None, reverse_order: bool = False, out: ttnn.Tensor | None = None) → ttnn.Tensor
Returns the cumulative sum of input along dimension dim. For a given input of size N, the output also contains N elements, such that:

\[\mathrm{output}_i = \mathrm{input}_1 + \mathrm{input}_2 + \cdots + \mathrm{input}_i\]

- Parameters:
input (ttnn.Tensor) – input tensor. Must be on the device.
dim (int) – dimension along which to compute cumulative sum
- Keyword Arguments:
dtype (ttnn.DataType, optional) – desired output data type. If specified, the input tensor is cast to dtype before processing.
reverse_order (bool, optional, default False) – whether to accumulate from the end to the beginning of the accumulation axis.
out (ttnn.Tensor, optional) – preallocated output tensor. If specified, out must have the same shape as input and must be on the same device.
- Returns:
ttnn.Tensor – the output tensor.
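The formula above can be sanity-checked with a minimal plain-Python reference (illustrative only; ttnn.cumsum itself executes on a Tenstorrent device):

```python
# Plain-Python reference for cumulative sum:
# output_i = input_1 + input_2 + ... + input_i
def cumsum_ref(values):
    out = []
    running = 0
    for v in values:
        running += v
        out.append(running)
    return out

print(cumsum_ref([1, 2, 3, 4]))  # [1, 3, 6, 10]
```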
Note
If both dtype and out are specified, then out.dtype must match dtype.

Supported dtypes, layouts, ranks, and dim values:
| Dtypes | Layouts | Ranks | dim |
|---|---|---|---|
| BFLOAT16, FLOAT32, INT32, UINT32 | TILE | 1, 2, 3, 4, 5 | -rank <= dim < rank |
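The reverse_order flag and the negative-dim convention from the table above can be sketched in plain Python (a reference model of the semantics, not the ttnn implementation):

```python
# Reference model: reverse_order=True accumulates from the end of the
# axis toward the beginning, so element i holds input_i + ... + input_N.
def reverse_cumsum_ref(values):
    running = 0
    out = []
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]

print(reverse_cumsum_ref([0, 1, 2]))  # [3, 3, 2]

# Negative dim follows the usual Python convention: for a rank-3 tensor,
# dim=-1 normalizes to axis 2 (dim % rank).
rank = 3
print(-1 % rank)  # 2
```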
- Memory Support:
Interleaved: DRAM and L1
- Limitations:
Preallocated output must have the same shape as the input
Example
# Create tensor
tensor_input = ttnn.rand((2, 3, 4), device=device)

# Apply ttnn.cumsum() on dim=0
tensor_output = ttnn.cumsum(tensor_input, dim=0)

# With preallocated output and dtype
preallocated_output = ttnn.rand([2, 3, 4], dtype=ttnn.bfloat16, device=device)
tensor_output = ttnn.cumsum(tensor_input, dim=0, dtype=ttnn.bfloat16, out=preallocated_output)