ttnn.addmm

ttnn.addmm(input_tensor: ttnn.Tensor, mat1_tensor: ttnn.Tensor, mat2_tensor: ttnn.Tensor, *, alpha: float, beta: float, memory_config: ttnn.MemoryConfig = None, dtype: ttnn.DataType = None, program_config: ttnn.MatmulProgramConfig = None, compute_kernel_config: ttnn.DeviceComputeKernelConfig = None, core_grid: ttnn.CoreGrid = None, output_tile: List of [int] = None, optional_output_tensor: ttnn.Tensor = None, global_cb: ttnn.GlobalCircularBuffer, sub_device_id: ttnn.SubDeviceId) -> ttnn.Tensor

Returns the matrix product of tensors mat1_tensor and mat2_tensor, with tensor input_tensor added to the result.

  • If mat1_tensor has shape (n, m) and mat2_tensor has shape (m, p), input_tensor must have shape (n, p), and the result will also have shape (n, p).

  • If optional_output_tensor is provided, it must have shape (n, p); the result is stored there, overwriting any previous contents, and a reference to this tensor is also returned.

  • Arguments alpha and beta are scaling factors; the result is computed as:

    out = beta * input_tensor + alpha * (mat1_tensor @ mat2_tensor)

  • If beta is 0, the content of input_tensor is ignored.

  • Arguments beta and alpha must be real numbers.
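The semantics above can be sketched with a plain NumPy reference implementation (an illustration of the math only, not how ttnn executes on device; the function name here is hypothetical):

```python
import numpy as np

def addmm_reference(input_t, mat1, mat2, *, alpha=1.0, beta=1.0):
    """Reference semantics of ttnn.addmm on NumPy arrays (illustration only)."""
    # When beta is 0, input_t is ignored entirely rather than multiplied by 0,
    # so even NaN values in input_t do not propagate to the output.
    if beta == 0:
        return alpha * (mat1 @ mat2)
    return beta * input_t + alpha * (mat1 @ mat2)

# (n, m) @ (m, p) with an (n, p) addend yields an (n, p) result.
inp = np.ones((2, 4), dtype=np.float32)
m1 = np.full((2, 3), 2.0, dtype=np.float32)
m2 = np.full((3, 4), 1.0, dtype=np.float32)
out = addmm_reference(inp, m1, m2, alpha=0.5, beta=2.0)
# Each element: 2.0 * 1 + 0.5 * (2 * 1 * 3) = 5.0, shape (2, 4)
```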

Note

The tensors support the following data types and layouts:

  tensor          dtype                                      layout
  input_tensor    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE
  mat1_tensor     BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE
  mat2_tensor     BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE

Parameters:
  • input_tensor (ttnn.Tensor) – tensor to be added to result of matrix multiplication of mat1_tensor and mat2_tensor

  • mat1_tensor (ttnn.Tensor) – the first tensor to be matrix multiplied

  • mat2_tensor (ttnn.Tensor) – the second tensor to be matrix multiplied

Keyword Arguments:
  • alpha (float) – multiplier for mat1_tensor @ mat2_tensor

  • beta (float) – multiplier for input_tensor

  • memory_config (ttnn.MemoryConfig, optional) – the memory configuration of the output tensor. Defaults to None, which will result in using ttnn.DRAM_MEMORY_CONFIG.

  • dtype (ttnn.DataType, optional) – the data type of the output tensor. Supported types: ttnn.bfloat16, ttnn.float32, ttnn.bfloat8_b. Defaults to None, which means the output uses the highest precision among input_tensor, mat1_tensor, and mat2_tensor.

  • program_config (ttnn.MatmulProgramConfig) – the program configuration for the matmul operation. Defaults to None.

  • compute_kernel_config (ttnn.DeviceComputeKernelConfig) – the compute kernel configuration for the matmul operation. Defaults to None.

  • core_grid (ttnn.CoreGrid, optional) – the grid of cores across which to distribute the sharded tensor (writes to the cores' L1 memories). Defaults to None.

  • output_tile (List of [int], optional) – Specifies the output tile configuration. Defaults to None.

  • optional_output_tensor (ttnn.Tensor, optional) – User-provided on-device output tensor where the result of matmul is to be written. Defaults to None.

  • global_cb (ttnn.GlobalCircularBuffer) – TBD

  • sub_device_id (ttnn.SubDeviceId) – TBD

Returns:

ttnn.Tensor – output tensor of shape (n, p)

Example

# Define input tensors
input_tensor = ttnn.rand((32, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
tensor1 = ttnn.rand((32, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
tensor2 = ttnn.rand((32, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

# Perform addmm operation
output = ttnn.addmm(input_tensor, tensor1, tensor2, beta=1.0, alpha=1.0)
logger.info(f"Output tensor shape: {output.shape}")  # Output tensor shape: Shape([32, 32])