ttnn.linear

ttnn.linear(input_tensor_a: ttnn.Tensor, input_tensor_b: ttnn.Tensor, *, bias: ttnn.Tensor = None, transpose_a: bool = False, transpose_b: bool = False, memory_config: ttnn.MemoryConfig = None, dtype: ttnn.DataType = None, program_config: MatmulProgramConfig = None, activation: str or ttnn.UnaryWithParam = None, compute_kernel_config: ttnn.DeviceComputeKernelConfig = None, core_grid: ttnn.CoreGrid = None, output_tile: List of [int] = None, optional_output_tensor: ttnn.Tensor = None) → ttnn.Tensor

Returns the linear transformation of the inputs: the matrix product of input_tensor_a and input_tensor_b, plus bias if one is given.

The limitations and behaviours are the same as for matmul.
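
The arithmetic behind the operation can be sketched in plain Python. This is a reference for the semantics only, not how ttnn implements it: the helper names transpose and linear_reference are hypothetical, the inputs are 2-D lists rather than device tensors, and the bias is assumed to be added to every output row.

```python
def transpose(m):
    """Swap rows and columns of a 2-D list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def linear_reference(a, b, bias=None, transpose_a=False, transpose_b=False):
    """Hypothetical reference: (a @ b) + bias on plain Python lists."""
    if transpose_a:
        a = transpose(a)
    if transpose_b:
        b = transpose(b)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    b_cols = transpose(b)
    # Each output element is the dot product of a row of a and a column of b.
    out = [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]
    if bias is not None:
        # Assumption: the bias vector is broadcast across all output rows.
        out = [[v + bias[j] for j, v in enumerate(row)] for row in out]
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
bias = [10.0, 20.0, 30.0]
result = linear_reference(a, w, bias=bias)
# result == [[11.0, 22.0, 38.0], [13.0, 24.0, 48.0]]
```

Passing a weight stored as (out_features, in_features) together with transpose_b=True gives the same result as passing the already transposed weight.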

Note

The tensors support the following data types and layouts:

tensor            dtype                                      layout
input_tensor_a    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE
input_tensor_b    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE
bias              BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE

Parameters:
  • input_tensor_a (ttnn.Tensor) – the first tensor to be multiplied. Needs to be on the device.

  • input_tensor_b (ttnn.Tensor) – the second tensor to be multiplied. Needs to be on the device.

Keyword Arguments:
  • bias (ttnn.Tensor, optional) – the bias tensor to be added. If specified, needs to be on the device. Defaults to None.

  • transpose_a (bool, optional) – Whether to transpose input_tensor_a. Defaults to False.

  • transpose_b (bool, optional) – Whether to transpose input_tensor_b. Defaults to False.

  • memory_config (ttnn.MemoryConfig, optional) – the memory configuration of the output tensor. Defaults to None, which will result in using ttnn.DRAM_MEMORY_CONFIG.

  • dtype (ttnn.DataType, optional) – the data type of the output tensor. Defaults to None.

  • program_config (MatmulProgramConfig, optional) – the program configuration for the matmul operation. Defaults to None.

  • activation (str or ttnn.UnaryWithParam, optional) – the activation function to be applied. Defaults to None. When using sharded tensors, the fused_activation parameter of the program_config should be used instead.

  • compute_kernel_config (ttnn.DeviceComputeKernelConfig, optional) – the compute kernel configuration for the matmul operation. Defaults to None.

  • core_grid (ttnn.CoreGrid, optional) – the grid of cores across which the sharded tensor is distributed (it is written to the cores' L1 memory). Defaults to None.

  • output_tile (List of [int], optional) – Specifies the output tile configuration. Defaults to None.

  • optional_output_tensor (ttnn.Tensor, optional) – a user-provided on-device tensor to which the result of linear is written. Defaults to None.
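
To see how transpose_a and transpose_b affect the result, the expected output shape can be worked out by hand. The helper below is a hypothetical sketch, not part of ttnn: it assumes matmul-style shape rules, where each flag swaps the last two dimensions of its tensor, and that input_tensor_b is a 2-D weight.

```python
def linear_output_shape(shape_a, shape_b, transpose_a=False, transpose_b=False):
    """Hypothetical helper: expected output shape under matmul-style rules."""
    a, b = list(shape_a), list(shape_b)
    if len(a) < 2 or len(b) != 2:
        raise ValueError("sketch expects rank >= 2 for a and a 2-D weight for b")
    if transpose_a:
        a[-2], a[-1] = a[-1], a[-2]
    if transpose_b:
        b[-2], b[-1] = b[-1], b[-2]
    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions differ: {a[-1]} vs {b[-2]}")
    # Leading (batch) dimensions and rows come from a, columns come from b.
    return tuple(a[:-1] + [b[-1]])


plain = linear_output_shape((10, 64, 32), (32, 128))
with_transpose_b = linear_output_shape((10, 64, 32), (128, 32), transpose_b=True)
# Both evaluate to (10, 64, 128).
```

A weight stored as (out_features, in_features), the usual layout for a fully connected layer, would therefore be passed with transpose_b=True.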

Returns:

ttnn.Tensor – the output tensor.

Example

import ttnn
from loguru import logger

# Open a device; device_id=0 is an assumption for a single-device setup
device = ttnn.open_device(device_id=0)

# Define input tensors
activations = ttnn.rand((10, 64, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
weight = ttnn.rand((32, 128), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
bias = ttnn.rand((128,), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

# Perform linear transformation
output = ttnn.linear(activations, weight, bias=bias)
logger.info(f"Output tensor shape: {output.shape}")  # Output tensor shape: Shape([10, 64, 128])

ttnn.close_device(device)