ttnn.linear
ttnn.linear(input_tensor_a: ttnn.Tensor, input_tensor_b: ttnn.Tensor, *, bias: ttnn.Tensor = None, transpose_a: bool = False, transpose_b: bool = False, memory_config: ttnn.MemoryConfig = None, dtype: ttnn.DataType = None, program_config: MatmulProgramConfig = None, activation: str or ttnn.UnaryWithParam = None, compute_kernel_config: ttnn.DeviceComputeKernelConfig = None, core_grid: ttnn.CoreGrid = None, output_tile: List of [int] = None, optional_output_tensor: ttnn.Tensor = None) -> ttnn.Tensor
Returns the linear transformation of the inputs.
The limitations and behaviours are the same as for matmul.
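As a rough host-side reference for what "linear transformation" means here, the sketch below uses NumPy rather than ttnn. It assumes the usual convention output = input_tensor_a @ input_tensor_b + bias, with the bias broadcast over the leading dimensions. The helper name linear_reference is hypothetical, and device placement, tiling, and reduced-precision data types are ignored.

```python
import numpy as np

def linear_reference(a, b, bias=None):
    """NumPy stand-in for the math only (hypothetical helper, not part of ttnn)."""
    out = np.matmul(a, b)  # batched matmul over the last two dims of `a`
    if bias is not None:
        out = out + bias   # bias of shape (N,) broadcasts over leading dims
    return out

rng = np.random.default_rng(0)
a = rng.random((10, 64, 32), dtype=np.float32)   # activations
b = rng.random((32, 128), dtype=np.float32)      # weight
bias = rng.random((128,), dtype=np.float32)

out = linear_reference(a, b, bias)
print(out.shape)  # (10, 64, 128)
```

The shapes mirror the example at the end of this page: the inner dimension (32) is contracted and the batch dimension (10) is carried through unchanged.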
Note
The tensors support the following data types and layouts:
Tensor            dtype                                      layout
input_tensor_a    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE
input_tensor_b    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE
bias              BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32    TILE
Parameters:
input_tensor_a (ttnn.Tensor) – the first tensor to be multiplied. Needs to be on the device.
input_tensor_b (ttnn.Tensor) – the second tensor to be multiplied. Needs to be on the device.
Keyword Arguments:
bias (ttnn.Tensor, optional) – the bias tensor to be added. If specified, needs to be on the device. Defaults to None.
transpose_a (bool, optional) – Whether to transpose input_tensor_a. Defaults to False.
transpose_b (bool, optional) – Whether to transpose input_tensor_b. Defaults to False.
memory_config (ttnn.MemoryConfig, optional) – the memory configuration of the output tensor. Defaults to None, which will result in using ttnn.DRAM_MEMORY_CONFIG.
dtype (ttnn.DataType, optional) – the data type of the output tensor. Defaults to None.
program_config (MatmulProgramConfig, optional) – the program configuration for the matmul operation. Defaults to None.
activation (str or ttnn.UnaryWithParam, optional) – the activation function to be applied. Defaults to None. When using sharded tensors, the fused_activation parameter of the program_config should be used instead.
compute_kernel_config (ttnn.DeviceComputeKernelConfig, optional) – the compute kernel configuration for the matmul operation. Defaults to None.
core_grid (ttnn.CoreGrid, optional) – the grid on which to distribute the sharded tensor (writes to the cores' L1 memory). Defaults to None.
output_tile (List of [int], optional) – Specifies the output tile configuration. Defaults to None.
optional_output_tensor (ttnn.Tensor, optional) – user-provided on-device output tensor to which the result of linear is written. Defaults to None.
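To make the interaction of the keyword arguments more concrete, here is a NumPy sketch (not ttnn). It assumes that the transposes act on the last two dimensions of each input and are applied first, that the bias is added next, and that the activation is applied last. The helper linear_with_options and its small activation table are hypothetical. This section does not list the activation strings ttnn accepts, so "relu" appears only as an assumed example.

```python
import numpy as np

# Hypothetical activation table for illustration only.
_ACTIVATIONS = {"relu": lambda x: np.maximum(x, 0.0)}

def linear_with_options(a, b, *, bias=None, transpose_a=False,
                        transpose_b=False, activation=None):
    """NumPy stand-in for the assumed order of operations (not part of ttnn)."""
    if transpose_a:
        a = np.swapaxes(a, -1, -2)  # swap the last two dims of the first input
    if transpose_b:
        b = np.swapaxes(b, -1, -2)  # swap the last two dims of the second input
    out = np.matmul(a, b)
    if bias is not None:
        out = out + bias
    if activation is not None:
        out = _ACTIVATIONS[activation](out)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 32)).astype(np.float32)
w = rng.standard_normal((128, 32)).astype(np.float32)  # (out_features, in_features)

y = linear_with_options(x, w, transpose_b=True, activation="relu")
print(y.shape)  # (64, 128)
```

A weight stored as (out_features, in_features), as in PyTorch's nn.Linear, is the typical case where transpose_b=True saves a separate transpose step.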
Returns:
ttnn.Tensor – the output tensor.
Example
    import ttnn
    from loguru import logger  # assumed logger; any logger with .info() works

    # Open a device (device_id=0 is assumed here)
    device = ttnn.open_device(device_id=0)

    # Define input tensors
    activations = ttnn.rand((10, 64, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
    weight = ttnn.rand((32, 128), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
    bias = ttnn.rand((128,), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

    # Perform linear transformation
    output = ttnn.linear(activations, weight, bias=bias)
    logger.info(f"Output tensor shape: {output.shape}")
    # Output tensor shape: Shape([10, 64, 128])

    ttnn.close_device(device)