ttnn.linear


ttnn.linear(input_tensor_a: ttnn.Tensor, input_tensor_b: ttnn.Tensor, bias: Optional[ttnn.Tensor] = None, transpose_a: bool = False, transpose_b: bool = False, memory_config: Optional[ttnn.MemoryConfig] = None, dtype: Optional[ttnn.DataType] = None, program_config: Optional[ttnn.MatmulProgramConfig] = None, activation: Optional[str] = None, compute_kernel_config: Optional[ttnn.DeviceComputeKernelConfig] = None, core_grid: Optional[ttnn.CoreGrid] = None, output_tile: Optional[list[int]] = None, optional_output_tensor: Optional[ttnn.Tensor] = None, global_cb: Optional[ttnn.GlobalCircularBuffer] = None, sub_device_id: Optional[ttnn.SubDeviceId] = None) -> ttnn.Tensor

Returns the linear transformation of the inputs: the matrix product of input_tensor_a and input_tensor_b, with an optional bias addition and fused activation.

The limitations and behaviours are the same as for ttnn.matmul.
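As a point of reference, the computation is equivalent to the following plain PyTorch sketch (linear_reference is an illustrative name, and the mapping of activation strings to torch functions is an assumption; see the matmul documentation for the supported set):

    import torch

    def linear_reference(a, b, bias=None, transpose_a=False, transpose_b=False, activation=None):
        # Optionally transpose the two innermost dimensions, mirroring transpose_a/transpose_b.
        if transpose_a:
            a = a.transpose(-2, -1)
        if transpose_b:
            b = b.transpose(-2, -1)
        out = a @ b  # batched matmul, broadcasting over leading dimensions
        if bias is not None:
            out = out + bias  # bias broadcasts against the trailing dimension
        if activation == "relu":  # assumed string-to-function mapping
            out = torch.relu(out)
        elif activation == "gelu":
            out = torch.nn.functional.gelu(out)
        return out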

Note

The tensors support the following data types and layouts:

  Tensor            dtype                                     layout
  input_tensor_a    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE
  input_tensor_b    BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE
  bias              BFLOAT8_B, BFLOAT4_B, BFLOAT16, FLOAT32   TILE
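All three tensors must therefore be in TILE layout, so tensors created from host data need an explicit layout conversion, e.g. (a minimal sketch; torch_tensor and device are placeholders):

>>> a = ttnn.from_torch(torch_tensor, layout=ttnn.TILE_LAYOUT, device=device)

An existing row-major device tensor can be converted with ttnn.to_layout(tensor, ttnn.TILE_LAYOUT).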

Parameters:
  • input_tensor_a (ttnn.Tensor) – the first tensor to be multiplied. Needs to be on the device.

  • input_tensor_b (ttnn.Tensor) – the second tensor to be multiplied. Needs to be on the device.

Keyword Arguments:
  • bias (ttnn.Tensor, optional) – the bias tensor to be added. If specified, needs to be on the device. Defaults to None.

  • transpose_a (bool, optional) – Whether to transpose input_tensor_a. Defaults to False.

  • transpose_b (bool, optional) – Whether to transpose input_tensor_b. Defaults to False.

  • memory_config (ttnn.MemoryConfig, optional) – the memory configuration of the output tensor. Defaults to None, in which case ttnn.DRAM_MEMORY_CONFIG is used (see the combined sketch after this list).

  • dtype (ttnn.DataType, optional) – the data type of the output tensor. Defaults to None.

  • program_config (ttnn.MatmulProgramConfig, optional) – the program configuration for the matmul operation. Defaults to None.

  • activation (str, optional) – the activation function to be applied. Defaults to None.

  • compute_kernel_config (ttnn.DeviceComputeKernelConfig, optional) – the compute kernel configuration for the matmul operation. Defaults to None.

  • core_grid (ttnn.CoreGrid, optional) – the grid of cores across which to distribute the sharded tensor (the result is written to those cores' L1 memory). Defaults to None.

  • output_tile (list[int], optional) – specifies the output tile configuration. Defaults to None.

  • optional_output_tensor (ttnn.Tensor, optional) – User provided on-device output tensor where the result of linear is to be written. Defaults to None.
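A combined usage sketch for several of the keyword arguments above (the core-grid dimensions and the "relu" activation string are illustrative assumptions, not requirements):

>>> output = ttnn.linear(
...     activations,
...     weight,
...     bias=bias,
...     memory_config=ttnn.L1_MEMORY_CONFIG,  # keep the result in L1 instead of DRAM
...     dtype=ttnn.bfloat16,                  # output data type
...     core_grid=ttnn.CoreGrid(y=8, x=8),    # distribute the work across an 8x8 grid
...     activation="relu",                    # fuse an elementwise activation
... )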

Returns:

ttnn.Tensor – the output tensor.

Example

>>> import torch
>>> import ttnn
>>> device = ttnn.open_device(device_id=0)
>>> # batched matrix x broadcasted matrix
>>> activations = ttnn.to_device(ttnn.from_torch(torch.randn((10, 64, 32), dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT), device)
>>> weight = ttnn.to_device(ttnn.from_torch(torch.randn((32, 128), dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT), device)
>>> bias = ttnn.to_device(ttnn.from_torch(torch.randn((128,), dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT), device)
>>> output = ttnn.linear(activations, weight, bias=bias)
>>> print(output.shape)
[10, 64, 128]
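To inspect the result on host, convert it back to a torch tensor with ttnn.to_torch:

>>> torch_output = ttnn.to_torch(output)
>>> torch_output.shape
torch.Size([10, 64, 128])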