ttnn.MatmulMultiCoreReuseMultiCastDRAMShardedProgramConfig

class ttnn.MatmulMultiCoreReuseMultiCastDRAMShardedProgramConfig

Bases: pybind11_object

This program config is specialized for matmul operations on very narrow tensors stored in DRAM.

from_json(self: str) → ttnn._ttnn.operations.matmul.MatmulMultiCoreReuseMultiCastDRAMShardedProgramConfig

property fused_activation

Optional fused activation function to apply during computation.

If specified, the activation function is applied directly during the DRAM-sharded matmul operation. This can provide significant performance benefits by avoiding additional memory round-trips in DRAM-based operations.
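A construction sketch for this config, including a fused activation, might look as follows. This is an illustration only: the field values are placeholders, and the exact activation representation (here a list containing `ttnn.UnaryOpType.SILU`) is an assumption that may differ across ttnn versions, so check the signature in your installed build.

```python
import ttnn  # assumes a working ttnn installation

# Hypothetical values -- tune in0_block_w / per_core_M / per_core_N
# to your tensor shapes and DRAM sharding layout.
program_config = ttnn.MatmulMultiCoreReuseMultiCastDRAMShardedProgramConfig(
    in0_block_w=4,   # block width in tiles along the shared K dimension
    per_core_M=1,    # output tiles per core along M
    per_core_N=2,    # output tiles per core along N
    # Applied inside the matmul kernel, avoiding an extra DRAM round-trip.
    # Exact type is an assumption; some versions take a UnaryWithParam.
    fused_activation=[ttnn.UnaryOpType.SILU],
)
```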

property in0_block_w

Block width for both input tensors along the K dimension (shared inner dimension).

Determines the data granularity by specifying how many tiles wide each block is along the inner dimension for both input_tensor_a and input_tensor_b in DRAM-sharded operations. This parameter must be chosen to align with the DRAM sharding strategy and optimize memory bandwidth utilization for both tensors.
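As a rough, ttnn-independent illustration of the alignment constraint: `in0_block_w` is typically chosen so that it evenly divides the per-shard K-tile count. The helper below is hypothetical (not ttnn's internal selection logic) and assumes 32×32 tiles and an even K split across shards.

```python
TILE_WIDTH = 32  # ttnn tiles are 32x32 elements

def largest_in0_block_w(k_elems: int, num_shards: int, max_block_w: int = 8) -> int:
    """Pick the largest block width (in tiles) that evenly divides the
    per-shard K dimension. Hypothetical heuristic for illustration only."""
    k_tiles = k_elems // TILE_WIDTH
    shard_k_tiles = k_tiles // num_shards
    # Prefer wider blocks (better bandwidth utilization) that still
    # tile the shard's inner dimension evenly.
    for bw in range(min(max_block_w, shard_k_tiles), 0, -1):
        if shard_k_tiles % bw == 0:
            return bw
    return 1

# K = 2048 elements sharded across 8 DRAM banks:
# 64 K-tiles total, 8 per shard -> block width 8
print(largest_in0_block_w(2048, 8))  # -> 8
```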

property per_core_M

Number of output tiles each core processes along the M dimension.

Determines how the M dimension is distributed across cores in DRAM-sharded scenarios. This must align with the DRAM sharding pattern to ensure optimal performance and avoid memory access conflicts.

property per_core_N

Number of output tiles each core processes along the N dimension.

Determines how the N dimension is distributed across cores in DRAM-sharded scenarios. This parameter affects the multicast efficiency and must be compatible with the DRAM sharding configuration.
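To make the per-core tile bookkeeping concrete, here is a small pure-Python sketch. It assumes 32×32 tiles, that each core keeps the whole (small) M extent, and that N tiles are split evenly across cores; this is an illustration of the arithmetic, not ttnn's internal distribution logic.

```python
import math

TILE = 32  # ttnn tile side length in elements

def per_core_tile_counts(m: int, n: int, num_cores: int):
    """Compute illustrative per_core_M / per_core_N values for an M x N
    output. Assumes each core owns all of M and a contiguous slice of N."""
    m_tiles = math.ceil(m / TILE)
    n_tiles = math.ceil(n / TILE)
    per_core_m = m_tiles                         # M not split across cores
    per_core_n = math.ceil(n_tiles / num_cores)  # N tiles divided evenly
    return per_core_m, per_core_n

# A narrow 32 x 4096 output distributed over 8 cores:
# 1 M-tile, 128 N-tiles -> per_core_M = 1, per_core_N = 16
print(per_core_tile_counts(32, 4096, 8))  # -> (1, 16)
```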

to_json(self: ttnn._ttnn.operations.matmul.MatmulMultiCoreReuseMultiCastDRAMShardedProgramConfig) → str