ttkernel.acquire_dst
(tt::ttkernel::AcquireDstOp)
Acquire dest call.
Acquire dest operation
ttkernel.add
(tt::ttkernel::AddOp)
Add operation
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
ttkernel.add_tiles_init
(tt::ttkernel::AddTilesInitOp)
Short init function
Must be run before add_tiles.
Operands:
Operand | Description |
---|---|
in0_cb | TTKernel cb |
in1_cb | TTKernel cb |
ttkernel.add_tiles
(tt::ttkernel::AddTilesOp)
Add operation
Performs element-wise addition C=A+B of tiles in two CBs at the given indices and writes the result to the DST register at index dst_index. The DST register buffer must be in the acquired state (via an acquire_dst call). This call is blocking and is only available on the compute engine.
Operands:
Operand | Description |
---|---|
in0_cb | TTKernel cb |
in1_cb | TTKernel cb |
in0_tile_index | 32-bit signless integer |
in1_tile_index | 32-bit signless integer |
dst_index | 32-bit signless integer |
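The data flow described for add_tiles can be illustrated with a small toy model. This is a sketch under stated assumptions, not the TTKernel or tt-metal API: the `Dst` class, the `make_tile` helper, and the use of plain Python lists as CBs are all hypothetical, and the 16-slot, 32x32 DST shape is taken from the tile_regs_acquire description.

```python
# Toy model only: every name below is hypothetical, not a real TTKernel API.
TILE_DIM = 32   # tile edge length, per the tile_regs_acquire description
DST_SLOTS = 16  # number of DST tiles, per the tile_regs_acquire description


def make_tile(value):
    """Return a TILE_DIM x TILE_DIM tile filled with a single value."""
    return [[value] * TILE_DIM for _ in range(TILE_DIM)]


class Dst:
    """DST register buffer modeled as tile slots plus an acquired flag."""

    def __init__(self):
        self.slots = [None] * DST_SLOTS
        self.acquired = False

    def acquire(self):
        self.acquired = True

    def release(self):
        self.acquired = False


def add_tiles(in0_cb, in1_cb, in0_tile_index, in1_tile_index, dst, dst_index):
    """Write C = A + B element-wise into dst.slots[dst_index]."""
    if not dst.acquired:
        raise RuntimeError("DST must be acquired before add_tiles")
    a = in0_cb[in0_tile_index]
    b = in1_cb[in1_tile_index]
    dst.slots[dst_index] = [
        [x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)
    ]


# CBs are modeled as plain lists of tiles.
cb0 = [make_tile(1.0), make_tile(2.0)]
cb1 = [make_tile(10.0), make_tile(20.0)]

dst = Dst()
dst.acquire()
add_tiles(cb0, cb1, 1, 0, dst, 3)  # tile 1 of cb0 plus tile 0 of cb1 into DST slot 3
dst.release()
```

The same shape would apply to mul_tiles with `x * y` in place of `x + y`.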
ttkernel.binary_op_init_common
(tt::ttkernel::BinaryOpInitCommonOp)
Init function for all binary ops
Must be followed by the specific init required for the opcode (binary_op_specific_init).
Operands:
Operand | Description |
---|---|
in0_cb | TTKernel cb |
in1_cb | TTKernel cb |
out_cb | TTKernel cb |
ttkernel.builtin
(tt::ttkernel::BuiltinOp)
Builtin call.
Kernel operation
Attributes:
Attribute | MLIR Type | Description |
---|---|---|
op | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
kind | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
Operands:
Operand | Description |
---|---|
args | variadic of non-0-ranked.memref of any type values or TTKernel cb |
ttkernel.cb_pop_front
(tt::ttkernel::CBPopFrontOp)
CBPopFront call.
CBPopFront operation
Operands:
Operand | Description |
---|---|
cb | TTKernel cb |
numPages | 32-bit signless integer |
ttkernel.cb_push_back
(tt::ttkernel::CBPushBackOp)
CBPushBack call.
CBPushBack operation
Operands:
Operand | Description |
---|---|
cb | TTKernel cb |
numPages | 32-bit signless integer |
ttkernel.cb_reserve_back
(tt::ttkernel::CBReserveBackOp)
CBReserveBack call.
CBReserveBack operation
Operands:
Operand | Description |
---|---|
cb | TTKernel cb |
numPages | 32-bit signless integer |
ttkernel.cb_wait_front
(tt::ttkernel::CBWaitFrontOp)
CBWaitFront call.
CBWaitFront operation
Operands:
Operand | Description |
---|---|
cb | TTKernel cb |
numPages | 32-bit signless integer |
ttkernel.get_noc_addr
(tt::ttkernel::GetNocAddrOp)
GetNocAddr
Interfaces: InferTypeOpInterface
Operands:
Operand | Description |
---|---|
x | 32-bit signless integer |
y | 32-bit signless integer |
l1Address | 32-bit signless integer |
Results:
Result | Description |
---|---|
nocAddr | TTKernel noc address |
ttkernel.matmul
(tt::ttkernel::MatmulOp)
Matmul operation
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
ttkernel.mul
(tt::ttkernel::MulOp)
Mul operation
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
ttkernel.mul_tiles_init
(tt::ttkernel::MulTilesInitOp)
Short init function
Must be run before mul_tiles.
Operands:
Operand | Description |
---|---|
in0_cb | TTKernel cb |
in1_cb | TTKernel cb |
ttkernel.mul_tiles
(tt::ttkernel::MulTilesOp)
Mul operation
Performs element-wise multiplication C=A*B of tiles in two CBs at the given indices and writes the result to the DST register at index dst_index. The DST register buffer must be in the acquired state (via an acquire_dst call). This call is blocking and is only available on the compute engine.
Operands:
Operand | Description |
---|---|
in0_cb | TTKernel cb |
in1_cb | TTKernel cb |
in0_tile_index | 32-bit signless integer |
in1_tile_index | 32-bit signless integer |
dst_index | 32-bit signless integer |
ttkernel.noc_async_read_barrier
(tt::ttkernel::NocAsyncReadBarrierOp)
NocAsyncReadBarrier
ttkernel.noc_async_read
(tt::ttkernel::NocAsyncReadOp)
NocAsyncRead
Operands:
Operand | Description |
---|---|
srcNocAddr | TTKernel noc address |
dstLocalL1Addr | 32-bit signless integer |
size | 32-bit signless integer |
ttkernel.noc_async_write_barrier
(tt::ttkernel::NocAsyncWriteBarrierOp)
NocAsyncWriteBarrier
ttkernel.noc_async_write
(tt::ttkernel::NocAsyncWriteOp)
NocAsyncWrite
Operands:
Operand | Description |
---|---|
srcLocalL1Addr | 32-bit signless integer |
dstNocAddr | TTKernel noc address |
size | 32-bit signless integer |
ttkernel.pack
(tt::ttkernel::PackOp)
Pack op.
Pack operation
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
out_cb | TTKernel cb |
out_index | 32-bit signless integer |
ttkernel.pack_set_data_type
(tt::ttkernel::PackSetDataTypeOp)
Pack set DataType op.
Pack set DataType operation
Attributes:
Attribute | MLIR Type | Description |
---|---|---|
data_type | ::mlir::IntegerAttr | TT DataTypes. Enum cases: f32 (`Float32`), f16 (`Float16`), bf16 (`BFloat16`), bfp_f8 (`BFP_Float8`), bfp_bf8 (`BFP_BFloat8`), bfp_f4 (`BFP_Float4`), bfp_bf4 (`BFP_BFloat4`), bfp_f2 (`BFP_Float2`), bfp_bf2 (`BFP_BFloat2`), u32 (`UInt32`), u16 (`UInt16`), u8 (`UInt8`) |
ttkernel.pack_tile
(tt::ttkernel::PackTileOp)
PackTile op.
Copies a single tile from the DST register buffer at index dst_index to the output CB out_cb at index out_index. For out_index to be valid, cb_reserve_back(n) must be called first to reserve n > 0 tiles in the output CB. out_index = 0 then refers to the first tile in the reserved section, and valid indices run up to n - 1; the tiles become visible to the consumer in the same order after a cb_push_back call. The DST register buffer must be in the acquired state (via an acquire_dst call). This call is blocking and is only available on the compute engine.
Each subsequent pack call advances the write pointer in the CB by one tile size. Another cb_reserve_back call then sets the pointer back to a valid position with space for n reserved tiles.
Operates in tandem with cb_reserve_back and cb_push_back.
In a typical use, the producer first ensures that tiles are available in the buffer via cb_reserve_back, then uses pack_tile to copy a tile from one of the DST slots into a slot in the reserved space, and finally calls cb_push_back to make the reserved section of the circular buffer visible to the consumer.
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
out_cb | TTKernel cb |
out_index | 32-bit signless integer |
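The reserve, pack, push sequence described for pack_tile can be sketched as a single-threaded toy model. Everything here is an assumption made for illustration: `ToyCB`, its `visible` and `reserved` lists, and the Python `pack_tile` function are hypothetical stand-ins, and where the real calls block, the toy raises an exception instead.

```python
# Toy, single-threaded model; every name here is hypothetical.
class ToyCB:
    def __init__(self, capacity_tiles):
        self.capacity = capacity_tiles
        self.visible = []   # tiles the consumer can see, oldest first
        self.reserved = []  # reserved section, not yet visible

    def cb_reserve_back(self, n):
        # The real call blocks until space is free; the toy raises instead.
        if len(self.visible) + n > self.capacity:
            raise RuntimeError("would block: not enough free tiles")
        self.reserved = [None] * n

    def cb_push_back(self, n):
        # Publish the first n reserved tiles to the consumer, in order.
        self.visible.extend(self.reserved[:n])
        self.reserved = self.reserved[n:]


def pack_tile(dst, dst_index, out_cb, out_index):
    """Copy one DST slot into the reserved section at out_index."""
    if not 0 <= out_index < len(out_cb.reserved):
        raise IndexError("out_index is outside the reserved section")
    out_cb.reserved[out_index] = dst[dst_index]


dst = ["tile_a", "tile_b", "tile_c"]  # stand-in for acquired DST slots
out_cb = ToyCB(capacity_tiles=4)

out_cb.cb_reserve_back(2)     # reserve n = 2 tiles
pack_tile(dst, 2, out_cb, 0)  # out_index 0: first tile of the reserved section
pack_tile(dst, 0, out_cb, 1)  # out_index 1: last valid index, n - 1
out_cb.cb_push_back(2)        # the consumer now sees both tiles, in order
```

In this model, calling `pack_tile` before any `cb_reserve_back` raises `IndexError`, which mirrors the rule that out_index is only valid inside a reserved section.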
ttkernel.release_dst
(tt::ttkernel::ReleaseDstOp)
Release dest call.
Release dest operation
ttkernel.return
(tt::ttkernel::ReturnOp)
Return op.
Return operation
Traits: AlwaysSpeculatableImplTrait, ReturnLike, Terminator
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface
Effects: MemoryEffects::Effect{}
ttkernel.sub
(tt::ttkernel::SubOp)
Sub operation
Operands:
Operand | Description |
---|---|
dst_index | 32-bit signless integer |
ttkernel.tile_regs_acquire
(tt::ttkernel::TileRegsAcquireOp)
Tile_regs_acquire
Acquire an exclusive lock on the DST register for the MATH thread. This register is an array of 16 tiles of 32x32 elements each. This is a blocking function, i.e. this function will wait until the lock is acquired.
ttkernel.tile_regs_commit
(tt::ttkernel::TileRegsCommitOp)
Tile_regs_commit
Releases the lock on the DST register held by the MATH thread. The lock must have been acquired previously with tile_regs_acquire.
ttkernel.tile_regs_release
(tt::ttkernel::TileRegsReleaseOp)
Tile_regs_release
Releases the lock on the DST register held by the PACK thread. The lock must have been acquired previously with tile_regs_wait.
ttkernel.tile_regs_wait
(tt::ttkernel::TileRegsWaitOp)
Tile_regs_wait
Acquire an exclusive lock on the DST register for the PACK thread. It waits for the MATH thread to commit the DST register. This is a blocking function, i.e. this function will wait until the lock is acquired.
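The four tile_regs ops form a handshake between the MATH and PACK threads: acquire and commit on the MATH side, wait and release on the PACK side. A minimal sketch of that handshake, assuming a single DST buffer guarded by two semaphores, is shown below. `ToyDstLock`, `dst_value`, and `packed` are hypothetical names, and the real hardware synchronization may differ (for example, it may buffer more than one DST section at a time).

```python
import threading


class ToyDstLock:
    """Hypothetical two-semaphore model of the MATH/PACK handshake."""

    def __init__(self):
        self._free = threading.Semaphore(1)       # DST is available to MATH
        self._committed = threading.Semaphore(0)  # DST has been handed to PACK

    def tile_regs_acquire(self):  # MATH: block until DST is free
        self._free.acquire()

    def tile_regs_commit(self):   # MATH: hand DST over to PACK
        self._committed.release()

    def tile_regs_wait(self):     # PACK: block until MATH has committed
        self._committed.acquire()

    def tile_regs_release(self):  # PACK: give DST back to MATH
        self._free.release()


dst_lock = ToyDstLock()
dst_value = [None]  # stand-in for the DST contents
packed = []         # values the PACK thread has written out


def math_thread(n):
    for i in range(n):
        dst_lock.tile_regs_acquire()
        dst_value[0] = i * i  # "compute" into DST
        dst_lock.tile_regs_commit()


def pack_thread(n):
    for _ in range(n):
        dst_lock.tile_regs_wait()
        packed.append(dst_value[0])  # "pack" DST out
        dst_lock.tile_regs_release()


threads = [
    threading.Thread(target=math_thread, args=(4,)),
    threading.Thread(target=pack_thread, args=(4,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because MATH cannot reacquire DST until PACK has released it, each computed value is packed exactly once and in order.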
ttkernel.tilize_block
(tt::ttkernel::TilizeBlockOp)
TilizeBlockOp call.
TilizeBlockOp operation
Operands:
Operand | Description |
---|---|
cbIn | TTKernel cb |
numTiles | 32-bit signless integer |
cbOut | TTKernel cb |
ttkernel.tilize_init
(tt::ttkernel::TilizeInitOp)
TilizeInitOp call.
TilizeInitOp operation
Operands:
Operand | Description |
---|---|
cbIn | TTKernel cb |
numTiles | 32-bit signless integer |
cbOut | TTKernel cb |
ttkernel.unpack_ab
(tt::ttkernel::UnpackABOp)
UnpackAB op.
UnpackAB operation
Operands:
Operand | Description |
---|---|
cb_a | TTKernel cb |
src_a_index | 32-bit signless integer |
cb_b | TTKernel cb |
src_b_index | 32-bit signless integer |
ttkernel.unpack_a
(tt::ttkernel::UnpackAOp)
UnpackA op.
UnpackA operation
Operands:
Operand | Description |
---|---|
cb | TTKernel cb |
src_index | 32-bit signless integer |
ttkernel.unpack_set_data_type
(tt::ttkernel::UnpackSetDataTypeOp)
Unpack set DataType op.
Unpack set DataType operation
Attributes:
Attribute | MLIR Type | Description |
---|---|---|
data_type_a | ::mlir::IntegerAttr | TT DataTypes. Enum cases: f32 (`Float32`), f16 (`Float16`), bf16 (`BFloat16`), bfp_f8 (`BFP_Float8`), bfp_bf8 (`BFP_BFloat8`), bfp_f4 (`BFP_Float4`), bfp_bf4 (`BFP_BFloat4`), bfp_f2 (`BFP_Float2`), bfp_bf2 (`BFP_BFloat2`), u32 (`UInt32`), u16 (`UInt16`), u8 (`UInt8`) |
data_type_b | ::mlir::IntegerAttr | TT DataTypes. Enum cases: f32 (`Float32`), f16 (`Float16`), bf16 (`BFloat16`), bfp_f8 (`BFP_Float8`), bfp_bf8 (`BFP_BFloat8`), bfp_f4 (`BFP_Float4`), bfp_bf4 (`BFP_BFloat4`), bfp_f2 (`BFP_Float2`), bfp_bf2 (`BFP_BFloat2`), u32 (`UInt32`), u16 (`UInt16`), u8 (`UInt8`) |
ttkernel.unreachable
(tt::ttkernel::UnreachableOp)
Unreachable op.
Unreachable operation
Traits: AlwaysSpeculatableImplTrait, ReturnLike, Terminator
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface
Effects: MemoryEffects::Effect{}
ttkernel.untilize_block
(tt::ttkernel::UntilizeBlockOp)
UntilizeBlockOp call.
UntilizeBlockOp operation
Operands:
Operand | Description |
---|---|
cbIn | TTKernel cb |
numTiles | 32-bit signless integer |
cbOut | TTKernel cb |
ttkernel.untilize_init
(tt::ttkernel::UntilizeInitOp)
UntilizeInitOp call.
UntilizeInitOp operation
Operands:
Operand | Description |
---|---|
cbIn | TTKernel cb |
cbOut | TTKernel cb |