matmul_block
-
void ckernel::mm_block_init(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, uint32_t out_cb_id = 16, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)
-
Initializes the matmul_block operation. Must be called before matmul_block.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| out_cb_id | The identifier of the output circular buffer (CB) | uint32_t | 0 to 31 | False |
| transpose | The transpose flag for performing a transpose operation on B | uint32_t | Any positive value indicates that transpose is set | False |
| ct_dim | The number of columns of the output matrix, in tiles | uint32_t | 1 to 8 in half-sync mode, 1 to 16 in full-sync mode | False |
| rt_dim | The number of rows of the output matrix, in tiles | uint32_t | 1 to 8 in half-sync mode, 1 to 16 in full-sync mode | False |
| kt_dim | The inner dimension of the input matrices, in tiles | uint32_t | 1 to 2^32-1 | False |
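The ranges listed for mm_block_init can be checked on the host before a kernel is configured. The following is a minimal sketch, not part of the ckernel API: `DstSyncMode`, `max_block_dim`, and `mm_block_init_args_ok` are hypothetical names, and the limits encode only the per-argument values given in the Valid Range column (0 to 31 for CB identifiers, 8 or 16 for ct_dim and rt_dim depending on sync mode, at least 1 for kt_dim).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum: the reference only names "half-sync" and "full-sync" modes.
enum class DstSyncMode { Half, Full };

// Upper bound for ct_dim and rt_dim, as listed in the Valid Range column.
constexpr std::uint32_t max_block_dim(DstSyncMode mode) {
    return mode == DstSyncMode::Half ? 8u : 16u;
}

// True when a block dimension lies in [1, max_block_dim(mode)].
constexpr bool block_dim_in_range(std::uint32_t dim, DstSyncMode mode) {
    return dim >= 1u && dim <= max_block_dim(mode);
}

// True when a circular buffer identifier lies in the documented range 0 to 31.
constexpr bool cb_id_in_range(std::uint32_t cb_id) {
    return cb_id <= 31u;
}

// Hypothetical host-side check of the documented mm_block_init argument ranges.
constexpr bool mm_block_init_args_ok(std::uint32_t in0_cb_id,
                                     std::uint32_t in1_cb_id,
                                     std::uint32_t out_cb_id,
                                     std::uint32_t ct_dim,
                                     std::uint32_t rt_dim,
                                     std::uint32_t kt_dim,
                                     DstSyncMode mode) {
    return cb_id_in_range(in0_cb_id) && cb_id_in_range(in1_cb_id) &&
           cb_id_in_range(out_cb_id) && block_dim_in_range(ct_dim, mode) &&
           block_dim_in_range(rt_dim, mode) && kt_dim >= 1u;
}

// The default arguments from the signature fall inside every documented range.
static_assert(mm_block_init_args_ok(0, 1, 16, 1, 1, 1, DstSyncMode::Half),
              "defaults are in range");
// ct_dim = 16 is listed as valid only in full-sync mode.
static_assert(!mm_block_init_args_ok(0, 1, 16, 16, 1, 1, DstSyncMode::Half),
              "ct_dim 16 exceeds the half-sync limit");
```

Because the helpers are `constexpr`, block dimensions that are compile-time constants can be rejected with `static_assert` before any device code runs.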
-
void ckernel::mm_block_init_short(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)
-
A short version of matmul_block initialization. Configures the unpacker and math engine for matmul mode.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| transpose | The transpose flag for performing a transpose operation on B | uint32_t | Any positive value indicates that transpose is set | False |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to the block B column dimension | False |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to the block A row dimension | False |
| kt_dim | The inner dimension | uint32_t | Must be equal to the block A column dimension | False |
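The three dimension constraints above tie ct_dim, rt_dim, and kt_dim to the shapes of blocks A and B. A minimal sketch of that mapping follows. `BlockShape`, `MatmulBlockDims`, `blocks_compatible`, and `dims_for` are hypothetical helpers, not part of the ckernel API, and the check that the row dimension of B equals the column dimension of A is the usual matrix-multiplication requirement, assumed here rather than stated in the table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a block shape measured in tiles.
struct BlockShape {
    std::uint32_t rows;
    std::uint32_t cols;
};

// Hypothetical helper type: the three dimension arguments of the matmul_block family.
struct MatmulBlockDims {
    std::uint32_t ct_dim;
    std::uint32_t rt_dim;
    std::uint32_t kt_dim;
};

// Assumption: A * B is defined only when A's column count equals B's row count.
constexpr bool blocks_compatible(BlockShape a, BlockShape b) {
    return a.cols == b.rows;
}

// Maps block shapes to the documented constraints:
// ct_dim = block B columns, rt_dim = block A rows, kt_dim = block A columns.
constexpr MatmulBlockDims dims_for(BlockShape a, BlockShape b) {
    return MatmulBlockDims{b.cols, a.rows, a.cols};
}

// A 2x4 block A times a 4x3 block B yields a 2x3 output block with inner dimension 4.
static_assert(blocks_compatible(BlockShape{2, 4}, BlockShape{4, 3}),
              "inner dimensions match");
static_assert(dims_for(BlockShape{2, 4}, BlockShape{4, 3}).ct_dim == 3,
              "ct_dim follows block B columns");
static_assert(dims_for(BlockShape{2, 4}, BlockShape{4, 3}).rt_dim == 2,
              "rt_dim follows block A rows");
static_assert(dims_for(BlockShape{2, 4}, BlockShape{4, 3}).kt_dim == 4,
              "kt_dim follows block A columns");
```

Deriving all three values from the block shapes in one place keeps the initialization call and the later matmul_block call from drifting apart.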
-
void ckernel::mm_block_init_short_with_dt(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, uint32_t old_in1_cb_id = 2, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)
-
A short version of matmul_block initialization. It is used to reconfigure srcA of the compute engine back to matmul mode.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| old_in1_cb_id | The identifier of the circular buffer (CB) previously used as in1_cb_id | uint32_t | 0 to 31 | False |
| transpose | The transpose flag for performing a transpose operation on B | uint32_t | Any positive value indicates that transpose is set | False |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to the block B column dimension | False |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to the block A row dimension | False |
| kt_dim | The inner dimension | uint32_t | Must be equal to the block A column dimension | False |
-
void ckernel::matmul_block(uint32_t in0_cb_id, uint32_t in1_cb_id, uint32_t in0_tile_index, uint32_t in1_tile_index, uint32_t idst, const uint32_t transpose, uint32_t ct_dim, uint32_t rt_dim, uint32_t kt_dim)
-
Performs a block-sized matrix multiplication C=A*B between the blocks in two different input CBs and writes the result to DST. The DST register buffer must already be in the acquired state, obtained through an acquire_dst call. This call is blocking and is only available on the compute engine.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | True |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | True |
| in0_tile_index | The index of the tile in block A from the first input CB | uint32_t | Must be less than the size of the CB | True |
| in1_tile_index | The index of the tile in block B from the second input CB | uint32_t | Must be less than the size of the CB | True |
| idst | The index of the tile in DST REG to which the result C will be written | uint32_t | Must be less than the acquired size of DST REG | True |
| transpose | The transpose flag for performing a transpose operation on tiles in B | uint32_t | 0 (false) or any positive value (true) | True |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to the block B column dimension | True |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to the block A row dimension | True |
| kt_dim | The inner dimension | uint32_t | Must be equal to the block A column dimension | True |
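To make the roles of ct_dim, rt_dim, and kt_dim in C=A*B concrete, the sketch below computes the same product on the host, treating each tile as a single float. It is an illustration only, under stated assumptions: `reference_block_matmul` is a hypothetical name, the row-major block layout is an assumption, and the code does not model circular buffers, the DST register, tile contents, the transpose flag, or how the hardware call divides the work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference for C = A * B at block granularity (hypothetical helper).
// Assumed layout: a holds rt_dim x kt_dim entries, b holds kt_dim x ct_dim entries,
// and the returned block holds rt_dim x ct_dim entries, all row-major.
// Each entry stands in for one tile and is reduced to a single float.
std::vector<float> reference_block_matmul(const std::vector<float>& a,
                                          const std::vector<float>& b,
                                          std::uint32_t ct_dim,
                                          std::uint32_t rt_dim,
                                          std::uint32_t kt_dim) {
    const std::size_t ct = ct_dim;
    const std::size_t rt = rt_dim;
    const std::size_t kt = kt_dim;
    // The block sizes must agree with the stated dimensions.
    assert(a.size() == rt * kt);
    assert(b.size() == kt * ct);

    std::vector<float> c(rt * ct, 0.0f);
    for (std::size_t r = 0; r < rt; ++r) {
        for (std::size_t col = 0; col < ct; ++col) {
            // Accumulate over the inner dimension shared by A and B.
            for (std::size_t k = 0; k < kt; ++k) {
                c[r * ct + col] += a[r * kt + k] * b[k * ct + col];
            }
        }
    }
    return c;
}
```

For example, calling `reference_block_matmul(a, b, 2, 1, 3)` with a 1x3 block `a` and a 3x2 block `b` returns a 1x2 block, matching the constraints that ct_dim equals the block B column dimension and rt_dim equals the block A row dimension.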