matmul_block

void ckernel::mm_block_init(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, uint32_t out_cb_id = 16, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)

Initializes the matmul_block operation. Must be called before matmul_block.

Return value: None

| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| out_cb_id | The identifier of the output circular buffer (CB) | uint32_t | 0 to 31 | False |
| ct_dim | The number of columns of the output matrix, in tiles | uint32_t | 1 to 8 in half-sync mode, 1 to 16 in full-sync mode | False |
| rt_dim | The number of rows of the output matrix, in tiles | uint32_t | 1 to 8 in half-sync mode, 1 to 16 in full-sync mode | False |
| kt_dim | The inner dimension of the input matrices, in tiles | uint32_t | 1 to 2^32-1 | False |
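The valid ranges above can be checked on the host before a kernel is configured. The following is a minimal sketch, not part of the ckernel API: `DstSyncMode`, `max_block_dim`, and `mm_block_init_args_valid` are hypothetical names, and the sketch assumes the half-sync and full-sync limits apply to `ct_dim` and `rt_dim` individually, exactly as the table states them.

```cpp
#include <cstdint>

// Hypothetical helper types; none of these names exist in ckernel.
enum class DstSyncMode { Half, Full };

constexpr uint32_t kMaxCbId = 31;  // CB identifiers are documented as 0 to 31

// Upper bound for ct_dim and rt_dim as listed in the table above.
constexpr uint32_t max_block_dim(DstSyncMode mode) {
    return mode == DstSyncMode::Half ? 8u : 16u;
}

// Returns true when every argument lies inside its documented range.
constexpr bool mm_block_init_args_valid(uint32_t in0_cb_id, uint32_t in1_cb_id,
                                        uint32_t out_cb_id, uint32_t ct_dim,
                                        uint32_t rt_dim, uint32_t kt_dim,
                                        DstSyncMode mode) {
    return in0_cb_id <= kMaxCbId && in1_cb_id <= kMaxCbId && out_cb_id <= kMaxCbId &&
           ct_dim >= 1 && ct_dim <= max_block_dim(mode) &&
           rt_dim >= 1 && rt_dim <= max_block_dim(mode) &&
           kt_dim >= 1;  // any nonzero uint32_t is within 1 to 2^32-1
}
```

Because the helper is `constexpr`, it can also back a `static_assert` when the block dimensions are compile-time constants.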

void ckernel::mm_block_init_short(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)

A short version of matmul_block initialization. Configures the unpacker and math engine for matmul mode.

Return value: None

| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| transpose | The transpose flag for performing a transpose operation on B | uint32_t | Any nonzero value indicates that transpose is set | False |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to block B column dimension | False |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to block A row dimension | False |
| kt_dim | The inner dimension | uint32_t | Must be equal to block A column dimension | False |
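The three dimension constraints follow directly from the shapes of block A and block B. The sketch below derives `ct_dim`, `rt_dim`, and `kt_dim` from those shapes on the host. `BlockShape`, `MatmulBlockDims`, and `derive_block_dims` are hypothetical names, not part of the ckernel API, and the sketch assumes the shapes are given in tiles and ignores the `transpose` flag.

```cpp
#include <cstdint>

// Hypothetical host-side types; not part of the ckernel API.
struct BlockShape {
    uint32_t rows;  // block height in tiles
    uint32_t cols;  // block width in tiles
};

struct MatmulBlockDims {
    uint32_t ct_dim;  // output block columns = block B columns
    uint32_t rt_dim;  // output block rows    = block A rows
    uint32_t kt_dim;  // inner dimension      = block A columns
};

// Fills `dims` from the shapes of block A and block B and returns true.
// Returns false, leaving `dims` untouched, when A's column count does not
// match B's row count, since C = A*B is undefined in that case.
inline bool derive_block_dims(BlockShape a, BlockShape b, MatmulBlockDims& dims) {
    if (a.cols != b.rows) {
        return false;
    }
    dims.ct_dim = b.cols;
    dims.rt_dim = a.rows;
    dims.kt_dim = a.cols;
    return true;
}
```

For example, a 2x4 block A and a 4x3 block B give `rt_dim = 2`, `kt_dim = 4`, and `ct_dim = 3`.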

void ckernel::mm_block_init_short_with_dt(uint32_t in0_cb_id = 0, uint32_t in1_cb_id = 1, uint32_t old_in1_cb_id = 2, const uint32_t transpose = 0, uint32_t ct_dim = 1, uint32_t rt_dim = 1, uint32_t kt_dim = 1)

A short version of matmul_block initialization. It is used to reconfigure srcA of the compute engine back to matmul mode.

Return value: None

| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | False |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | False |
| old_in1_cb_id | The identifier of the old in1_cb_id circular buffer (CB) | uint32_t | 0 to 31 | False |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to block B column dimension | False |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to block A row dimension | False |
| kt_dim | The inner dimension | uint32_t | Must be equal to block A column dimension | False |

void ckernel::matmul_block(uint32_t in0_cb_id, uint32_t in1_cb_id, uint32_t in0_tile_index, uint32_t in1_tile_index, uint32_t idst, const uint32_t transpose, uint32_t ct_dim, uint32_t rt_dim, uint32_t kt_dim)

Performs a block-sized matrix multiplication C=A*B between the blocks in two different input CBs and writes the result to DST. The DST register buffer must be in the acquired state via an acquire_dst call. This call is blocking and is only available on the compute engine.

Return value: None

| Argument | Description | Type | Valid Range | Required |
|---|---|---|---|---|
| in0_cb_id | The identifier of the first input circular buffer (CB) | uint32_t | 0 to 31 | True |
| in1_cb_id | The identifier of the second input circular buffer (CB) | uint32_t | 0 to 31 | True |
| in0_tile_index | The index of the tile in block A from the first input CB | uint32_t | Must be less than the size of the CB | True |
| in1_tile_index | The index of the tile in block B from the second input CB | uint32_t | Must be less than the size of the CB | True |
| idst | The index of the tile in DST REG to which the result C will be written | uint32_t | Must be less than the acquired size of DST REG | True |
| transpose | The transpose flag for performing a transpose operation on tiles in B | uint32_t | 0 (false) or nonzero (true) | True |
| ct_dim | The column dimension of the output block | uint32_t | Must be equal to block B column dimension | True |
| rt_dim | The row dimension of the output block | uint32_t | Must be equal to block A row dimension | True |
| kt_dim | The inner dimension | uint32_t | Must be equal to block A column dimension | True |
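The ordering requirements stated on this page (initialize first, acquire DST, then multiply) can be illustrated with a host-side sketch. The functions in namespace `sketch` are stand-in stubs that only record the call order so the example compiles without the device toolchain. They are not the real `ckernel` functions, and the zero-argument `acquire_dst` stub is a simplification whose real signature may differ. Waiting on the input CBs, packing the result, and releasing DST are omitted.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Records the order in which the stand-in functions are called.
std::vector<std::string> call_log;

// Stand-in with the documented mm_block_init parameter list.
void mm_block_init(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t) {
    call_log.push_back("mm_block_init");
}

// Simplified stand-in; the real acquire_dst signature may differ.
void acquire_dst() { call_log.push_back("acquire_dst"); }

// Stand-in with the documented matmul_block parameter list.
void matmul_block(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t,
                  uint32_t, uint32_t, uint32_t, uint32_t) {
    call_log.push_back("matmul_block");
}

// Issues the calls in the order this page requires.
void run_one_block(uint32_t ct_dim, uint32_t rt_dim, uint32_t kt_dim) {
    const uint32_t in0_cb_id = 0;   // default from the mm_block_init signature
    const uint32_t in1_cb_id = 1;   // default from the mm_block_init signature
    const uint32_t out_cb_id = 16;  // default from the mm_block_init signature
    const uint32_t transpose = 0;   // no transpose

    mm_block_init(in0_cb_id, in1_cb_id, out_cb_id, transpose, ct_dim, rt_dim, kt_dim);
    acquire_dst();  // DST must be acquired before matmul_block
    matmul_block(in0_cb_id, in1_cb_id, /*in0_tile_index=*/0, /*in1_tile_index=*/0,
                 /*idst=*/0, transpose, ct_dim, rt_dim, kt_dim);
}

}  // namespace sketch
```

In a real compute kernel the same `ct_dim`, `rt_dim`, and `kt_dim` values would be passed to both the init call and `matmul_block`, as the sketch does.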