binary_shift_tile
void ckernel::binary_shift_tile_init()
Please refer to documentation for any_init.
Left Shift
void ckernel::binary_left_shift_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
Performs an elementwise left shift of the input at idst0 by the per-element amounts in the input at idst1: y = x0 << x1. Both inputs must be of Int32 data type. The output overwrites odst in DST.
The DST register buffer must be in the acquired state via an acquire_dst call. This call is blocking and is only available on the compute engine. With 16-bit formats, at most 4 tiles from each operand (8 tiles in total) can be loaded into DST at once. With 32-bit formats, this is reduced to 2 tiles from each operand.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|----------|-------------|------|-------------|----------|
| idst0 | The index of the tile in DST register buffer to use as first operand | uint32_t | Must be less than the size of the DST register buffer | True |
| idst1 | The index of the tile in DST register buffer to use as second operand | uint32_t | Must be less than the size of the DST register buffer | True |
| odst | The index of the tile in DST register buffer to use as output | uint32_t | Must be less than the size of the DST register buffer | True |
template<bool sign_magnitude_format = false>
void ckernel::binary_left_shift_uint32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
void ckernel::binary_left_shift_int32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
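As a rough illustration of the idst0/idst1/odst index semantics for the left shift, the following host-side C++ sketch models DST as a small array of tiles and applies y = x0 << x1 element by element. Everything in it (`Tile`, `kTileElems`, `left_shift_tile_model`, the 32x32 tile size) is a hypothetical stand-in for illustration and not part of the ckernel API. It assumes shift amounts in the range 0..31, since behavior outside that range is not specified here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model; none of these names exist in ckernel.
constexpr std::size_t kTileElems = 32 * 32;  // assumption: 32x32 elements per tile
using Tile = std::array<int32_t, kTileElems>;

// Models y = x0 << x1 per element: reads the tiles at idst0 and idst1
// and overwrites the tile at odst. `dst` stands in for the DST buffer.
void left_shift_tile_model(std::vector<Tile>& dst,
                           uint32_t idst0, uint32_t idst1, uint32_t odst) {
    // Mirrors the documented valid range: every index must be inside DST.
    assert(idst0 < dst.size() && idst1 < dst.size() && odst < dst.size());
    for (std::size_t i = 0; i < kTileElems; ++i) {
        const uint32_t x0 = static_cast<uint32_t>(dst[idst0][i]);
        const uint32_t x1 = static_cast<uint32_t>(dst[idst1][i]);
        assert(x1 < 32);  // assumption: only in-range shift amounts are modeled
        // Shift the unsigned bit pattern so the result is well defined in C++.
        dst[odst][i] = static_cast<int32_t>(x0 << x1);
    }
}
```

For example, with a four-slot model (matching the limit of 2 tiles per operand for 32-bit formats) that holds 3 in every element of slot 0 and 4 in every element of slot 1, calling `left_shift_tile_model(dst, 0, 1, 2)` leaves 48 in every element of slot 2 and does not modify slots 0 and 1.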
Right Shift
void ckernel::binary_right_shift_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
Performs an elementwise right shift of the input at idst0 by the per-element amounts in the input at idst1: y = x0 >> x1. Both inputs must be of Int32 data type. The output overwrites odst in DST.
The DST register buffer must be in the acquired state via an acquire_dst call. This call is blocking and is only available on the compute engine. With 16-bit formats, at most 4 tiles from each operand (8 tiles in total) can be loaded into DST at once. With 32-bit formats, this is reduced to 2 tiles from each operand.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|----------|-------------|------|-------------|----------|
| idst0 | The index of the tile in DST register buffer to use as first operand | uint32_t | Must be less than the size of the DST register buffer | True |
| idst1 | The index of the tile in DST register buffer to use as second operand | uint32_t | Must be less than the size of the DST register buffer | True |
| odst | The index of the tile in DST register buffer to use as output | uint32_t | Must be less than the size of the DST register buffer | True |
template<bool sign_magnitude_format = false>
void ckernel::binary_right_shift_uint32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
void ckernel::binary_right_shift_int32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
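The right shift of Int32 data can be sketched on the host as below. The sketch assumes the plain right shift is arithmetic, meaning the sign bit is replicated into the vacated high bits. That assumption is not stated in this documentation and is only suggested by the existence of a separate logical variant. The names `arithmetic_shift_right` and `right_shift_model` are hypothetical, and shift amounts are assumed to be in the range 0..31.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sign-preserving right shift written without relying
// on how the compiler treats >> on negative signed values.
inline int32_t arithmetic_shift_right(int32_t x, uint32_t s) {
    assert(s < 32);  // assumption: only in-range shift amounts are modeled
    const uint32_t ux = static_cast<uint32_t>(x);
    // For negative x, shift the complement and complement back, so the
    // vacated high bits end up as ones instead of zeros.
    const uint32_t r = (x < 0) ? ~(~ux >> s) : (ux >> s);
    return static_cast<int32_t>(r);
}

// Models y = x0 >> x1 element by element over two equally sized inputs.
inline std::vector<int32_t> right_shift_model(const std::vector<int32_t>& x0,
                                              const std::vector<int32_t>& x1) {
    assert(x0.size() == x1.size());
    std::vector<int32_t> y(x0.size());
    for (std::size_t i = 0; i < x0.size(); ++i) {
        y[i] = arithmetic_shift_right(x0[i], static_cast<uint32_t>(x1[i]));
    }
    return y;
}
```

Under this assumption, shifting -8 right by 1 gives -4, and shifting -1 by any in-range amount stays -1.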
Logical Right Shift
void ckernel::binary_logical_right_shift_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
Performs an elementwise logical right shift of the input at idst0 by the per-element amounts in the input at idst1: y = x0 >> x1, with the vacated high bits filled with zeros. Both inputs must be of Int32 data type. The output overwrites odst in DST.
The DST register buffer must be in the acquired state via an acquire_dst call. This call is blocking and is only available on the compute engine. With 16-bit formats, at most 4 tiles from each operand (8 tiles in total) can be loaded into DST at once. With 32-bit formats, this is reduced to 2 tiles from each operand.
Return value: None
| Argument | Description | Type | Valid Range | Required |
|----------|-------------|------|-------------|----------|
| idst0 | The index of the tile in DST register buffer to use as first operand | uint32_t | Must be less than the size of the DST register buffer | True |
| idst1 | The index of the tile in DST register buffer to use as second operand | uint32_t | Must be less than the size of the DST register buffer | True |
| odst | The index of the tile in DST register buffer to use as output | uint32_t | Must be less than the size of the DST register buffer | True |
template<bool sign_magnitude_format = false>
void ckernel::binary_logical_right_shift_uint32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
void ckernel::binary_logical_right_shift_int32_tile(uint32_t idst0, uint32_t idst1, uint32_t odst)
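A logical right shift treats the 32-bit pattern as unsigned, so zeros enter from the top regardless of the sign of the input. The host-side sketch below illustrates that behavior for Int32 data. The names `logical_shift_right` and `logical_right_shift_model` are hypothetical and not part of the ckernel API, and shift amounts are assumed to be in the range 0..31.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: zero-filling right shift of an Int32 bit pattern.
inline int32_t logical_shift_right(int32_t x, uint32_t s) {
    assert(s < 32);  // assumption: only in-range shift amounts are modeled
    // Reinterpret as unsigned so that >> always shifts in zeros.
    return static_cast<int32_t>(static_cast<uint32_t>(x) >> s);
}

// Models the logical y = x0 >> x1 element by element over two equally
// sized inputs.
inline std::vector<int32_t> logical_right_shift_model(
    const std::vector<int32_t>& x0, const std::vector<int32_t>& x1) {
    assert(x0.size() == x1.size());
    std::vector<int32_t> y(x0.size());
    for (std::size_t i = 0; i < x0.size(); ++i) {
        y[i] = logical_shift_right(x0[i], static_cast<uint32_t>(x1[i]));
    }
    return y;
}
```

With this model, shifting -8 (bit pattern 0xFFFFFFF8) right by 1 yields 0x7FFFFFFC, which is 2147483644, and shifting -1 right by 31 yields 1. Non-negative inputs behave the same as under an ordinary right shift.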