compute_kernel_hw_startup
-
template<SrcOrder src_order = SrcOrder::Regular>
void ckernel::compute_kernel_hw_startup(uint32_t icb0, uint32_t icb1, uint32_t ocb)
-
Performs the required hardware initialization for all subsequent operations in the compute kernel. This function should be called exactly once at the very beginning of the kernel, before any operation-specific initialization functions (such as reduce_init, tilize_init, etc.). The circular buffer (CB) IDs provided to this function must match those used in the next operation-specific initialization function. If the operands for the next operation require a different data format than what was configured here, you must call one of the reconfig_data_format functions before proceeding with the next initialization. Similarly, if the next operation requires different properties (such as tile or face dimensions), you must ensure that the same CB IDs are used as in this function.
The src_order template parameter selects how (icb0, icb1) map onto SrcA/SrcB; this is the single piece of operation-specific knowledge startup needs, because the per-source-register state it programs (formats, tile/face dimensions, tile sizes) depends on that mapping. Use SrcOrder::Regular for all operations except matmul, which must use SrcOrder::Reverse (see the SrcOrder documentation). The (icb0, icb1) arguments are always passed in natural operand order (in0, in1) regardless of the tag.
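The operand mapping described above can be pictured with a small host-side sketch. It is not the real implementation: `SrcMapping` and `map_operands` are hypothetical names introduced here, the `SrcOrder` enum is a local stand-in for the real tag, and the sketch assumes that Regular sends (icb0, icb1) to (SrcA, SrcB) while Reverse simply swaps the two.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the real SrcOrder tag (only the two documented values).
enum class SrcOrder { Regular, Reverse };

// Hypothetical record of which CB feeds each source register.
struct SrcMapping {
    uint32_t src_a_cb;
    uint32_t src_b_cb;
};

// Assumption: Regular maps (icb0, icb1) -> (SrcA, SrcB); Reverse swaps them.
// Callers always pass operands in natural order (in0, in1).
template <SrcOrder src_order = SrcOrder::Regular>
SrcMapping map_operands(uint32_t icb0, uint32_t icb1) {
    assert(icb0 <= 31 && icb1 <= 31);  // documented valid range is 0 to 31
    return src_order == SrcOrder::Reverse ? SrcMapping{icb1, icb0}
                                          : SrcMapping{icb0, icb1};
}
```

Under these assumptions, a matmul-style call such as `map_operands<SrcOrder::Reverse>(in0, in1)` keeps the arguments in natural order and lets the tag decide which source register each CB configures.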
NOTE: This function performs MMIO writes, which are slow and almost always require the execution units being configured (PACK, MATH, UNPACK, CFG, etc.) to be idle. It is therefore unsafe to call this function in the middle of kernel execution. Call it only once at the beginning of the kernel, before any other Compute API call (init or otherwise). Calling it after other API calls may cause race conditions and undefined behavior that can be hard to debug.
Return value: None
| Param Type | Name      | Description                                                      | Type     | Valid Range | Required |
|------------|-----------|------------------------------------------------------------------|----------|-------------|----------|
| Template   | src_order | How icb0/icb1 map onto SrcA/SrcB (Regular or Reverse)            | SrcOrder | N/A         | False    |
| Function   | icb0      | The identifier of the circular buffer (CB) containing operand A  | uint32_t | 0 to 31     | True     |
| Function   | icb1      | The identifier of the circular buffer (CB) containing operand B  | uint32_t | 0 to 31     | True     |
| Function   | ocb       | The identifier of the output circular buffer (CB)                | uint32_t | 0 to 31     | True     |
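The calling contract for the three-parameter version (call once, call first, and reuse the same CB IDs in the next operation-specific init) can be modeled on the host as a tiny state machine. This is an illustrative sketch only, not device code: `StartupModel`, `hw_startup`, `first_op_init`, and `demo_call_order` are hypothetical names, and the reconfig_data_format path is deliberately not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of the documented calling contract; NOT the real device API.
struct StartupModel {
    bool started = false;
    uint32_t icb0 = 0, icb1 = 0, ocb = 0;

    // Models compute_kernel_hw_startup: legal only once, with CB IDs in 0 to 31.
    bool hw_startup(uint32_t in0, uint32_t in1, uint32_t out) {
        if (started) return false;                           // second call violates the contract
        if (in0 > 31 || in1 > 31 || out > 31) return false;  // outside the documented range
        started = true;
        icb0 = in0;
        icb1 = in1;
        ocb = out;
        return true;
    }

    // Models the next operation-specific init (hypothetical name):
    // it must follow startup and use the same CB IDs.
    bool first_op_init(uint32_t in0, uint32_t in1, uint32_t out) const {
        return started && in0 == icb0 && in1 == icb1 && out == ocb;
    }
};

// Walks through one legal sequence and three contract violations.
inline void demo_call_order() {
    StartupModel m;
    assert(!m.first_op_init(0, 1, 16));  // init before startup: rejected
    assert(m.hw_startup(0, 1, 16));      // first startup call: accepted
    assert(m.first_op_init(0, 1, 16));   // same CB IDs: accepted
    assert(!m.first_op_init(0, 2, 16));  // different input CB: rejected
    assert(!m.hw_startup(0, 1, 16));     // startup called twice: rejected
}
```

On real hardware these violations are not reported as return values; as the note above says, they may surface as race conditions or undefined behavior, which is why the model makes them explicit.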
-
void ckernel::compute_kernel_hw_startup(uint32_t icb0, uint32_t ocb)
-
Convenience overload for hardware initialization when only one input circular buffer is used. Both input operands (SrcA and SrcB) will be programmed using the same circular buffer identifier (icb0). Internally, this calls the three-parameter version with icb0 passed for both input operands.

| Param Type | Name | Description                                                             | Type     | Valid Range | Required |
|------------|------|-------------------------------------------------------------------------|----------|-------------|----------|
| Function   | icb0 | The identifier of the circular buffer (CB) used for both input operands | uint32_t | 0 to 31     | True     |
| Function   | ocb  | The identifier of the output circular buffer (CB)                       | uint32_t | 0 to 31     | True     |
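The delegation performed by the two-parameter overload can be sketched as follows. The functions below are stand-ins, not the real API: `hw_startup_sketch` and `StartupArgs` are hypothetical names, and the three-parameter stand-in only records its arguments instead of programming any hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record of the arguments the three-parameter version received.
struct StartupArgs {
    uint32_t icb0;
    uint32_t icb1;
    uint32_t ocb;
};

// Stand-in for the three-parameter version: checks the range and records the IDs.
inline StartupArgs hw_startup_sketch(uint32_t icb0, uint32_t icb1, uint32_t ocb) {
    assert(icb0 <= 31 && icb1 <= 31 && ocb <= 31);  // documented valid range is 0 to 31
    return StartupArgs{icb0, icb1, ocb};
}

// Stand-in for the two-parameter overload: forwards icb0 for both input operands.
inline StartupArgs hw_startup_sketch(uint32_t icb0, uint32_t ocb) {
    return hw_startup_sketch(icb0, icb0, ocb);
}
```

With this shape, `hw_startup_sketch(3, 16)` and `hw_startup_sketch(3, 3, 16)` yield the same recorded arguments, which matches the documented behavior of the convenience overload.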