ttnn.dram_prefetcher
- ttnn.dram_prefetcher(tensors: List[ttnn.Tensor], tensor_addrs: ttnn.Tensor, num_layers: int, global_cb: GlobalCircularBuffer, enable_performance_mode: bool | None) → ttnn.Tensor
-
Asynchronously pre-fetch tensors from DRAM into the L1 memory of neighbouring cores. The operation uses a global circular buffer to push data to consumer cores.
- Parameters:
-
tensors (List[ttnn.Tensor]) – A list of tensor objects to be pre-fetched.
tensor_addrs (ttnn.Tensor) – A tensor (row-major layout) that contains the DRAM addresses of the tensors to be pre-fetched. The addresses are ordered layer by layer: [t1_l1, t2_l1, …, t1_l2, t2_l2, …, t1_l3, t2_l3, …], where tN_lM is the address of tensor N in layer M.
num_layers (int) – The number of layers in the pipeline or the model for which tensors need to be pre-fetched.
global_cb (GlobalCircularBuffer) – A global circular buffer object, used internally to manage data movement between the DRAM reader cores and the downstream consumer cores.
enable_performance_mode (bool, optional) – If True, the operation is optimized for performance. May lead to non-deterministic (ND) behavior on Wormhole 4U systems!
- Returns:
-
ttnn.Tensor – An empty tensor (TODO: should return None).
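The address ordering expected by tensor_addrs can be illustrated with a minimal pure-Python sketch. It only shows how per-layer address lists flatten into the [t1_l1, t2_l1, …, t1_l2, t2_l2, …] order; the helper name flatten_layer_addresses and the address values are hypothetical, and converting the resulting list into a row-major ttnn.Tensor is not shown.

```python
from typing import List


def flatten_layer_addresses(addrs_per_layer: List[List[int]]) -> List[int]:
    """Flatten per-layer address lists into layer-major order.

    Hypothetical helper for illustration only; it is not part of ttnn.
    addrs_per_layer[m][n] is the DRAM address of tensor n in layer m.
    """
    if not addrs_per_layer:
        return []
    tensors_per_layer = len(addrs_per_layer[0])
    flat: List[int] = []
    for layer_idx, layer_addrs in enumerate(addrs_per_layer):
        # Every layer is assumed to pre-fetch the same number of tensors.
        if len(layer_addrs) != tensors_per_layer:
            raise ValueError(
                f"layer {layer_idx} has {len(layer_addrs)} addresses, "
                f"expected {tensors_per_layer}"
            )
        # All addresses of one layer come before those of the next layer.
        flat.extend(layer_addrs)
    return flat


# Made-up addresses: 3 layers, 2 tensors per layer.
layer_addrs = [
    [0x1000, 0x2000],  # t1_l1, t2_l1
    [0x3000, 0x4000],  # t1_l2, t2_l2
    [0x5000, 0x6000],  # t1_l3, t2_l3
]
flat_addrs = flatten_layer_addresses(layer_addrs)
num_layers = len(layer_addrs)
```

In this sketch, num_layers matches the number of inner lists, and the flattened list has num_layers times the number of tensors per layer entries.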