ttnn.copy_device_to_host_tensor
ttnn.copy_device_to_host_tensor(device_tensor: ttnn.Tensor, host_tensor: ttnn.Tensor, blocking: bool = True, cq_id: ttnn.QueueId | None = None) -> None
Copies a tensor from device to host.
Parameters:
device_tensor (ttnn.Tensor) – the device tensor to copy from.
host_tensor (ttnn.Tensor) – the host tensor to copy into.
blocking (bool, optional) – whether the call blocks until the copy is complete. Defaults to True.
cq_id (ttnn.QueueId, optional) – the command queue id to use. Defaults to None.
Note
This operation supports tensors with the following data types and layouts:
device/host tensor dtype - layout
BFLOAT16, BFLOAT8_B, BFLOAT4_B, FLOAT32, UINT32, INT32, UINT16, UINT8 - TILE
BFLOAT16, FLOAT32, UINT32, INT32, UINT16, UINT8 - ROW_MAJOR
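The support table above can be encoded as a small lookup for checking a dtype/layout pair before attempting a copy. This is only a minimal sketch in plain Python: `SUPPORTED` and `is_supported` are hypothetical names, not part of ttnn, and dtypes and layouts are represented as strings rather than ttnn enums.

```python
# Hypothetical helper, NOT part of ttnn: encodes the dtype/layout table above.
# Dtypes and layouts are plain strings here, not ttnn enum values.
ROW_MAJOR_DTYPES = frozenset(
    {"BFLOAT16", "FLOAT32", "UINT32", "INT32", "UINT16", "UINT8"}
)

SUPPORTED = {
    # TILE accepts everything ROW_MAJOR does, plus the two block-float formats.
    "TILE": ROW_MAJOR_DTYPES | {"BFLOAT8_B", "BFLOAT4_B"},
    "ROW_MAJOR": ROW_MAJOR_DTYPES,
}


def is_supported(dtype: str, layout: str) -> bool:
    """Return True if the (dtype, layout) pair appears in the support table."""
    return dtype.upper() in SUPPORTED.get(layout.upper(), frozenset())


print(is_supported("BFLOAT8_B", "TILE"))
print(is_supported("BFLOAT8_B", "ROW_MAJOR"))
```

The only difference between the two rows is that BFLOAT8_B and BFLOAT4_B are listed for TILE but not for ROW_MAJOR, which is why the sketch builds the TILE set from the ROW_MAJOR one.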
Memory Support:
Interleaved: DRAM and L1
Height, Width, Block, and ND Sharded: DRAM and L1
Limitations:
The host and device tensors must have the same shape, the same data type, and the same data layout (ROW_MAJOR or TILE).
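The limitation above amounts to three equality checks. The following is a rough sketch of such a pre-flight check, written in plain Python so it runs without a device: `TensorInfo` and `check_copy_compatible` are hypothetical stand-ins introduced for illustration, not ttnn types or functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical stand-in for the tensor properties the copy requires to match."""
    shape: tuple
    dtype: str
    layout: str


def check_copy_compatible(device_t: TensorInfo, host_t: TensorInfo) -> list:
    """Return a list of mismatch descriptions; an empty list means the pair is compatible."""
    problems = []
    for field in ("shape", "dtype", "layout"):
        dev_value = getattr(device_t, field)
        host_value = getattr(host_t, field)
        if dev_value != host_value:
            problems.append(
                f"{field} mismatch: device={dev_value!r}, host={host_value!r}"
            )
    return problems


dev = TensorInfo((2, 3), "BFLOAT16", "TILE")
# A matching host tensor yields no problems.
print(check_copy_compatible(dev, TensorInfo((2, 3), "BFLOAT16", "TILE")))
# A host tensor with a different dtype and layout yields two problems.
print(check_copy_compatible(dev, TensorInfo((2, 3), "FLOAT32", "ROW_MAJOR")))
```

In practice, allocating the host tensor from the device tensor's own spec, as the example on this page does with `ttnn.allocate_tensor_on_host(ttnn_tensor.spec, device)`, is a simple way to keep the two tensors consistent.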
Example
    # Create a TT-NN tensor and copy it to the host.
    # Assumes `device` is an already-opened ttnn device and `logger` is configured.
    ttnn_tensor = ttnn.rand((2, 3), dtype=ttnn.bfloat16, device=device)
    host_tensor = ttnn.allocate_tensor_on_host(ttnn_tensor.spec, device)
    ttnn.copy_device_to_host_tensor(ttnn_tensor, host_tensor)
    logger.info(f"Host tensor shape after copying from device: {host_tensor.shape}")