ttnn.all_gather
- ttnn.all_gather(input_tensor: ttnn.Tensor, dim: int, cluster_axis: int, mesh_device: MeshDevice, *, num_links: int | None = 1, memory_config: ttnn.MemoryConfig | None = input tensor memory config, num_workers: int | None = None, num_buffers_per_channel: int | None = None, topology: ttnn.Topology | None = ttnn.Topology.Ring) → ttnn.Tensor
-
Performs an all-gather operation on the multi-device input_tensor across all devices.
- Parameters:
-
input_tensor (ttnn.Tensor) – multi-device tensor.
dim (int) – Dimension along which to perform the all-gather.
cluster_axis (int) – For a mesh tensor, the axis of the MeshDevice along which to perform the line all-gather operation.
mesh_device (MeshDevice) – Device mesh on which to perform the line all-gather operation.
The cluster_axis and mesh_device parameters apply only to the Linear topology.
Mesh Tensor Programming Guide: https://github.com/tenstorrent/tt-metal/blob/main/tech_reports/Programming%20Mesh%20of%20Devices/Programming%20Mesh%20of%20Devices%20with%20TT-NN.md
- Keyword Arguments:
-
num_links (int, optional) – Number of links to use for the all-gather operation. Defaults to 1.
memory_config (ttnn.MemoryConfig, optional) – Memory configuration for the operation. Defaults to input tensor memory config.
num_workers (int, optional) – Number of workers to use for the operation. Defaults to None.
num_buffers_per_channel (int, optional) – Number of buffers per channel to use for the operation. Defaults to None.
topology (ttnn.Topology, optional) – The topology configuration to run the operation in. Valid options are Ring and Linear. Defaults to ttnn.Topology.Ring.
- Returns:
-
ttnn.Tensor – the output tensor.
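To make the gather semantics concrete, the following is a minimal pure-Python sketch of what an all-gather along dim produces, assuming each device starts with one shard and ends with the full concatenation. It does not use ttnn; the helper names concat and all_gather_reference are hypothetical and exist only for illustration, and the sketch is limited to 2-D nested lists.

```python
from typing import List

Matrix = List[List[int]]


def concat(shards: List[Matrix], dim: int) -> Matrix:
    """Concatenate 2-D row-major matrices along dim (0 = rows, 1 = columns)."""
    if dim == 0:
        return [list(row) for shard in shards for row in shard]
    if dim == 1:
        num_rows = len(shards[0])
        return [
            [value for shard in shards for value in shard[r]]
            for r in range(num_rows)
        ]
    raise ValueError("this 2-D sketch only supports dim 0 or 1")


def all_gather_reference(shards: List[Matrix], dim: int) -> List[Matrix]:
    """Return one independent copy of the gathered matrix per device."""
    gathered = concat(shards, dim)
    return [[list(row) for row in gathered] for _ in shards]


# Four hypothetical devices, each holding a 1x2 shard.
shards = [[[d * 10, d * 10 + 1]] for d in range(4)]
per_device = all_gather_reference(shards, dim=1)
print(per_device[0])
```

Every entry of per_device is identical after the gather; only the choice of dim changes how the shards are laid out in the result.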
Example
>>> import torch
>>> import ttnn
>>> # input_dtype, layout, mem_config, and ShardTensor2dMesh are assumed to be defined or imported by the caller.
>>> full_tensor = torch.randn([1, 1, 32, 256], dtype=torch.bfloat16)
>>> physical_device_ids = ttnn.get_t3k_physical_device_ids_ring()
>>> mesh_device = ttnn.open_mesh_device(ttnn.MeshShape(1, 8), physical_device_ids=physical_device_ids[:8])
>>> ttnn_tensor = ttnn.from_torch(
...     full_tensor,
...     dtype=input_dtype,
...     device=mesh_device,
...     layout=layout,
...     memory_config=mem_config,
...     mesh_mapper=ShardTensor2dMesh(mesh_device, mesh_shape=(1, 8), dims=(-1, -2)))
>>> ttnn_tensor = ttnn.to_device(ttnn_tensor, mesh_device)
>>> output = ttnn.all_gather(ttnn_tensor, dim=0, topology=ttnn.Topology.Ring)
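The cluster_axis parameter selects which axis of a 2-D device mesh the gather runs along. As a rough illustration, the sketch below lists which devices would exchange data for a given cluster_axis. It assumes the mesh shape is ordered (rows, cols) and that cluster_axis indexes that shape, which is one reading of the parameter description and is not confirmed here; gather_groups is a hypothetical helper, not part of ttnn.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def gather_groups(mesh_shape: Tuple[int, int], cluster_axis: int) -> List[List[Coord]]:
    """Group (row, col) device coordinates that would gather with each other.

    Assumption: cluster_axis=0 gathers along rows of the mesh shape
    (devices sharing a column), cluster_axis=1 gathers along columns
    (devices sharing a row).
    """
    rows, cols = mesh_shape
    if cluster_axis == 0:
        return [[(r, c) for r in range(rows)] for c in range(cols)]
    if cluster_axis == 1:
        return [[(r, c) for c in range(cols)] for r in range(rows)]
    raise ValueError("cluster_axis must be 0 or 1 for a 2-D mesh")


# A hypothetical 2x4 mesh: two groups of four devices for cluster_axis=1.
groups = gather_groups((2, 4), cluster_axis=1)
for group in groups:
    print(group)
```

Under this assumption, a 2x4 mesh with cluster_axis=0 would instead form four independent groups of two devices each.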