ttnn.adaptive_avg_pool2d

ttnn.adaptive_avg_pool2d(input_tensor: ttnn.Tensor, batch_size: int, input_h: int, input_w: int, channels: int, output_size: List[int], *, memory_config: ttnn.MemoryConfig = None, applied_shard_scheme: ttnn.TensorMemoryLayout = None, compute_kernel_config: DeviceComputeKernelConfig = None, deallocate_input: bool = False, reallocate_output: bool = True, queue_id: int = 0) → ttnn.Tensor

Applies experimental adaptive average pooling to the input tensor. Unlike regular pooling, adaptive pooling automatically calculates the kernel size and stride to produce the desired output size. The input tensor is expected to be in [NHW, C] format and should be on the device.
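The kernel-size and stride derivation can be illustrated in one dimension. The sketch below uses the window partition common to adaptive pooling implementations (start = ⌊i·n/out⌋, end = ⌈(i+1)·n/out⌉, as in PyTorch); whether ttnn computes its windows exactly this way is an assumption:

```python
import math

def adaptive_windows(in_size, out_size):
    # Per-output-index (start, end) input windows. For non-divisible
    # sizes the windows vary in width and may overlap.
    return [
        (math.floor(i * in_size / out_size),
         math.ceil((i + 1) * in_size / out_size))
        for i in range(out_size)
    ]

def adaptive_avg_pool1d(values, out_size):
    # Average the values inside each adaptive window (one spatial dim).
    return [
        sum(values[s:e]) / (e - s)
        for s, e in adaptive_windows(len(values), out_size)
    ]

print(adaptive_windows(5, 2))                    # [(0, 3), (2, 5)]
print(adaptive_avg_pool1d([1, 2, 3, 4, 5], 2))   # [2.0, 4.0]
```

Note that no kernel size or stride is passed by the caller: both fall out of the input and output sizes, which is what distinguishes this op from ttnn.avg_pool2d.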

Parameters:
  • input_tensor (ttnn.Tensor) – the tensor to be pooled.

  • batch_size (int) – the number of batches (N in a [N, C, H, W] shaped tensor).

  • input_h (int) – the height of the input tensor (H in a [N, C, H, W] shaped tensor).

  • input_w (int) – the width of the input tensor (W in a [N, C, H, W] shaped tensor).

  • channels (int) – the number of channels (C in a [N, C, H, W] shaped tensor).

  • output_size (List[int]) – the target (h, w) size of the output tensor.

Keyword Arguments:
  • memory_config (ttnn.MemoryConfig, optional) – the memory configuration for the output tensor. Defaults to None.

  • applied_shard_scheme (ttnn.TensorMemoryLayout, optional) – the sharding scheme to apply to a non-pre-sharded input tensor. Defaults to None.

  • compute_kernel_config (DeviceComputeKernelConfig, optional) – the device compute kernel configuration. Defaults to None.

  • deallocate_input (bool, optional) – whether to deallocate the input tensor. Defaults to False.

  • reallocate_output (bool, optional) – whether to reallocate the output tensor. Defaults to True.

  • queue_id (int, optional) – the queue id to use for the operation. Defaults to 0.

Returns:

ttnn.Tensor – the adaptive average pooled output tensor.

Example

>>> import ttnn
>>> import torch
>>> device = ttnn.open_device(device_id=0, l1_small_size=8192)
>>> nchw_shape = (1, 256, 64, 64)
>>> in_N, in_C, in_H, in_W = nchw_shape
>>> input_shape = (1, 1, in_N * in_H * in_W, in_C)
>>> input = torch.randn(nchw_shape, dtype=torch.bfloat16)
>>> input_perm = torch.permute(input, (0, 2, 3, 1)) # this op expects a [N, H, W, C] format
>>> input_reshape = input_perm.reshape(input_shape) # this op expects [1, 1, NHW, C]
>>> tt_input = ttnn.from_torch(input_reshape, device=device)
>>> tt_output = ttnn.adaptive_avg_pool2d(
...     input_tensor=tt_input,
...     batch_size=in_N,
...     input_h=in_H,
...     input_w=in_W,
...     channels=in_C,
...     output_size=[1, 1],  # global adaptive pooling
...     memory_config=None,
...     applied_shard_scheme=ttnn.TensorMemoryLayout.BLOCK_SHARDED,
... )
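Assuming the output keeps the same flattened [1, 1, N*H*W, C] layout as the input (with H and W replaced by the requested output size), the expected output shape can be checked in plain Python; `expected_pool2d_output_shape` is a hypothetical helper, not part of the ttnn API:

```python
def expected_pool2d_output_shape(batch_size, channels, out_h, out_w):
    # Flattened [1, 1, N*out_h*out_w, C] layout, mirroring the
    # [1, 1, NHW, C] input layout this op expects.
    return (1, 1, batch_size * out_h * out_w, channels)

# For the example above: N=1, C=256, output_size=[1, 1]
print(expected_pool2d_output_shape(1, 256, 1, 1))  # (1, 1, 1, 256)
```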