D2MOp - tt-mlir documentation

`d2m.empty` (tt::d2m::EmptyOp)

Empty tensor allocation operation (D2M).

Syntax:

operation ::= `d2m.empty` `(` `)` attr-dict `:` type($result)

Create an uninitialized tensor with the specified shape, element type and encoding.

Interfaces: BufferizableOpInterface, MemoryEffectOpInterface

Effects: MemoryEffects::Effect{MemoryEffects::Allocate on ::mlir::SideEffects::DefaultResource}

Results:

Result	Description
`result`	ranked tensor of any type values

`d2m.full` (tt::d2m::FullOp)

Creates a tensor filled with the specified value (D2M).

Syntax:

operation ::= `d2m.full` attr-dict `:` type($result)

Tensor operation to create a tensor filled with a specified value. Given a shape and a fill_value, produces a tensor with the shape, filled with the specified value.

Traits: AlwaysSpeculatableImplTrait

Interfaces: BufferizableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute	MLIR Type	Description
`shape`	::mlir::DenseI32ArrayAttr	i32 dense array attribute
`fill_value`	::mlir::Attribute	32-bit float attribute or 32-bit signless integer attribute

Results:

Result	Description
`result`	ranked tensor of any type values
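
As a rough illustration of the fill semantics described above, the following Python sketch builds a nested list of a given shape in which every element equals the fill value. The function name `full` and the nested-list representation are assumptions made for illustration only; they are not part of the D2M dialect or of tt-mlir.

```python
from typing import Sequence, Union

Scalar = Union[int, float]


def full(shape: Sequence[int], fill_value: Scalar):
    """Return a nested list with the given shape, every element set to fill_value.

    Hypothetical reference model of the d2m.full description; not the real lowering.
    """
    if any(dim < 0 for dim in shape):
        raise ValueError("shape dimensions must be non-negative")
    if len(shape) == 0:
        # Rank exhausted: the innermost element is the fill value itself.
        return fill_value
    # Build one independent sub-list per index along the leading dimension.
    return [full(shape[1:], fill_value) for _ in range(shape[0])]


tensor = full([2, 3], 1.5)
print(tensor)
```

Each sub-list is built independently, so mutating one row of the result does not affect the others.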

`d2m.generic` (tt::d2m::GenericOp)

Generically dispatch work to a grid of cores (D2M).

Syntax:

operation ::= `d2m.generic` attr-dict `\n`
              ` ` ` ` ` ` ` ` `ins` `(` $inputs (`:` type($inputs)^)? `)` `\n`
              ` ` ` ` ` ` ` ` `outs` `(` $outputs  `:` type($outputs) `)` ` `  $regions (`:`  type($results)^ )?

Same semantics as D2M generic; carries regions for compute/datamovement to be consumed by the metal path.

Traits: AttrSizedOperandSegments, NoTerminator

Interfaces: BufferizableOpInterface, DestinationStyleOpInterface, MemoryEffectOpInterface, OpAsmOpInterface

Attributes:

Attribute	MLIR Type	Description
`grid`	::mlir::tt::ttcore::GridAttr	TT grid attribute
`block_factors`	::mlir::ArrayAttr	64-bit integer array attribute
`indexing_maps`	::mlir::ArrayAttr	AffineMap array attribute
`iterator_types`	::mlir::ArrayAttr
`threads`	::mlir::ArrayAttr

Operands:

Operand	Description
`inputs`	variadic of ranked tensor of any type values or non-0-ranked.memref of any type values
`outputs`	variadic of ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`results`	variadic of ranked tensor of any type values

`d2m.mesh_shard` (tt::d2m::MeshShardOp)

Mesh shard operation (D2M).

Syntax:

operation ::= `d2m.mesh_shard` $input attr-dict `:` type($input) `->` type($result)

The MeshShard op shards the inputs (FullToShard) or concatenates the outputs (ShardToFull) for CCL ops.

Traits: AlwaysSpeculatableImplTrait

Interfaces: BufferizableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute	MLIR Type	Description
`shard_type`	::mlir::tt::ttcore::MeshShardTypeAttr	MeshShard shard_type attribute in TT dialect. Defines the sharded tensor data of the mesh_shard op. Identity: input and output tensors are pre-sharded (same data) and no sharding is required. Replicate: all of the devices have the full tensor (same data). Maximal: one or some of the devices have the full tensor (same data). Devices: all or some of the devices have a sharded (partial) tensor (different data).
`shard_direction`	::mlir::tt::ttcore::MeshShardDirectionAttr	TT MeshShardDirection
`shard_shape`	::mlir::DenseI64ArrayAttr	i64 dense array attribute
`shard_dims`	::mlir::DenseI64ArrayAttr	i64 dense array attribute

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`result`	ranked tensor of any type values or non-0-ranked.memref of any type values
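
The two shard directions described above can be pictured as a split and its inverse. The Python sketch below models only the case where every device holds a different slice of a 2D tensor (the Devices shard type). The function names, the nested-list tensors, and the `splits` parameter (number of pieces along each dimension) are hypothetical; how they correspond to `shard_shape` and `shard_dims` is an assumption that this page does not confirm.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]
Coord = Tuple[int, int]


def full_to_shard(full: Matrix, splits: Tuple[int, int]) -> Dict[Coord, Matrix]:
    """Split a 2D tensor into splits[0] x splits[1] equal shards keyed by device coord."""
    rows, cols = len(full), len(full[0])
    sy, sx = splits
    if rows % sy or cols % sx:
        raise ValueError("tensor dims must divide evenly by splits")
    h, w = rows // sy, cols // sx
    return {
        (y, x): [row[x * w:(x + 1) * w] for row in full[y * h:(y + 1) * h]]
        for y in range(sy)
        for x in range(sx)
    }


def shard_to_full(shards: Dict[Coord, Matrix], splits: Tuple[int, int]) -> Matrix:
    """Concatenate the shards back into the full 2D tensor (inverse of full_to_shard)."""
    sy, sx = splits
    full: Matrix = []
    for y in range(sy):
        shard_rows = len(shards[(y, 0)])
        for r in range(shard_rows):
            row: List[int] = []
            for x in range(sx):
                row.extend(shards[(y, x)][r])
            full.append(row)
    return full


tensor = [[0, 1, 2, 3], [4, 5, 6, 7]]
pieces = full_to_shard(tensor, (1, 2))
print(pieces)
print(shard_to_full(pieces, (1, 2)) == tensor)
```

Under these assumptions the round trip through both directions reproduces the original tensor.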

`d2m.stream_layout` (tt::d2m::StreamLayoutOp)

Stream layout (D2M)

Represent a streaming relationship between a source tensor/memref and a storage buffer, producing a view result.

Traits: AlwaysSpeculatableImplTrait

Interfaces: BufferizableOpInterface, ConditionallySpeculatable, D2M_ViewOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values
`storage`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`result`	ranked tensor of any type values or non-0-ranked.memref of any type values

`d2m.to_device` (tt::d2m::ToDeviceOp)

Transfer data from host to device.

Syntax:

operation ::= `d2m.to_device` $input `,` $output `layout` `=` $layout attr-dict `:` type($input) `into` type($output) (`->` type($results)^)?

ToDevice operation transfers tensor data from host (system) memory to device memory. This is the explicit host-to-device data movement operation that replaces the overloaded system transfer case of ToLayoutOp.

The layout attribute stores the device layout information needed for the transfer.

#layout = #ttcore.metal_layout<8192x128x1, undef, <1x1>, memref<64x128xf32, #l1_>>
%1 = d2m.to_device %arg0, %0 layout = #layout : tensor<64x128xf32> into tensor<64x128xf32, #layout>

Interfaces: BufferizableOpInterface, MemoryEffectOpInterface

Attributes:

Attribute	MLIR Type	Description
`layout`	::mlir::tt::ttcore::MetalLayoutAttr	Tensor layout attribute with explicit physical shape

The tensor layout attribute captures how tensor data is sharded across a grid of devices/cores and is laid out in memory. Note that the presence of this attribute implies that the tensor shape includes sharding (i.e. the first half of the tensor shape represents the grid shape).

Some high level goals:
  - **Logical shapes**: Store the original tensor shape and rank intact and agnostic
    to underlying storage layout.
    Keeping the logical shapes not only makes some graph transformations vastly
    simpler, in particular convs, but it makes the lowered IR much easier to read
    and reason about.  The original tensor shapes leave breadcrumbs that make it
    much easier to map back to the input representation.
  - **Collapsed dims**: We may collapse dimensions during transformation, but it
    is important we capture this information such that it is not lost during tensor
    transformation.  The collapsed_intervals field stores the collapses performed
    during conversion from logical_shape to physical tensor shape.
  - **Padding**: Store the desired alignments so that padding can be simply encoded;
    the dim_alignments field represents alignment along each logical dim during collapse.
  - **Memref translation**: Ensure we have all necessary info so that we can trivially
    lower a tensor into a memref without any intermediate passes.

For a logical tensor of shape [H, W] distributed across a grid [GY, GX], the tensor shape would be:
- Without tiling: [GY, GX, H/GY, W/GX]
- With tiling: [GY, GX, H/GY/TH, W/GX/TW, TH, TW] where TH,TW are tile dimensions

This makes the representation 1:1 with memrefs and eliminates the need for shape conversion passes.

Examples:
```mlir
// Logical 8x300 tensor distributed across 1x2 grid:
// tensor<1x2x8x150xf32, #ttcore.metal_layout<logical_shape=8x300, ...>>

// Logical 1024x1024 tensor distributed across 2x2 grid with 32x32 tiles:
// tensor<2x2x16x16x!ttcore.tile<32x32xf32>, #ttcore.metal_layout<logical_shape=1024x1024, ...>>
```
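
The shape rule above is plain integer arithmetic, so it can be checked with a short Python sketch. The helper name `sharded_shape` is hypothetical, and the sketch assumes every dimension divides evenly by the grid and tile sizes, ignoring the padding and collapsing that `dim_alignments` and `collapsed_intervals` would describe.

```python
from typing import List, Optional, Sequence


def sharded_shape(
    logical_shape: Sequence[int],
    grid: Sequence[int],
    tile: Optional[Sequence[int]] = None,
) -> List[int]:
    """Return [grid..., shard...] or, with tiling, [grid..., shard/tile..., tile...]."""
    if len(logical_shape) != len(grid):
        raise ValueError("logical_shape and grid must have the same rank")
    shard = []
    for i, (dim, cores) in enumerate(zip(logical_shape, grid)):
        if dim % cores != 0:
            raise ValueError(f"dim {i}: {dim} is not divisible by grid size {cores}")
        shard.append(dim // cores)
    if tile is None:
        # Without tiling: [GY, GX, H/GY, W/GX]
        return [*grid, *shard]
    if len(tile) != len(shard):
        raise ValueError("tile must have the same rank as the shard")
    tiled = []
    for i, (dim, t) in enumerate(zip(shard, tile)):
        if dim % t != 0:
            raise ValueError(f"dim {i}: shard {dim} is not divisible by tile size {t}")
        tiled.append(dim // t)
    # With tiling: [GY, GX, H/GY/TH, W/GX/TW, TH, TW]
    return [*grid, *tiled, *tile]


print(sharded_shape([8, 300], [1, 2]))
print(sharded_shape([1024, 1024], [2, 2], tile=[32, 32]))
```

Note that the MLIR examples above carry the tile in the element type (`!ttcore.tile<32x32xf32>`), so the printed tensor shape there shows only the first four of the six dimensions this sketch returns for the tiled case.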

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values
`output`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`results`	variadic of ranked tensor of any type values

`d2m.to_host` (tt::d2m::ToHostOp)

Transfer data from device to host.

Syntax:

operation ::= `d2m.to_host` $input `,` $output `layout` `=` $layout attr-dict `:` type($input) `into` type($output) (`->` type($results)^)?

ToHost operation transfers tensor data from device memory to host (system) memory. This is the explicit device-to-host data movement operation that replaces the overloaded system transfer case of ToLayoutOp.

The layout attribute stores the device layout information needed for the transfer.

#layout = #ttcore.metal_layout<8192x128x1, undef, <1x1>, memref<64x128xf32, #l1_>>
%1 = d2m.to_host %arg0, %0 layout = #layout : tensor<64x128xf32, #layout> into tensor<64x128xf32>

Interfaces: BufferizableOpInterface, MemoryEffectOpInterface

Attributes:

Attribute	MLIR Type	Description
`layout`	::mlir::tt::ttcore::MetalLayoutAttr	Tensor layout attribute with explicit physical shape

The tensor layout attribute captures how tensor data is sharded across a grid of devices/cores and is laid out in memory. Note that the presence of this attribute implies that the tensor shape includes sharding (i.e. the first half of the tensor shape represents the grid shape).

Some high level goals:
  - **Logical shapes**: Store the original tensor shape and rank intact and agnostic
    to underlying storage layout.
    Keeping the logical shapes not only makes some graph transformations vastly
    simpler, in particular convs, but it makes the lowered IR much easier to read
    and reason about.  The original tensor shapes leave breadcrumbs that make it
    much easier to map back to the input representation.
  - **Collapsed dims**: We may collapse dimensions during transformation, but it
    is important we capture this information such that it is not lost during tensor
    transformation.  The collapsed_intervals field stores the collapses performed
    during conversion from logical_shape to physical tensor shape.
  - **Padding**: Store the desired alignments so that padding can be simply encoded;
    the dim_alignments field represents alignment along each logical dim during collapse.
  - **Memref translation**: Ensure we have all necessary info so that we can trivially
    lower a tensor into a memref without any intermediate passes.

For a logical tensor of shape [H, W] distributed across a grid [GY, GX], the tensor shape would be:
- Without tiling: [GY, GX, H/GY, W/GX]
- With tiling: [GY, GX, H/GY/TH, W/GX/TW, TH, TW] where TH,TW are tile dimensions

This makes the representation 1:1 with memrefs and eliminates the need for shape conversion passes.

Examples:
```mlir
// Logical 8x300 tensor distributed across 1x2 grid:
// tensor<1x2x8x150xf32, #ttcore.metal_layout<logical_shape=8x300, ...>>

// Logical 1024x1024 tensor distributed across 2x2 grid with 32x32 tiles:
// tensor<2x2x16x16x!ttcore.tile<32x32xf32>, #ttcore.metal_layout<logical_shape=1024x1024, ...>>
```
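
Going the other way, from a device tensor shape back to its logical shape, may help when reading lowered IR. The Python sketch below splits a sharded shape into its grid half and its shard half and multiplies them back together. The helper names are hypothetical, the tile size is passed separately because the examples above carry it in the element type, and the sketch assumes no padding or collapsed dimensions.

```python
from typing import List, Optional, Sequence, Tuple


def split_sharded_shape(shape: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split a sharded tensor shape into (grid_shape, shard_shape) halves."""
    if len(shape) % 2 != 0:
        raise ValueError("a sharded shape must have even rank")
    half = len(shape) // 2
    return list(shape[:half]), list(shape[half:])


def logical_shape(
    shape: Sequence[int], tile: Optional[Sequence[int]] = None
) -> List[int]:
    """Recover the logical shape, assuming even division and no padding."""
    grid, shard = split_sharded_shape(shape)
    # Treat an untiled tensor as having 1x1 tiles so one formula covers both cases.
    tile_dims = list(tile) if tile is not None else [1] * len(shard)
    if len(tile_dims) != len(shard):
        raise ValueError("tile must have the same rank as the shard")
    return [g * s * t for g, s, t in zip(grid, shard, tile_dims)]


print(split_sharded_shape([1, 2, 8, 150]))
print(logical_shape([1, 2, 8, 150]))
print(logical_shape([2, 2, 16, 16], tile=[32, 32]))
```

Both calls reproduce the logical shapes named in the examples above, which is a quick consistency check on the rule rather than a description of how the compiler stores `logical_shape`.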

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values
`output`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`results`	variadic of ranked tensor of any type values

`d2m.to_layout` (tt::d2m::ToLayoutOp)

Layout op.

Syntax:

operation ::= `d2m.to_layout` $input `,` $output `:` type($input) `into` type($output) attr-dict (`->` type($results)^)?

The ToLayout operation transitions tensors from one layout to another. Some examples include:

- Transitioning between different memory spaces, e.g. DRAM to L1.
- Transitioning between different data types, e.g. f32 to f16.
- Transitioning between different tile sizes, e.g. 1x16 to 32x32.
- Transitioning between different tensor sharding.
- Some combination of the above.

#layout = #ttcore.metal_layout<8192x128x1, undef, <1x1>, memref<64x128xf32, #system>>
#layout1 = #ttcore.metal_layout<8192x128x1, undef, <1x1>, memref<64x128xf32, #l1_>>
%1 = "d2m.to_layout"(%arg0, %0) : (tensor<64x128xf32, #layout>, tensor<64x128xf32, #layout1>) -> tensor<64x128xf32, #layout1>

Interfaces: BufferizableOpInterface, MemoryEffectOpInterface

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values
`output`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`results`	variadic of ranked tensor of any type values

`d2m.view_layout` (tt::d2m::ViewLayoutOp)

View Layout op (D2M subset)

Syntax:

operation ::= `d2m.view_layout` $input attr-dict `:` type($input) `->` type($result)

Create a representational view of a tensor/memref with a different layout. This is a no-op for codegen; consumers are expected to compose layouts.

Traits: AlwaysSpeculatableImplTrait

Interfaces: BufferizableOpInterface, ConditionallySpeculatable, D2M_ViewOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

Attribute	MLIR Type	Description
`reinterpretLayout`	::mlir::BoolAttr	bool attribute

Operands:

Operand	Description
`input`	ranked tensor of any type values or non-0-ranked.memref of any type values

Results:

Result	Description
`result`	ranked tensor of any type values or non-0-ranked.memref of any type values