Adding a new op to ttir-builder

ttir-builder is designed to create only ops supported in TTIR. At the moment, most but not all ops are supported, and new ops are still occasionally added to TTIR. Adding ttir-builder support for an op means writing a function in tools/ttir-builder/builder.py that creates both the op and its golden counterpart.

TTIR op factories

All ops are created by running their relevant information through the op_proxy function, which provides a general interface for proxying and creating ops.

def op_proxy(
    self,
    op_golden_function: Callable,
    op_ttir_function: Callable,
    inputs: List[Operand],
    unit_attrs: List[str] = None,
    organize_ttir_args: Optional[Callable] = None,
    organize_golden_args: Optional[Callable] = None,
    output_shape: Optional[Shape] = None,
    output_type: Optional[Type] = None,
    output_create_fn: Optional[Callable] = None,
    golden_kwargs: dict = {},
    ttir_kwargs: dict = {},
)

Start by finding the TTIR op you wish to replicate in include/ttmlir/Dialect/TTIR/IR/TTIROps.td or the TTIR dialect documentation.

All op attributes should be exposed as arguments of your function and passed into the proxy function as keyword arguments via ttir_kwargs.

All input operands should be passed into the proxy function through the inputs argument. Output operands are treated as inputs and can optionally be included in inputs if their shape or datatype affects the op's result operand. organize_ttir_args dictates what information gets passed to the autogenerated file build/python_packages/ttmlir/dialects/_ttir_ops_gen.py and can be used when operand arguments require special handling.
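To make the kwargs plumbing concrete, here is a hedged, self-contained sketch (NOT the real builder.py): a toy op_proxy showing how an op attribute flows through ttir_kwargs and golden_kwargs. `toy_op_proxy`, `toy_repeat`, the dict-based "op", and the stand-in golden function are all hypothetical illustrations of the data flow.

```python
def toy_op_proxy(op_golden_function, op_ttir_function, inputs,
                 golden_kwargs={}, ttir_kwargs={}):
    # the real op_proxy constructs an MLIR op from ttir_kwargs and runs the
    # golden function on the inputs' golden tensors; here we only model it
    ttir_op = {"op": op_ttir_function, "operands": inputs, **ttir_kwargs}
    golden = op_golden_function(*inputs, **golden_kwargs)
    return ttir_op, golden

def toy_repeat(in0, repeats: int):
    # the attribute `repeats` is forwarded through both kwargs dicts
    return toy_op_proxy(
        lambda t, repeats: t * repeats,  # stand-in golden: list repetition
        "ttir.RepeatOp",
        [in0],
        golden_kwargs={"repeats": repeats},
        ttir_kwargs={"repeats": repeats},
    )
```

The point is only the shape of the call: op attributes arrive once as a function argument and fan out to both the TTIR op construction and the golden computation.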

Before writing a golden function, you need to know exactly what the TTIR op does to its input data, because you will have to replicate that behavior exactly using PyTorch operations. This information is usually covered in the TTIR documentation; if not, you may have to do some detective work and trial and error yourself.

Writing a golden function is straightforward if PyTorch has a function that performs exactly the same operation. If so, pass that function into the proxy function as op_golden_function, pass any keywords through golden_kwargs, and use organize_golden_args if the input operands differ from those of the TTIR op.

If PyTorch doesn't have an identical operation, your job gets a little harder. Get creative with keyword-argument handling, similar PyTorch operations, or a combination of several operations. Google is your friend: if you need to do something PyTorch doesn't support directly, odds are someone online has run into the same situation.

Example implementation:

    def cbrt(self, in0: Operand, unit_attrs: Optional[List[str]] = None) -> OpView:
        # PyTorch has no cbrt, and torch.pow returns NaN for negative bases
        # with a fractional exponent, so compose sign(x) * |x|^(1/3) and
        # hand both factors to torch.mul through golden_kwargs.
        golden = self._get_golden_tensor(in0)
        golden_sign = torch.sign(golden)
        golden_cbrt = torch.pow(torch.abs(golden), 1 / 3)
        return self.op_proxy(
            torch.mul,
            ttir.CbrtOp,
            [in0],
            golden_kwargs={"input": golden_sign, "other": golden_cbrt},
            organize_golden_args=lambda i: 0,
            unit_attrs=unit_attrs,
        )
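The sign/abs split above is needed because raising a negative base to a fractional exponent is undefined over the reals (torch.pow produces NaN there). A plain-Python sketch of the same identity, with math.pow standing in for torch.pow:

```python
import math

def signed_cbrt(x: float) -> float:
    # sign(x) * |x|^(1/3): a cube root that stays well-defined for
    # negative inputs, mirroring the golden_sign / golden_cbrt split
    sign = (x > 0) - (x < 0)
    return sign * math.pow(abs(x), 1 / 3)
```

For example, signed_cbrt(-8.0) is -2.0 (up to floating-point error), whereas math.pow(-8.0, 1 / 3) raises a ValueError.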

Eltwise operations

Element-wise ops require less specialized handling and call op_proxy through eltwise_proxy.

def eltwise_proxy(
    self,
    op_golden_function: Callable,
    op_ttir_function: Callable,
    inputs: List[Operand],
    unit_attrs: List[str] = None,
)

CCL (collective communication library) ops require GoldenCheckLevel to be set to GRAPH_LEVEL; their dedicated proxy function takes care of this.

def ccl_proxy(
    self,
    op_golden_function: Callable,
    op_ttir_function: Callable,
    inputs: List[Operand],
    kwargs: dict = {},
)

Adding Silicon tests

Silicon tests are created in the test/python/golden directory.

pytest test/python/golden/test_ttir_ops.py

Be sure to file an issue for every failing test and add a pytest mark to any failing or unsupported test. These marks tell CI how to treat the test.

pytest.mark.skip("Issue number") : skip flatbuffer creation for this test
pytest.mark.fails_golden : expect this test to fail the ttrt golden check
pytest.mark.skip_target("ttmetal") : skip ttmetal flatbuffer creation, an op in this test has no support in ttmetal
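For instance, a hedged sketch of how these marks attach to tests (the test names and issue number are placeholders, not real entries in test_ttir_ops.py):

```python
import pytest

@pytest.mark.fails_golden  # runs, but the ttrt golden check is expected to fail
def test_hypothetical_failing_op():
    ...

@pytest.mark.skip("Issue #0000 (placeholder)")  # no flatbuffer is created
def test_hypothetical_unsupported_op():
    ...
```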

For tests exclusive to n300 or llmbox machines, use the following module-level pytest marks, or add the tests to their machine's respective test file.

pytestmark = pytest.mark.n300
pytestmark = pytest.mark.llmbox

Running Silicon tests

The directory test/python/golden contains tests for modules, individual ops, and various machines. Follow these steps to run them:

1. Build ttmlir
source env/activate
cmake -G Ninja -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang-17 -DCMAKE_CXX_COMPILER=clang++-17 -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DTTMLIR_ENABLE_RUNTIME=ON -DTT_RUNTIME_ENABLE_PERF_TRACE=ON
cmake --build build

2. Build ttrt (sample instructions - subject to change)
cmake --build build -- ttrt

3. Query system
ttrt query --save-artifacts

4. Export system desc file (use the path dumped by the previous command)
export SYSTEM_DESC_PATH=/path/to/system_desc.ttsys

5. Generate test cases
pytest test/python/golden/test_ttir_ops.py

6. Run test cases
ttrt run ttnn
ttrt run ttmetal

Sphinx documentation

Docstrings

Sphinx generates documentation for builder ops from the docstrings of TTIRBuilder functions. Follow this structure when writing your docstring:

"""
Creates ``ttir.add``.

*Elementwise addition operation.*

Performs elementwise addition between two tensors.
For each pair of corresponding elements, adds the element in the second
tensor to the element in the first tensor.

Mathematical definition: add(x, y) = x + y

.. code-block:: mlir

    // Add corresponding elements
    %result = ttir.add(%lhs, %rhs, %output) : tensor<3xf32>, tensor<3xf32>, tensor<3xf32> -> tensor<3xf32>
    // Input tensors:
    // lhs: [3.5, 0.0, -1.2]
    // rhs: [1.5, 2.0, -3.2]
    // Output tensor:
    // [5.0, 2.0, -4.4]

Parameters
----------
in0 : Operand
    First input tensor
in1 : Operand
    Second input tensor
unit_attrs : Optional[List[str]], optional
    Optional list of unit attributes

Returns
-------
*OpView*
    A tensor containing the elementwise sum of the inputs
"""

Autogen skip

All functions in TTIRBuilder are included in the documentation by default. If your op is still failing any tests, it can't yet be added to the documentation; custom golden functions must also be excluded. Tag such functions with autodoc_skip.

@autodoc_skip
def bitwise_not(
    self, in0: Operand, unit_attrs: Optional[List[str]] = None
) -> OpView: