Introduction
TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
Built on top of the TT-MLIR backend, TT-Forge FE is an integral component of the TT-Forge project, which provides a comprehensive suite of tools for optimizing and deploying deep learning models on Tenstorrent hardware.
The main project goals are:
- Provide abstraction of many different frontend frameworks (PyTorch, TensorFlow, ONNX, etc.)
- Compile many kinds of model architectures without custom modification and with great performance (e.g. Transformers, CNNs, etc.)
- Abstract all Tenstorrent device architectures (e.g. Wormhole, Blackhole, etc.)
Architecture Overview
TT-Forge is a comprehensive compiler designed to facilitate the development and optimization of machine learning models. It encompasses various components, each serving a specific purpose in compiling and running machine learning pipelines. This document provides an overview of the key components, with a focus on TT-Forge-FE.
Table of contents
- TT-Forge Overview
- TT-TVM Overview
  - TVM IR (Coming soon!)
  - TVM Compile (Coming soon!)
  - Relay Compile Passes (Coming soon!)
  - Forge Compile Passes (Coming soon!)
  - Partition Graph (Coming soon!)
  - Construct Inputs, Constants and Ops (Coming soon!)
  - Generate Forge-FE Module (Coming soon!)
  - Standalone Forge-FE Module (Coming soon!)
- TT-Forge-FE Overview
  - Initialize Compile (Coming soon!)
  - Generate Initial Graph (TT-TVM) (Coming soon!)
  - Post Initial Graph passes (Coming soon!)
  - Consteval (Coming soon!)
  - Autograd (Coming soon!)
  - Post Autograd (Coming soon!)
  - Pre Lowering (Coming soon!)
  - Graph Split (Coming soon!)
  - Compiler TTIR (Coming soon!)
  - Output Binary (Coming soon!)
Building
This page describes how to build the project on your local machine.
Prerequisites
Main project dependencies are:
- Clang 17
- Ninja
- CMake 3.20 or higher
- Git LFS
- Python 3.10 or higher
On Ubuntu 22.04 systems, you can install these dependencies using the following commands:
# Update package list
sudo apt update -y
sudo apt upgrade -y
# Install Clang
sudo apt install clang-17
# Install Ninja
sudo apt install ninja-build
# Install CMake
sudo apt remove cmake -y
pip3 install cmake --upgrade
cmake --version
# Install Git LFS
sudo apt install git-lfs
# Check Python version
python3 --version
Build environment
This is a one-off step that builds the toolchain and creates the virtual environment for tt-forge. Generally, you need to run this step only once, unless you want to update the toolchain (LLVM).
First, create the toolchain directories. The example below creates them in the default paths; you can change the paths if you want to use different locations (see the Useful build environment variables section below).
# FFE related toolchain (default path)
sudo mkdir -p /opt/ttforge-toolchain
sudo chown -R $USER /opt/ttforge-toolchain
# MLIR related toolchain (default path)
sudo mkdir -p /opt/ttmlir-toolchain
sudo chown -R $USER /opt/ttmlir-toolchain
Build FFE environment:
# Initialize required env vars
source env/activate
# Initialize and update submodules
git submodule update --init --recursive -f
# Build environment
cmake -B env/build env
cmake --build env/build
Build Forge
# Activate virtual environment
source env/activate
# Build Forge
cmake -G Ninja -B build
cmake --build build
You can pass additional options to the cmake command to customize the build. For example, to build everything in debug mode, you can run:
cmake -G Ninja -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
List of commonly used options:
- -DCMAKE_BUILD_TYPE=Debug|Release - Build type (Debug or Release)
- -DTTMLIR_RUNTIME_DEBUG=ON|OFF - Build runtime debug tools (more logging, debug environment flags)
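For example, the two options can be combined in a single configure step (just a usage sketch of the flags listed above):
# Debug build with runtime debug tools enabled
cmake -G Ninja -B build -DCMAKE_BUILD_TYPE=Debug -DTTMLIR_RUNTIME_DEBUG=ON
cmake --build build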
Incremental build
If you have made changes to the C++ sources (of the tt-forge-fe compiler, tt-mlir, or tt-metal), you might want to do an incremental build to save time. This can be done by running the following command:
# If you are not already inside the virtual environment, activate it
source env/activate
cmake --build build -- install_ttforge
This will build the tt-forge-fe C++ sources and the dependencies (tt-mlir, tt-metal) and install them in the virtual environment.
Build docs
To build the documentation, mdbook is required; see the installation guide in the mdbook section under Tools below.
After installing mdbook, run the following commands to build and serve the documentation:
source env/activate
cmake --build build -- docs
# Serve the documentation
mdbook serve build/docs
Note: mdbook serve will by default start a local server at http://localhost:3000.
Note: For a custom port, specify the -p flag, e.g. mdbook serve build/docs -p 5005, and visit http://localhost:5005.
Build Cleanup
To ensure a clean build environment, follow these steps to remove existing build artifacts:
- Clean only Forge FE build artifacts:
  rm -rf build
  Note: This command removes the build directory and all its contents, effectively cleaning up the build artifacts specific to Forge FE.
- Clean all Forge build artifacts:
  ./clean_build.sh
  Note: This script executes a comprehensive cleanup, removing all build artifacts across the entire Forge project, ensuring a clean slate for subsequent builds.
  Note: The clean_build.sh script will not clean toolchain (LLVM) build artifacts and dependencies.
- Clean everything (including the environment):
  ./clean_build.sh
  rm -rf env/build third_party/tt-mlir/env/build
  Note: This should rarely be needed, as it removes the entire build and environment (consequently, the entire toolchain will need to be rebuilt).
Useful build environment variables
- TTMLIR_TOOLCHAIN_DIR - Specifies the directory where TTMLIR dependencies will be installed. Defaults to /opt/ttmlir-toolchain if not defined.
- TTMLIR_VENV_DIR - Specifies the virtual environment directory for TTMLIR. Defaults to /opt/ttmlir-toolchain/venv if not defined.
- TTFORGE_TOOLCHAIN_DIR - Specifies the directory where tt-forge dependencies will be installed. Defaults to /opt/ttforge-toolchain if not defined.
- TTFORGE_VENV_DIR - Specifies the virtual environment directory for tt-forge. Defaults to /opt/ttforge-toolchain/venv if not defined.
- TTFORGE_PYTHON_VERSION - Specifies the Python version to use. Defaults to python3.10 if not defined.
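For example, to build the environment with a custom toolchain location and Python version, you could export the variables before the environment build step. This is a sketch; the paths and Python version shown are illustrative assumptions, not defaults:
# Illustrative overrides; adjust to your setup
export TTFORGE_TOOLCHAIN_DIR=$HOME/ttforge-toolchain
export TTFORGE_VENV_DIR=$HOME/ttforge-toolchain/venv
export TTFORGE_PYTHON_VERSION=python3.11
source env/activate
cmake -B env/build env
cmake --build env/build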
Run tt-forge-fe using Docker image
We provide two Docker images for tt-forge-fe:
- Base Image (ghcr.io/tenstorrent/tt-forge-fe/tt-forge-fe-base-ird-ubuntu-22-04): includes all the necessary preinstalled dependencies.
- Prebuilt Environment Image (ghcr.io/tenstorrent/tt-forge-fe/tt-forge-fe-ird-ubuntu-22-04): also comes with a prebuilt environment, allowing you to skip the environment build step.
Note: To be able to build tt-forge-fe inside the docker containers, make sure to set yourself as the owner of tt-forge-fe and tt-mlir toolchain directories:
sudo chown -R $USER /opt/ttforge-toolchain
sudo chown -R $USER /opt/ttmlir-toolchain
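As a minimal sketch (not an official invocation), launching the prebuilt environment image could look like the following; the device mapping is an assumption and depends on how your Tenstorrent hardware is exposed on the host:
# Assumes the Tenstorrent device is exposed at /dev/tenstorrent
docker run --rm -it --device /dev/tenstorrent \
  ghcr.io/tenstorrent/tt-forge-fe/tt-forge-fe-ird-ubuntu-22-04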
Testing
This page describes how to run different kinds of tests in the tt-forge-fe project. If you haven't built the project yet, please refer to the Build page.
Unit tests
To build the unit tests, run the following command:
cmake --build build -- build_unit_tests
To run the unit tests (this will also build the tests if they are not built):
cmake --build build -- run_unit_tests
Note: The unit tests are built in the build/forge/csrc/test directory. From there, you can run targeted tests directly.
- For example, to run all the tests defined in forge/csrc/test/passes/, use:
  ./build/forge/csrc/test/test_passes
- You can further filter the tests by using the --gtest_filter flag:
  ./build/forge/csrc/test/test_passes --gtest_filter=MMFuseBias/MMFuseBias.mm_fuse_bias/3
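As a general gtest feature (not specific to this project), you can also list all tests compiled into a binary before picking a filter:
# Print the names of all tests in the test_passes binary
./build/forge/csrc/test/test_passes --gtest_list_tests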
End to end tests
For running the end-to-end tests we use the pytest
framework. To run these tests, you need to be on a machine with a Tenstorrent
Wormhole device. Also, we are still in the process of cleaning up the old tests, so not all tests are working. For a list of green
tests, consult pytest.ini
.
Note: Make sure that you have activated the Python environment before running the tests.
To run all tests defined in forge/test/mlir/test_ops.py, use:
pytest -svv forge/test/mlir/test_ops.py
To run a specific test, use the following:
pytest -svv forge/test/mlir/test_ops.py::test_add
- The -svv flag is optional and is used to display more information about the test run.
Tools
This section covers the setup of various tools that can help you with the development of tt-forge-fe.
Pre-commit
We have defined various pre-commit hooks that check the code for formatting, licensing issues, etc.
To install pre-commit, run the following commands:
source env/activate
pip install pre-commit
After installing pre-commit, you can install the hooks by running:
pre-commit install
Now, each time you run git commit, the pre-commit hooks (checks) will be executed.
If you have already committed before installing the pre-commit hooks, you can run them on all files to "catch up":
pre-commit run --all-files
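If you only want to check the files touched by your current work, pre-commit's standard --files option accepts an explicit file list (a generic pre-commit feature; the git invocation is just one way to produce that list):
# Run the hooks only on files changed relative to HEAD
pre-commit run --files $(git diff --name-only HEAD)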
For more information, visit the pre-commit documentation.
mdbook
We use mdbook to generate the documentation. To install mdbook on Ubuntu, run the following commands:
sudo apt install cargo
cargo install mdbook
NOTE: If you don't want to install mdbook via cargo (the Rust package manager), or this doesn't work for you, consult the official mdbook installation guide.
Gather Unique Ops Configuration
The model's unique ops configuration can be gathered, and the results can be printed to the console and saved as a CSV/XLSX file.
- FORGE_EXTRACT_UNIQUE_OP_CONFIG_AT - Setting this flag to one of the following options extracts the model's unique ops configuration at a specific compilation stage or across all stages:
  - ALL - Extracts all the unique ops configurations present in the graph at every compilation stage.
  - GENERATE_INITIAL_GRAPH / POST_INITIAL_GRAPH_PASS / OPTIMIZED_GRAPH / AUTOGRAD / POST_AUTOGRAD_PASS / PRE_LOWERING_GRAPH - Extracts the unique ops configuration only at the specified compilation stage.
- FORGE_PRINT_UNIQUE_OP_CONFIG - Setting this flag to 1 prints all unique configurations to the console.
- FORGE_EXPORT_UNIQUE_OP_CONFIG_FILE_TYPE - Setting this flag to csv or xlsx exports all unique configurations as a CSV or XLSX file. The file can be saved to the default path (i.e., the current directory), or to a specific path by setting the FORGE_EXPORT_UNIQUE_OP_CONFIG_DIR_PATH environment variable.
- FORGE_EXPORT_UNIQUE_OP_CONFIG_CSV_DELIMITER - Sets the delimiter for the CSV file. Default delimiter: slash (/).
Note: The default delimiter in the CSV file is a slash (/) to avoid potential parsing issues, since commas (,) and hyphens (-) may appear in the op shapes and attributes, which could lead to misinterpretation of the data.
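Putting the flags together, a hypothetical run that extracts configurations at every stage and exports them as XLSX into a custom directory could look like this (the test target is borrowed from the Testing section; the output directory is an arbitrary example):
export FORGE_EXTRACT_UNIQUE_OP_CONFIG_AT=ALL
export FORGE_EXPORT_UNIQUE_OP_CONFIG_FILE_TYPE=xlsx
export FORGE_EXPORT_UNIQUE_OP_CONFIG_DIR_PATH=./unique_op_configs  # arbitrary example path
pytest -svv forge/test/mlir/test_ops.py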
Cross Correlate Models and Ops and Export Model Variants Unique Op Configuration
Models and ops can be cross-correlated, and the unique op configurations of model variants exported as an XLSX file, by running the scripts/export_models_ops_correlation.py Python script.
The script will perform the following tasks:
- Run all models up to the compile depth specified by the user.
- Export unique op requirements to a file (each model variant has its own directory, and in that directory each compile depth has its own file).
- Parse those unique op requirements and create an XLSX file that can be loaded into a Google Sheet.
  - The XLSX file will contain a list of models on the X axis (i.e., columns) and a list of ops on the Y axis (i.e., rows/indices).
  - Elements in between will contain a checkmark if the op from the Y axis (i.e., rows/indices) exists in the model on the X axis (i.e., columns).
  - Models will be sorted alphabetically.
  - Ops will be sorted by the number of occurrences in the models.
Usage
To run the script, use the following command:
python scripts/export_models_ops_correlation.py
Required Options:
Option | Description |
---|---|
-c, --compile_depth (GENERATE_INITIAL_GRAPH, PRE_LOWERING_PASS, etc.) | Choose the compilation depth for extracting the ops configuration for the models present in pytest_directory_path. |
-i, --pytest_directory_path | Specify the directory path containing the models to test. |
Optional Options:
Option | Description |
---|---|
--cross_correlation_output_file_name | Specify the output XLSX file name for saving the cross correlation data between model variants and unique ops. |
--models_unique_op_configs_output_file_name | Specify the output XLSX file name for saving the models' unique op configurations. |
-o, --output_directory_path | Specify the output directory path for saving the XLSX/CSV file. |
--export_unique_op_config_file_type (csv, xlsx) | Specify the exported unique op configuration file type. |
Example:
python scripts/export_models_ops_correlation.py --compile_depth GENERATE_INITIAL_GRAPH --pytest_directory_path forge/test/model_demos/high_prio/nlp/pytorch
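A variant using the optional flags from the table above (the output directory and file type are illustrative choices):
python scripts/export_models_ops_correlation.py \
  --compile_depth GENERATE_INITIAL_GRAPH \
  --pytest_directory_path forge/test/model_demos/high_prio/nlp/pytorch \
  --output_directory_path ./correlation_results \
  --export_unique_op_config_file_type xlsx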
How to run standalone MLIR based on generated Forge-FE MLIR graphs
- Change directory to the tt-mlir repo in tt-forge-fe third parties:
  $ cd tt-forge-fe/third_party/tt-mlir
- Build TTRT (once, inside the tt-mlir repo):
  $ pip install patchelf
  $ cmake --build build -- ttrt
- Save the system descriptor artifacts file. For more info, refer to the ttrt docs.
  $ ttrt query --save-artifacts
- Convert TTIR MLIR to TTNN MLIR:
  - Save the TTIR MLIR from the logs into <some_name>_ttir.mlir, e.g. softmax_check_ttir.mlir.
  - The first line of the TTIR MLIR should look like the following:
    module attributes {} {
    Ex. softmax_check_ttir.mlir:
    module attributes {} {
      func.func @forward(%arg0: tensor<13x89x3xf32> {ttir.name = "x"}, %arg1: tensor<13x89x3xf32> {ttir.name = "y"}, %arg2: tensor<1x89x3xf32> {ttir.name = "input_0_multiply_1"}, %arg3: tensor<1x89x3xf32> {ttir.name = "input_0_reciprocal_0"}) -> (tensor<13x89x3xf32> {ttir.name = "ModelConstEvalPass.output_add_3"}) {
        %0 = tensor.empty() : tensor<1x89x3xf32>
        %1 = "ttir.reciprocal"(%arg3, %0) <{operandSegmentSizes = array<i32: 1, 1>, operand_constraints = [#tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>]}> : (tensor<1x89x3xf32>, tensor<1x89x3xf32>) -> tensor<1x89x3xf32>
        %2 = tensor.empty() : tensor<1x89x3xf32>
        %3 = "ttir.multiply"(%arg2, %1, %2) <{operandSegmentSizes = array<i32: 2, 1>, operand_constraints = [#tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>]}> : (tensor<1x89x3xf32>, tensor<1x89x3xf32>, tensor<1x89x3xf32>) -> tensor<1x89x3xf32>
        %4 = tensor.empty() : tensor<13x89x3xf32>
        %5 = "ttir.add"(%arg0, %arg1, %4) <{operandSegmentSizes = array<i32: 2, 1>, operand_constraints = [#tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>]}> : (tensor<13x89x3xf32>, tensor<13x89x3xf32>, tensor<13x89x3xf32>) -> tensor<13x89x3xf32>
        %6 = tensor.empty() : tensor<13x89x3xf32>
        %7 = "ttir.add"(%3, %5, %6) <{operandSegmentSizes = array<i32: 2, 1>, operand_constraints = [#tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>]}> : (tensor<1x89x3xf32>, tensor<13x89x3xf32>, tensor<13x89x3xf32>) -> tensor<13x89x3xf32>
        return %7 : tensor<13x89x3xf32>
      }
    }
  - Generate TTNN MLIR from TTIR MLIR (replace the path to system_desc.ttsys with your corresponding path):
    $ ./build/bin/ttmlir-opt --ttir-load-system-desc="path=/proj_sw/user_dev/akannan/forge/tt-forge-fe/third_party/tt-mlir/ttrt-artifacts/system_desc.ttsys" --ttir-to-ttnn-backend-pipeline softmax_check_ttir.mlir -o softmax_check_ttnn.mlir
- Create Flatbuffers Serialized Binary:
  - Generate a flatbuffer binary from the TTNN MLIR:
    $ ./build/bin/ttmlir-translate --ttnn-to-flatbuffer softmax_check_ttnn.mlir -o softmax_check.ttnn
- Run TTNN Binary:
  $ ttrt run softmax_check.ttnn
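For convenience, here is the whole flow condensed into one sketch, assuming TTRT is already built and the TTIR file sits in the tt-mlir directory (file names reuse the softmax example above; the artifacts path assumes ttrt query saves into ./ttrt-artifacts, consistent with the path shown in the earlier step):
cd tt-forge-fe/third_party/tt-mlir
ttrt query --save-artifacts
./build/bin/ttmlir-opt --ttir-load-system-desc="path=$(pwd)/ttrt-artifacts/system_desc.ttsys" \
  --ttir-to-ttnn-backend-pipeline softmax_check_ttir.mlir -o softmax_check_ttnn.mlir
./build/bin/ttmlir-translate --ttnn-to-flatbuffer softmax_check_ttnn.mlir -o softmax_check.ttnn
ttrt run softmax_check.ttnn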