CI
Our CI infrastructure is hosted in the cloud; cloud machines are provisioned and linked as GitHub runners.
Overview
CI automatically triggers on:
- Pull requests - validates code changes before merging
- Pushes to main - typically when PRs are merged
- Nightly runs - comprehensive testing with all components
- Uplift PRs - special PRs that update tt-metal to the latest version
The CI system automatically collects analytics data from each workflow run, including test results and code coverage. It also publishes the latest documentation to GitHub.
Builds
CI performs several types of builds:
Release Builds
- speedy - optimized for performance and speed
- tracy - includes runtime tracing and debug capabilities with performance measurements
Development Builds
- Debug build - includes unit tests and code coverage collection
- macOS build - ensures cross-platform compatibility
- Wheels - Python package distributions
- Clang-tidy - static code analysis
Release builds include the runtime needed to execute on TT hardware, making them suitable for integration testing. The debug build runs unit tests and generates code coverage reports that are published to Codecov with detailed results linked in PR comments.
Release Build Components
Release builds do more than just compile tt-mlir - they also prepare tests, build tools, and create wheels. Components are configured in .github/settings/build.json:
{ "image": "tracy", "script": "explorer.sh" }
- image: Specifies which release build to use (speedy or tracy)
- script: Build script located in .github/build_scripts/
- if (optional): Links to an optional component - the script builds only when that component is enabled
Before running build scripts, the workflow activates the default TT-MLIR Python venv and sets a number of useful environment variables (a minimal script sketch follows the list):
- WORK_DIR - set to the repo root
- BUILD_DIR - set to the build artifacts directory
- INSTALL_DIR - set to the install artifacts directory
- BUILD_NAME - name of the build image
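For illustration, here is a minimal sketch of what such a build script might look like; the tool name and paths are hypothetical and not part of the repository:
#!/bin/bash
# Minimal hypothetical build script sketch. The workflow has already activated
# the default Python venv and exported WORK_DIR, BUILD_DIR, INSTALL_DIR and
# BUILD_NAME before invoking this script.
set -euo pipefail

cd "$WORK_DIR"
echo "Building optional component for image: $BUILD_NAME"
# e.g. build a Python wheel for a tool that lives in the repo (hypothetical path)
pip wheel "$WORK_DIR/tools/my-tool" -w "$BUILD_DIR/tools/my-tool"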
Uploading Artifacts from Build Scripts
Build scripts can upload their output files as artifacts for later use in testing. This is especially important for optional components that may not always run.
How it works:
- Build scripts write artifact information to a JSON file specified by the $UPLOAD_LIST environment variable
- Each artifact entry contains a name (identifier) and a path (file location)
- The CI system automatically uploads these artifacts after the build completes
Example:
echo "{\"name\":\"ttrt-whl-$BUILD_NAME\",\"path\":\"$WORK_DIR/build/tools/ttrt/build/ttrt*.whl\"}," >> $UPLOAD_LIST
This example uploads Python wheel files with a descriptive name that includes the build type.
Please note the use of >>, as it appends to the existing list.
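For example, a script that produces more than one artifact appends one entry per artifact (the names and paths below are hypothetical):
echo "{\"name\":\"my-tool-$BUILD_NAME\",\"path\":\"$BUILD_DIR/tools/my-tool/my-tool\"}," >> $UPLOAD_LIST
echo "{\"name\":\"my-tool-tests-$BUILD_NAME\",\"path\":\"$BUILD_DIR/tools/my-tool/tests/*\"}," >> $UPLOAD_LIST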
Optional Components
As the codebase grows, CI can become slow and bloated. To keep development efficient, some components are made optional and only run when needed:
When optional components run:
- ✅ Nightly builds (full testing)
- ✅ Uplift PRs (tt-metal version updates)
- ✅ PRs that modify the component's code
- ❌ Regular PRs (unless component files changed)
Good candidates for optional components:
- Mature, stable features that rarely break
- Legacy code that's still supported but not actively developed
- Rarely-used functionality where breakage isn't critical
- Time-intensive wheel builds (when functionality is tested elsewhere)
Configuring Optional Components
Define components in .github/settings/optional-components.yml:
component_name:
- path/to/files/*.py
- specific/file.py
- another/directory/**
The component name can then be referenced in the "if" field of build and test configurations.
Example:
emitc:
- test/ttmlir/EmitC/**
- tools/ttnn-standalone/ci_compile_dylib.py
- include/ttmlir/Conversion/TTNNToEmitC/**
- lib/Conversion/TTNNToEmitC/**
Build example:
{ "image": "speedy", "script": "emitc.sh", "if": "emitc" }
Test example:
{ "runs-on": "n150", "image": "speedy", "script": "emitc.sh", "if": "emitc" }
This makes the EmitC component optional: it builds and tests only when EmitC-related files are modified in a PR (and always on nightly and uplift runs).
Testing
Testing is performed inside the call-test.yml workflow as run-tests jobs. It uses a matrix strategy: multiple jobs are created from the same job definition and executed on multiple machines.
Tests
The tests are defined by a JSON file tests.json inside the .github/settings directory.
Each row in the JSON array represents a test that will execute on a specific machine using a specified (release) build image.
Example:
{ "runs-on": "n150", "image": "tracy", "script": "pykernel.sh" },
{ "runs-on": "n300", "image": "speedy", "script": "ttrt.sh", "args": ["run", "Silicon", "--non-zero"] },
runs-on
Specifies the machine on which the test suite will be executed. Currently supported runners are:
- n150 - Wormhole 1 chip card
- n300 - Wormhole 2 chip card
- llmbox - Loudbox machine with 4 N300 cards
- tg - Galaxy box
- p150 - Blackhole 1 chip card
This list is expected to expand soon as more machines with the Blackhole chip family are added to the runner pool.
image
Specifies which release build image to use. It can be:
- speedy
- tracy
Please take a look at the Builds section for a more detailed description of the builds.
script
The test type: the name of the Bash script that executes the test. Scripts are located in the .github/test_scripts directory,
and new test types can be created simply by adding scripts to that directory.
args (optional)
The arguments for the script. This field can be omitted, a string, or a JSON array.
reqs (optional)
Specifies additional requirements for test execution. These arguments are passed as the REQUIREMENTS environment variable to the test script.
if (optional)
Specifies the name of an optional component. The test is executed only if that optional component is enabled.
Using JSON arrays
The runs-on and image fields can be passed as JSON arrays. With arrays, one can define a test to execute on multiple machines and images. Examples:
{ "runs-on": ["n150","n300"],
"image": ["speedy","tracy"],
"script": "unit" }
Adding New Test
Usually, it is enough to add a single line to the test matrix and your tests will become part of the tt-mlir CI. Here is a checklist of what you should decide before adding it:
- On which TT hardware should your tests run? Put the specific hardware in the "runs-on" field.
- Do your tests run with ttrt or pytest or any other standard type that other tests also use? Put this decision in the "script" field.
- Refer to the chosen test script for how the args and reqs parameters are interpreted.
Each line in the matrix MUST be unique! There is no point in running the same test with the same build image on the same type of hardware.
Consider
Here are a few things to consider:
- Design your ttrt test so it is generated with a check_ttmlir CMake target. These tests are generated at compile time and are available to test jobs.
- For pytest, use pytest test discovery to run all tests in subdirectories. In most cases, there is no need for two sets of tests.
- If you want separate test reports, do not add additional XML file paths and upload steps. Use ${TTRT_REPORT_PATH} (for ttrt JSON files) or ${TEST_REPORT_PATH} (for JUnit XML), because these are automatically picked up and sent to analytics.
- If separate reports are required, treat them as different tests and add an additional line to the test matrix.
You can use a construct from this example:
cp run_results.json ${TTRT_REPORT_PATH%_*}_ttir_${TTRT_REPORT_PATH##*_}
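To illustrate what this bash parameter expansion does, assume TTRT_REPORT_PATH has a name like the one below (the exact naming convention is an assumption made for this example):
# Hypothetical report path used only for this illustration
TTRT_REPORT_PATH="ttrt_report_n150_tracy.json"
# ${TTRT_REPORT_PATH%_*}  drops the last "_" and everything after it
# ${TTRT_REPORT_PATH##*_} keeps only what follows the last "_"
echo "${TTRT_REPORT_PATH%_*}_ttir_${TTRT_REPORT_PATH##*_}"
# prints: ttrt_report_n150_ttir_tracy.json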
Adding New Test Script
To design a new test script, add a new file to .github/test_scripts.
Make sure you set the execution flag on the new test script file (chmod +x <file>)!
Before running test scripts, the workflow activates the default TT-MLIR Python venv and sets a number of useful environment variables:
- WORK_DIR - set to the repo root
- BUILD_DIR - set to the build artifacts directory
- INSTALL_DIR - set to the install artifacts directory
- LD_LIBRARY_PATH - set to the install artifacts lib and toolchain lib directories
- SYSTEM_DESC_PATH - set to the system_desc.ttsys system descriptor file generated by ttrt
- TT_METAL_RUNTIME_ROOT - set to the tt-metal install directory
- RUNS_ON - the machine label the script runs on
- IMAGE_NAME - the name of the image the script runs on
- RUN_ID - the id of the workflow run (see below)
Also, a soft link is created inside the build directory to the install directory.
Please make sure your script implements cleanup logic and leaves the repo in the same state as before execution!
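One simple way to guarantee cleanup, sketched below with a hypothetical scratch directory, is to register a trap that runs even if the test fails:
# Hypothetical scratch directory; removed on exit regardless of test outcome
SCRATCH_DIR="$WORK_DIR/ci_scratch"
trap 'rm -rf "$SCRATCH_DIR"' EXIT
mkdir -p "$SCRATCH_DIR"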
A good practice is to add comments on how the script interprets its arguments (and requirements, if applicable).
For example, builder.sh:
# arg $1: path to pytest test files
# arg $2: pytest marker expression to select tests to run
# arg $3: "run-ttrt" or predefined additional flags for pytest and ttrt
runttrt=""
TTRT_ARGS=""
PYTEST_ARGS=""
[[ "$RUNS_ON" != "n150" ]] && PYTEST_ARGS="$PYTEST_ARGS --require-exact-mesh"
[[ "$RUNS_ON" == "p150" ]] && TTRT_ARGS="$TTRT_ARGS --disable-eth-dispatch"
for flag in $3; do
[[ "$flag" == "run-ttrt" ]] && runttrt=1
[[ "$flag" == "require-opmodel" ]] && PYTEST_ARGS="$PYTEST_ARGS --require-opmodel"
done
pytest "$1" -m "$2" $PYTEST_ARGS -v --junit-xml=$TEST_REPORT_PATH
if [[ "$runttrt" == "1" ]]; then
ttrt run $TTRT_ARGS ttir-builder-artifacts/
cp run_results.json ${TTRT_REPORT_PATH%_*}_ttir_${TTRT_REPORT_PATH##*_} || true
ttrt run $TTRT_ARGS stablehlo-builder-artifacts/
cp run_results.json ${TTRT_REPORT_PATH%_*}_stablehlo_${TTRT_REPORT_PATH##*_} || true
fi
This script has several flags that can be supplied together: the third argument is parsed for run-ttrt and other
possible flags for pytest or ttrt. The test uses TTRT_REPORT_PATH, but because it performs two ttrt runs, it inserts the run type into the filename.
The second example is the pytest.sh script:
if [ -n "$REQUIREMENTS" ]; then
eval "pip install $REQUIREMENTS"
fi
export TT_EXPLORER_GENERATED_MLIR_TEST_DIRS=$BUILD_DIR/test/ttmlir/Silicon/TTNN/n150/perf,$BUILD_DIR/test/python/golden/ttnn
export TT_EXPLORER_GENERATED_TTNN_TEST_DIRS=$BUILD_DIR/test/python/golden/ttnn
pytest -ssv "$@" --junit-xml=$TEST_REPORT_PATH
This script uses $REQUIREMENTS to specify additional wheels to be installed. Note how it uses the eval command to expand bash variables where suitable. It also defines some additional environment variables using the provided ones.
Downloading Artifacts
If you need to download artifacts (e.g., wheels) from a workflow run, you can use the following command:
gh run download $RUN_ID --repo tenstorrent/tt-mlir --name <artifact_name>
This command downloads the specified artifact to the current directory. You can specify a different download location using the --dir <directory> option.
To download multiple artifacts matching a pattern, use the --pattern option instead of --name:
gh run download $RUN_ID --repo tenstorrent/tt-mlir --pattern "tt_*.whl"
Note: When using --pattern, artifacts are downloaded into separate subdirectories, even when --dir is specified.
CI Run (under the hood)
Test runs are prepared in the prepare-run job, where the input test matrix is transformed into the job test matrix used for the actual runs.
All jobs are grouped by their runs-on and image fields and then split (and balanced) into several runs based on test durations and a target total time.
This is done to make efficient use of resources, because many tests last just a few seconds while preparation can take ~4 minutes.
Tests are therefore run in batches in the Run Test step, with clear separation and a summary. The list of tests is displayed at the beginning, and you can search for test <number>
(where number ranges from 1 to the total number of tests) to see the flow and results of a particular test. During development, you can also
comment out tests in the Test Matrix JSON file using the # character in a development branch to make test runs much faster, but
please do not forget to remove the comments when the PR is created or finalized.
Test durations are collected after each push to main and are automatically used in subsequent PR, push, and other runs.