Programming Guide

This page covers compiler options, print debugging, performance tools, the simulator, and examples for TT-Lang operation development.

Compiler Options

Operations accept compiler options that control code generation (e.g., --no-ttl-maximize-dst, --no-ttl-fpu-binary-ops). Options can be passed in three ways: as command-line arguments, through the @ttl.operation decorator’s options= parameter, or through the TTLANG_COMPILER_OPTIONS environment variable. When the same option is set in more than one place, command-line arguments take the highest priority.

# List available options
python examples/elementwise-tutorial/step_4_multinode_grid_auto.py --ttl-help

# Run an operation with options
python examples/elementwise-tutorial/step_4_multinode_grid_auto.py --no-ttl-maximize-dst
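
The environment-variable route can also be driven from a small launcher script. The following is a minimal sketch, not part of TT-Lang: build_invocation is a hypothetical helper, and it assumes that TTLANG_COMPILER_OPTIONS holds a space-separated flag string written the same way as on the command line. Check the compiler options reference for the exact format before relying on it.

```python
import os


def build_invocation(script, cli_flags=(), env_flags=(), base_env=None):
    """Return (argv, env) for launching a TT-Lang script.

    Hypothetical helper for illustration; it is not part of TT-Lang.
    """
    env = dict(os.environ if base_env is None else base_env)
    if env_flags:
        # Assumption: the variable takes space-separated flags written the
        # same way as on the command line.
        env["TTLANG_COMPILER_OPTIONS"] = " ".join(env_flags)
    # Flags placed on the command line take the highest priority.
    argv = ["python", script, *cli_flags]
    return argv, env


argv, env = build_invocation(
    "examples/elementwise-tutorial/step_4_multinode_grid_auto.py",
    env_flags=["--no-ttl-maximize-dst"],
)
# To launch: subprocess.run(argv, env=env, check=True)
```

Keeping defaults in the environment and one-off overrides in cli_flags mirrors the stated priority order, since a command-line flag wins over the same option set elsewhere.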

See the full compiler options reference for all decorator parameters, CompilerOptions flags with their MLIR pass mappings, environment variables, and ttlang-opt pass options.

Performance Tools

TT-Lang includes built-in performance analysis tools for profiling operations on hardware:

  • Perf Summary (TTLANG_PERF_DUMP=1) — NOC traffic and per-kernel wall time breakdown

  • Auto-Profiling (TTLANG_AUTO_PROFILE=1) — automatic per-line cycle count instrumentation

  • User-Defined Signposts (TTLANG_SIGNPOST_PROFILE=1) — targeted cycle counts for ttl.signpost() regions

  • Perfetto Trace Server (TTLANG_PERF_SERV=1) — visualize profiler data in the Perfetto UI

Performance tracing (Tracy) is enabled by default at build time. To disable it, configure with -DTTLANG_ENABLE_PERF_TRACE=OFF.

See the full performance tools reference for environment variable details, valid combinations, and sample output.
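
Because each tool in the list above is switched on by an environment variable, a single run can be profiled by setting that variable for one process only. The following is a minimal sketch under that assumption: profiling_env and its short tool labels are hypothetical names chosen for illustration, and the helper enables one tool per call because the valid combinations are described in the performance tools reference rather than on this page.

```python
import os

# Variable names come from the list above; the short keys are
# illustrative labels, not TT-Lang identifiers.
PERF_VARS = {
    "summary": "TTLANG_PERF_DUMP",
    "auto": "TTLANG_AUTO_PROFILE",
    "signpost": "TTLANG_SIGNPOST_PROFILE",
    "perfetto": "TTLANG_PERF_SERV",
}


def profiling_env(tool, base_env=None):
    """Return a copy of the environment with one perf tool enabled."""
    if tool not in PERF_VARS:
        raise ValueError(
            f"unknown tool {tool!r}; expected one of {sorted(PERF_VARS)}"
        )
    env = dict(os.environ if base_env is None else base_env)
    env[PERF_VARS[tool]] = "1"
    return env


env = profiling_env("summary")
# To launch:
# subprocess.run(
#     ["python", "examples/elementwise-tutorial/step_4_multinode_grid_auto.py"],
#     env=env,
#     check=True,
# )
```

Copying the environment instead of mutating os.environ keeps profiling scoped to the child process, so later runs in the same session are unaffected.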

Simulator

See the Functional Simulator page for running kernels without hardware, debugging setup, and test commands.

Examples

See the examples/ and test/ directories for complete working examples, including:

  • test/python/simple_add.py

  • test/python/simple_fused.py

The tour provides an introduction to TT-Lang features.