Implementation Notes
This section collects design documents and pipeline traces that describe how TT-Lang lowers operations from Python to hardware code. These are intended for contributors and anyone who needs to understand compiler internals.
Design Documents
DST Register Allocation — how the TTLAssignDST pass assigns destination registers to tile operations
DST Register Utilization — maximizing tile throughput per DST synchronization cycle
Lowering Pipeline Traces
These documents trace specific operations through the full compiler pipeline, from Python input through MLIR passes to generated C++ kernel code.
Multi-tile Compute Operations — traces a 2x2 multi-tile add through the pipeline