Community · Open Source · Tenstorrent Ecosystem

A hidden dimension of Tenstorrent awesomeness

A curated directory of projects, tools, models, and research for Tenstorrent hardware — contributed by the community and our team. Browse by category or search across all entries.

107 Projects
12 Categories
Open Source
🚀 Getting Started
The essential first steps — installer, core SDKs, and guided onboarding
tt-metal
TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK for devel…
🤖 AI & Models
Running, serving, and experimenting with AI models
tt-bio
Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-card and mul…
🕵️ AI Agents
Agentic systems and AI assistants running on TT hardware
dstack
Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA, AMD, TPU…
⚙️ Custom Kernels & Low-Level
Metalium/tt-lang kernel authoring; anything sub-compiler
tt-tiny
Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George Hotz's ex…
🔨 Compilers & Frontends
Getting PyTorch/JAX/ONNX/CUDA models onto TT hardware
BarraCUDA
Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Compiles .cu f…
🛠 Dev Tools & Debugging
Profiling, visualization, and debugging workloads
nvtop
htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Intel, NVIDIA,…
🖥 Hardware & System
Drivers, firmware, monitoring, and hardware management
tt-kmd
Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenstorrent PCI…
☁️ Cloud & Orchestration
Kubernetes, cloud deployment, and multi-node infrastructure
TT Console
Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, image and v…
🔩 RISC-V & Architecture
ISA, simulation, and running Linux on TT silicon
tt-bh-linux
Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux kernel on th…
🔬 Research & Papers
Academic papers, theses, and HPC experiments
tt-tutorial (HPC)
Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Edinburgh/EP…
🎮 Games & Demos
Creative, playful, and proof-of-concept projects
tt-zork-and-more
A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at least four di…
📚 Guides, Tutorials & Education
Getting-started content, blog posts, lessons, courses
Programming Tenstorrent Processors
Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buffers, kerne…
🚀 Getting Started
nvtop community 10739⭐
htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Inte…
dstack community 2160⭐
Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA…
BarraCUDA community 1697⭐
Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Comp…
tt-tiny community 66⭐
Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George…
tt-sim community 13⭐
Community-built Tenstorrent architecture simulator written in Python. Runs without hardwar…
tt-iree community 12⭐
IREE (Intermediate Representation Execution Environment) ML compiler ported to Tenstorrent…
triton-tenstorrent community 11⭐
OpenAI Triton compiler plugin for Tenstorrent hardware. Write Triton kernels and target Te…
bhx community 4⭐
Boot stock Linux cloud images on the SiFive X280 RISC-V cores inside Tenstorrent Blackhole…
tt-bio community
Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-ca…
· Jan 31, 2026
Programming Tenstorrent Processors community
Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buff…
· Apr 21, 2025
Tenstorrent SFPU Kernel Series — Jason Davies community
Sponsored series of deep technical articles on implementing optimal SFPU kernels for the T…
· Nov 12, 2025
Tenstorrent Blackhole Architecture Guide community
A 6,500-word community deep dive into the Blackhole p100a architecture: the tile model (Te…
· Feb 28, 2026
grayskull-attention community 38⭐
FlashAttention-style attention kernel implemented entirely in on-chip SRAM on the Tenstorr…
tt-twitch community 28⭐
A Tenstorrent Grayskull kernel written live on Twitch by George Hotz. 120-core grid demons…
koyeb/tenstorrent-examples community 18⭐
Example applications and deployment configurations for running AI workloads on Tenstorrent…
blackhole-py community 14⭐
Pure Python driver for Tenstorrent Blackhole cards providing direct low-level hardware acc…
tenstorrent-tiny-examples community 14⭐
Simple C++ kernel experiments on a GraySkull e75 chip. Hands-on examples for learning the …
ttnn-helloworld-cpp community 14⭐
Minimal working example of using Tenstorrent TTNN in C++. The simplest possible starting p…
TT-GoL community 12⭐
Conway's Game of Life implemented on Tenstorrent hardware using TT-Metal kernels.
ttMandelbrot community 7⭐
Mandelbrot Set fractal renderer running on Tenstorrent hardware. A classic demo showcasing…
TT-Metal Mini Template community 7⭐
Minimal working CMake project template for starting a new TT-Metal project from scratch. G…
tt-tutorial (HPC) community 7⭐
Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Ed…
ttPEAK community 6⭐
clpeak-style peak-performance benchmark for Tenstorrent devices using TT-Metalium. Measure…
tenstorrent.nix community 6⭐
Nix flake packaging the Tenstorrent software stack for NixOS and Nix users. Reproducible, …
current community 5⭐
High-level parallel programming framework for Tenstorrent accelerators, abstracting TT-Met…
ttVecAdd community 5⭐
Minimal vector-addition example on Tenstorrent devices using TT-Metalium. A clean hello-wo…
ttas community 4⭐
ttas is a hacker-friendly assembler/disassembler for Tensix on Wormhole. It turns assembly…
tt-tutorial (Korean) community 4⭐
Comprehensive tutorials for the Tenstorrent software stack in Korean. Jupyter notebooks co…
Collective Operations on Wormhole n150 (Sapienza University of Rome) community 4⭐
Master's thesis implementing and benchmarking five allreduce algorithms (Swing, Recursive …
libtt-metal-cxx community 2⭐
Rust crate that exposes the TT-Metal host API through a C++ bridge via cxx.rs — covering d…
gsplat_tt community 1⭐
Port of Gaussian Splatting (3D scene reconstruction from 2D images) to Tenstorrent hardwar…
A Gentle Guide: Tenstorrent Card on Arch Linux with Metalium community
Step-by-step guide to getting a Tenstorrent card running on Arch Linux with the full Metal…
· Jul 7, 2024
Thoughts and Logs After Messing with Tenstorrent Grayskull community
Honest field notes from getting a Grayskull card running and writing first Metalium kernel…
· Jun 2, 2024
Tenstorrent Architecture — W&M CSCI654 Advanced Computer Architecture community
Lecture 20 from William & Mary's graduate Computer Architecture course. Frames Tenstorrent…
· Oct 9, 2024
Attention in SRAM on Tenstorrent Grayskull community
A fused kernel for the Grayskull architecture implementing Transformer self-attention enti…
· Jul 18, 2024
Exploring Fast Fourier Transforms on the Tenstorrent Wormhole community
Ports the Cooley-Tukey FFT algorithm to the Wormhole n300 RISC-V accelerator. The Wormhole…
· Jun 18, 2025
Assessing Tenstorrent Grayskull RISC-V MatMul Acceleration for LLMs community
Evaluates the Tenstorrent Grayskull e75 RISC-V accelerator for matrix multiplication at re…
· May 9, 2025
Porting Strategies for Gravitational N-Body Simulations on Tenstorrent Wormhole community
Evaluates three strategies for scaling an N-body code across multiple Tenstorrent Wormhole…
· May 4, 2026
Accelerating Gravitational N-Body Simulations on Tenstorrent Wormhole community
Accelerates an astrophysical N-body simulation on the Wormhole n300. Achieves 2× speedup a…
Nov 16, 2025
Numerical Kernels on a Spatial Accelerator: Tenstorrent Wormhole community
Implements three numerical kernels and composes them into a conjugate gradient solver on W…
Mar 24, 2026
Accelerating Stencils on the Tenstorrent Grayskull RISC-V Accelerator community
Explores stencil computation on the Grayskull PCIe RISC-V accelerator. Early academic work…
Sep 27, 2024
Stencil Computations on Tenstorrent Wormhole community
Maps 2D 5-point stencil computations onto the Tenstorrent Wormhole RISC-V AI dataflow acce…
May 8, 2026
SwiftNPU: Scalable Shape-Flexible Allocation for Inter-Core Connected NPUs community
Makes multi-tenant NPU sharing practical for Blackhole-class hardware using polynomial-tim…
Apr 27, 2026
TileLoom: Automatic Dataflow Planning for Spatial Dataflow Accelerators community
Compiler system that automatically generates efficient dataflow plans for tile-based langu…
· Dec 17, 2025
Rewriting TTS Inference Economics: Lightning V2 on Tenstorrent vs. NVIDIA L40S community
Shows that Text-to-Speech inference on Tenstorrent Lightning V2 achieves 4× lower cost tha…
· Mar 24, 2026
tt-zork-and-more affiliated 2⭐
A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at lea…
Local AI Agents on Tenstorrent affiliated
Three agentic projects running fully on-device: local AI agents on QuietBox 2, a coding as…
Video Generation on Tenstorrent affiliated
Three lesson-projects covering on-device video synthesis: frame-by-frame diffusion with tt…
tensix-viz affiliated
Hardware topology visualizer for Tenstorrent chips — from individual chip to full cluster.…
Tenstorrent Cookbook: Particle Life Simulator affiliated
Particle Life simulation on Tenstorrent hardware — an emergent-behavior N-body system wher…
CS Fundamentals on Tenstorrent Hardware affiliated
Seven-module computer science curriculum taught on real Tenstorrent hardware. Covers RISC-…
tt-lang-models affiliated 7⭐
A growing collection of models that use tt-lang for some or all of their implementation. R…
tt-qb-lights affiliated 2⭐
Sync your Tenstorrent Quietbox's RGB lighting to accelerator utilization status. Visual fe…
gemma4 affiliated 1⭐
Gemma 4 language model implemented in tt-lang (e4b variant) for direct execution on Tensto…
open-oasis affiliated 1⭐
tt-lang inference script for Oasis 500M — an interactive video world model running on Tens…
tt-model-runner affiliated 1⭐
Discover, load, and benchmark models with a GUI and TUI for tt-inference-server. Makes exp…
tt-claw affiliated
A Tenstorrent-powered claw machine that rewards players with real prizes. The QuietBox 2 r…
dflash affiliated
DFlash: Block Diffusion for Flash Speculative Decoding on Tenstorrent hardware using tt-la…
diamond affiliated
DIAMOND: Atari game-playing agent implemented on Tenstorrent hardware via tt-lang. Diffusi…
Engram affiliated
A Tenstorrent port of the DeepSeek Engram model using tt-lang. Brings DeepSeek's memory-ef…
Stable Diffusion XL on Tenstorrent affiliated
On-device image generation with Stable Diffusion XL running entirely on Tenstorrent hardwa…
tt-forge-compiletron affiliated
Compile more than 100 models on tt-forge in a display format suitable for demos. Comprehen…
Image Classification with TT-Forge affiliated
End-to-end image classification project using TT-Forge — compile and run a PyTorch classif…
tt-warp affiliated
Warp terminal plugin for Tenstorrent — integrates hardware status, model management, and d…
Tensix Grid Playground affiliated
Interactive browser-based visualizer of the Tenstorrent Tensix grid architecture. Explore …
Tenstorrent Cookbook: Conway's Game of Life affiliated
TT-Metalium implementation of Conway's Game of Life as a cookbook recipe. Each generation …
Custom Model Training on Tenstorrent affiliated
Eight-lesson series covering the full custom training workflow on TT hardware: dataset fun…
Tenstorrent Cookbook: Core Recipes affiliated
Three hands-on TT-Metalium kernel recipes: a Mandelbrot fractal explorer, real-time audio …
tt-bh-linux official 55⭐
Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux ke…
TT Console official
Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, i…
tt-metal official 1518⭐
TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK…
tt-buda official 314⭐
TT-BUDA: Tenstorrent's original Python compiler and runtime for AI workloads. Legacy stack…
tt-forge official 289⭐
Tenstorrent's MLIR-based compiler frontend. Enables running AI workloads from PyTorch, ONN…
tt-mlir official 280⭐
Tenstorrent MLIR compiler — the core compiler infrastructure shared by tt-forge and other …
riscv-ocelot official 255⭐
The Berkeley Out-of-Order Machine with V-EXT (RISC-V Vector Extension) support. Tenstorren…
ttsim official 122⭐
Fast full-system simulator of Tenstorrent Wormhole and Blackhole hardware. Runs TT-Metaliu…
whisper official 88⭐
RISC-V Instruction Set Simulator (ISS) used by Tenstorrent for processor verification. Pow…
tt-xla official 68⭐
PJRT device plugin for Tenstorrent hardware. Enables JAX, PyTorch/XLA, and other XLA-based…
RiESCUE official 66⭐
RISC-V Directed Test Framework and Compliance Suite. Comprehensive test infrastructure for…
tt-kmd official apt* 65⭐
Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenst…
tt-buda-demos official 64⭐
Repository of model demos using TT-Buda. The largest collection of pre-compiled model exam…
tt-forge-onnx official 64⭐
ONNX graph compiler for Tenstorrent hardware. Optimizes and transforms ONNX model graphs f…
tt-smi official pip 61⭐
Tenstorrent System Management Interface — monitor device telemetry, issue board-level rese…
tt-inference-server official 58⭐
Production-ready model serving for Tenstorrent hardware with OpenAI-compatible REST API. S…
ttnn-visualizer official pip 52⭐
Comprehensive tool for visualizing and analyzing model execution on Tenstorrent hardware. …
tt-llk official 52⭐
Tenstorrent Low-Level Kernels: the C++ library that directly programs the RISC-V cores ins…
Jun 5, 2025
tt-lang official pippip 51⭐
Python-based DSL that sits between TT-NN and TT-Metalium — expresses custom fused kernels …
TT-Studio official 48⭐
Web-based GUI for deploying and chatting with AI models on Tenstorrent hardware. Handles a…
WallaBMC official 46⭐
Lightweight BMC (Baseboard Management Controller) for STM32 and similar MCUs, with Web UI,…
tt-umd official 43⭐
User-mode driver for Tenstorrent hardware. The userspace layer that sits between the kerne…
tt-system-firmware official 39⭐
System firmware for Tenstorrent hardware. Low-level system initialization and control firm…
luwen official cargo 34⭐
Tenstorrent system interface library written in Rust. Low-level Rust bindings for communic…
tt-tvm official 31⭐
TVM for Tenstorrent ASICs. Brings the Apache TVM compiler stack to Tenstorrent hardware, e…
tensix-isa-simulator official 29⭐
ISA-level simulator for the Tensix compute engine. Simulates the matrix, vector, and scala…
tt-torch official 25⭐
Frontend integration for PyTorch with tt-mlir. Compile PyTorch models directly to Tenstorr…
tt-firmware official apt* 24⭐
Tenstorrent firmware repository. Board management and control firmware for Tenstorrent acc…
tt-installer official 23⭐
Install the complete Tenstorrent software stack with one command. Handles drivers, firmwar…
tt-exalens official pip 21⭐
Low-level hardware debugger for Tenstorrent devices. Inspect register state, memory conten…
tt-topology official pip 16⭐
Configure Ethernet routing on multi-card Tenstorrent systems. Flash NB cards to use specif…
tt-npe official 14⭐
Network-on-chip Performance Estimator for Tenstorrent Tensix-based devices. Model and esti…
tt-blacksmith official 13⭐
Optimized training recipes for a variety of ML models on Tenstorrent hardware, powered by …
tt-example-apps official 13⭐
End-to-end AI applications running on Tenstorrent AI accelerators. Complete application ex…
tt-flash official pip 13⭐
Tenstorrent firmware update utility. Flash new firmware onto Tenstorrent accelerator cards…
tt-vscode-toolkit official 7⭐
48 interactive lessons covering the full Tenstorrent developer path — from hardware detect…
Dec 18, 2025
tt-toplike official 2⭐
A vibrant htop-style visualizer for Tenstorrent hardware written in Rust. Real-time proces…
tt-local-generator official 1⭐
Generate infinite videos and images (and imaginative prompts to inspire them) on Tenstorre…
tt-animatediff official
Generates short, temporally coherent animated GIFs using the AnimateDiff model on Tenstorr…
🏷 Recent Releases
33 releases
ttsim official v1.8.3
2026-06-13T17:31:51Z
tt-inference-server official v0.16.0
2026-06-12T18:21:42Z
tt-smi official v5.3.0
2026-06-12T15:35:05Z
tt-system-firmware official v19.11.0
2026-06-11T14:56:55Z
tt-exalens official v0.3.23
2026-06-11T14:42:44Z
dstack community 0.20.24
2026-06-11T13:55:33Z
tt-animatediff official v0.6.0
2026-06-10T22:16:43Z
ttnn-visualizer official v0.89.0
2026-06-10T18:50:24Z
tensix-viz affiliated v1.1.0
2026-06-09T22:19:42Z
tt-local-generator official v0.7.4
2026-06-09T21:59:39Z
tt-vscode-toolkit official v0.0.465
2026-06-09T20:32:59Z
tt-kmd official ttkmd-2.9.0
2026-06-09T13:25:19Z
tt-metal official v0.72.0
2026-06-09T01:30:48Z
tt-toplike official v0.6.2
2026-06-08T19:42:59Z
tt-umd official v0.9.6
2026-06-03T10:59:12Z
tt-flash official v3.8.0
2026-06-01T18:04:27Z
BarraCUDA community v0.5.0
2026-05-29T04:30:28Z
tt-forge official 1.2.0
2026-05-28T09:59:56Z
tt-forge-onnx official 1.2.0
2026-05-28T09:57:37Z
tt-xla official 1.2.0
2026-05-28T09:50:59Z
ttas community v0.1.0
2026-05-28T07:08:35Z
TT-Studio official v2.6.0
2026-05-20T17:04:32Z
whisper official 1.861
2026-05-11T15:44:36Z
tt-sim community v1.0
2026-05-11T13:07:42Z
tt-bh-linux official v0.11
2026-04-13T15:10:59Z
luwen official v0.8.5
2026-03-30T21:03:56Z
tt-installer official v2.2.1
2026-03-16T18:54:29Z
tt-topology official v1.2.19
2026-02-26T21:14:41Z
tt-firmware official v19.6.0
2026-02-20T16:53:34Z
nvtop community 3.3.2
2026-02-08T17:57:16Z
RiESCUE official v1.7.0
2025-12-03T19:29:44Z
tt-torch official 0.4.0
2025-09-29T22:23:47Z
tt-buda official v0.19.3
2024-09-24T21:01:08Z

Select an entry to see details

nvtop

community★ featured
by Syllo · C · GPL-3.0 · 10739⭐ ·
nvtop preview

htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Intel, NVIDIA, Qualcomm — and Tenstorrent. Real-time utilization, memory, and process info in a terminal UI.

📦 Repo
LATEST 3.3.2 2026-02-08T17:57:16Z Release notes ↗
4 previous releases
3.3.1 2026-01-18T13:12:34Z
3.3.0 2026-01-16T13:28:09Z
3.2.0 2025-03-29T11:26:44Z
3.1.0 2024-02-23T15:04:44Z
See all releases on GitHub ↗
monitoring tui htop process-monitor terminal
wormhole blackhole

dstack

community★ featured
by dstackai · Python · MPL-2.0 · 2160⭐ ·
dstack preview

Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.

LATEST 0.20.24 2026-06-11T13:55:33Z Release notes ↗
4 previous releases
0.20.25rc1pre 2026-06-12T15:38:06Z
0.20.23 2026-06-04T10:20:34Z
0.20.22 2026-05-28T10:24:19Z
0.20.21 2026-05-21T12:43:40Z
See all releases on GitHub ↗
orchestration kubernetes cloud multi-vendor

BarraCUDA

community★ featured
by Zaneham · C · 1697⭐ ·

Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Compiles .cu files to run on AMD and Tenstorrent hardware without modification.

📦 Repo
LATEST v0.5.0 2026-05-29T04:30:28Z Release notes ↗
See all releases on GitHub ↗
cuda compiler cross-platform blackhole
blackhole

tt-tiny

community★ featured
by geohot · Python · 66⭐ ·

Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George Hotz's exploration of TT hardware programmability with pointed commentary on the architecture.

📦 Repo
blackhole low-level exploration
blackhole

tt-sim

community★ featured
by mesham · Python · 13⭐ ·

Community-built Tenstorrent architecture simulator written in Python. Runs without hardware — useful for researchers and developers exploring the Tensix architecture offline.

📦 Repo
LATEST v1.0 2026-05-11T13:07:42Z Release notes ↗
See all releases on GitHub ↗
simulator architecture no-hardware research

tt-iree

community★ featured
by swote-git · C++ · Apache-2.0 · 12⭐ ·

IREE (Intermediate Representation Execution Environment) ML compiler ported to Tenstorrent AI accelerators. Brings the IREE compiler ecosystem to TT hardware.

📦 Repo
iree compiler mlir inference
wormhole blackhole

triton-tenstorrent

community★ featured
by kernelize-ai · C++ · 11⭐ ·

OpenAI Triton compiler plugin for Tenstorrent hardware. Write Triton kernels and target Tensix cores — brings the Triton ML kernel ecosystem to TT devices.

📦 Repo
triton openai-triton compiler kernels
wormhole blackhole

bhx

community★ featured
by olofj · Rust · 4⭐ ·

Boot stock Linux cloud images on the SiFive X280 RISC-V cores inside Tenstorrent Blackhole AI accelerators. Per-card Rust daemon with virtio-mmio block/net/console and U-Boot/EFI support.

📦 Repo
# Changelog

Notable changes per release. Format loosely follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/);
this project does not yet promise SemVer compatibility on the RPC
wire format or library API surface (we're not 1.0).

## Unreleased

V2 virtio-dispatch redesign. The kick ring + completion ring + host-
side throttle that grew up around #184 are gone; in their place is a
per-(slot, queue) dirty bitmap in BRISC L1. The bitmap is level-
sensitive — guest QUEUE_NOTIFY storms coalesce into a single set
byte, so the dispatch path can't fall behind under any burst. Wire
incompatible with 0.9.0; `TENSIX_PROTOCOL_VERSION` bumped 4 → 5.

### Added

- **V2 dirty-bitmap dispatch** (`#187` / `#188` / `#189`). BRISC
  writes 1 to `CTRL_OFF_DIRTY[slot][queue]` on every guest
  QUEUE_NOTIFY; the daemon's `Dispatcher` clears the byte and
  dispatches each pass. Replaces V1's 2048-entry kick ring +
  daemon-side `consume_kick_ring_pass` consumer.
- **V2 processed-cursor table** at `CTRL_OFF_PROCESSED`. Daemon
  publishes `used.idx` after each successful dispatch so
  warm-resume reads cursors directly without re-probing guest
  DRAM.
- **`bhx_notify_events_total`, `bhx_dispatch_passes_total`,
  `bhx_dispatch_queues_drained`** Prometheus counters surface the
  new dispatch path. The burst regression test (`scripts/
  soak_virtio_burst.py`) asserts `dispatch_passes_total > 0` to
  confirm the workload reached the new path.
- **`scripts/soak_virtio_burst.py`** — multi-queue burst regression
  test. Sustains 16-job direct=1 fio randwrite + a tight
  `printf` loop to `/dev/console`, samples `/metrics` every 1 s,
  and verifies the daemon log contains zero
  `kick.*drop|rescue|throttle.*ENGAGE` matches.
- **`DaemonState.chip_reset_this_session`** flag — gates
  `maybe_opportunistic_reset_board` so 4-way parallel cold boots
  reset the chip exactly once, not once per L2CPU. Without this
  the second-and-later resets blip the chip while earlier-booted
  L2CPUs hold mmap pages, SIGBUSing their workers.
- **`Dispatcher` (was `KickPoller`)** with documented testability
  seam (`CtrlL1Access` trait); `drain_dirty_bitmap` is unit-tested
  against an in-memory L1 fake covering all five visit/clear
  semantics cases plus the address-formula pins.

### Changed

- **`KickPoller` → `Dispatcher`**, plus `kick_poller` → `dispatcher`
  field on `DaemonState`, `tensix-kick-poller` → `tensix-dispatcher`
  thread name, `[kick-poller]` → `[dispatcher]` log tag,
  `kicks_consumed` → `dispatches_total`,
  `last_kick_slot_queue` → `last_dispatch_slot_queue`. Pure
  rename; no behavior change. V1 vocabulary scrubbed throughout
  the codebase (firmware, daemon, scripts, docs).
- **`CTRL_SIZE` shrinks 36 KiB → 4 KiB**. V2 footprint is ~1.5 KiB;
  the rest is reserved for future fields.
- **Stats-page offsets repacked** — V1 `STATS_OFF_KICK_DROPS`,
  `STATS_OFF_COMPL_EVENTS`, `STATS_OFF_LAST_COMPL` retired with
  V1 (#190); deprecated PRECAP / BLINDCAP / POSTCAP slots dropp
blackhole risc-v linux boot virtio
blackhole

tt-bio

community★ featured
by moritztng · Python · MIT · Jan 31, 2026

Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-card and multi-card configurations — QuietBox (4×) and Galaxy (32×). Approaches physics-based FEP accuracy at 1000× the speed.

drug-discovery blackhole inference biology multi-card
blackhole quietbox galaxy

Programming Tenstorrent Processors

community★ featured
by · Apr 21, 2025

Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buffers, kernel synchronization, NoC routing, and where the footguns are. The honest guide to thinking in Tensix.

metalium programming-model tensix noc circular-buffers blog
wormhole blackhole

Tenstorrent SFPU Kernel Series — Jason Davies

community★ featured
by jasondavies · Nov 12, 2025

Sponsored series of deep technical articles on implementing optimal SFPU kernels for the Tenstorrent Wormhole and Blackhole vector units. Covers where, typecasting, 16/32-bit integer multiplication, cube root, and accurate sin/cos/tan — with cycle counts, assembly walkthroughs, and Blackhole vs Wormhole comparisons throughout.

sfpu assembly vector-unit cycle-counting wormhole blackhole optimization sponsored
wormhole blackhole

Tenstorrent Blackhole Architecture Guide

community★ featured
by · Feb 28, 2026

A 6,500-word community deep dive into the Blackhole p100a architecture: the tile model (Tensix, DRAM, SiFive x280 L2CPU, Ethernet, PCIe, NoC arc), firmware startup sequence, MOP micro-op processor, replay buffer, FPU/SFPU sync, and the anatomy of a kernel. From the author of blackhole-py.

blackhole architecture tensix noc sifive-x280 firmware mop sfpu deep-dive blog
blackhole

grayskull-attention

community
by moritztng · TeX · MIT · 38⭐ ·

FlashAttention-style attention kernel implemented entirely in on-chip SRAM on the Tenstorrent Grayskull chip using TT-Metalium. Pioneering work in low-level attention on TT hardware.

📦 Repo
attention grayskull metalium sram kernel
grayskull

tt-twitch

community
by geohot · C++ · 28⭐ ·

A Tenstorrent Grayskull kernel written live on Twitch by George Hotz. 120-core grid demonstration of live kernel programming.

📦 Repo
grayskull kernel live-coding demo
grayskull

koyeb/tenstorrent-examples

community
by koyeb · Dockerfile · 18⭐ ·

Example applications and deployment configurations for running AI workloads on Tenstorrent hardware via Koyeb's cloud platform.

cloud koyeb deployment examples

blackhole-py

community
by boopdotpng · Python · MIT · 14⭐ ·

Pure Python driver for Tenstorrent Blackhole cards providing direct low-level hardware access without going through the full TT-Metal stack.

📦 Repo
driver python blackhole low-level hardware-access
blackhole

tenstorrent-tiny-examples

community
by jaebaek · C++ · 14⭐ ·

Simple C++ kernel experiments on a GraySkull e75 chip. Hands-on examples for learning the TT-Metal programming model at the metal level.

📦 Repo
examples grayskull cpp learning
grayskull

ttnn-helloworld-cpp

community
by marty1885 · C++ · 14⭐ ·

Minimal working example of using Tenstorrent TTNN in C++. The simplest possible starting point for C++ developers targeting TT hardware with TTNN.

📦 Repo
c++ ttnn hello-world template
wormhole blackhole

TT-GoL

community
by JushBJJ · C++ · 12⭐ ·
TT-GoL preview

Conway's Game of Life implemented on Tenstorrent hardware using TT-Metal kernels.

📦 Repo
game-of-life demo kernels

ttMandelbrot

community
by marty1885 · C · 0BSD · 7⭐ ·

Mandelbrot Set fractal renderer running on Tenstorrent hardware. A classic demo showcasing parallel compute on Tensix cores.

📦 Repo
mandelbrot demo fractals parallel

TT-Metal Mini Template

community
by JushBJJ · C++ · 7⭐ ·

Minimal working CMake project template for starting a new TT-Metal project from scratch. Good starting point for community kernel development.

📦 Repo
template cmake starter boilerplate

tt-tutorial (HPC)

community
by RISCVtestbed · C++ · BSD-3-Clause · 7⭐ ·

Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Edinburgh/EPCC. Covers Wormhole from an HPC parallel-computing perspective.

📦 Repo
tutorial hpc epcc edinburgh wormhole
wormhole

ttPEAK

community
by TT-Bounty-Hunters · C++ · ISC · 6⭐ ·

clpeak-style peak-performance benchmark for Tenstorrent devices using TT-Metalium. Measures theoretical peak throughput across operations — useful for hardware characterization.

📦 Repo
benchmark performance clpeak metalium
wormhole blackhole

tenstorrent.nix

community
by RossComputerGuy · Nix · LGPL-2.1 · 6⭐ ·

Nix flake packaging the Tenstorrent software stack for NixOS and Nix users. Reproducible, declarative installation of TT drivers and tools.

📦 Repo
nix nixos packaging flake reproducible

current

community
by seansiddens · C++ · 5⭐ ·

High-level parallel programming framework for Tenstorrent accelerators, abstracting TT-Metal into a research-oriented programming model for parallel computation.

📦 Repo
framework parallel abstraction research
wormhole blackhole

ttVecAdd

community
by marty1885 · C++ · ISC · 5⭐ ·

Minimal vector-addition example on Tenstorrent devices using TT-Metalium. A clean hello-world for the TT-Metal kernel programming model in C++.

📦 Repo
vector-add example metalium hello-world

ttas

community
by Zaneham · C · Apache-2.0 · 4⭐ ·

ttas is a hacker-friendly assembler/disassembler for Tensix on Wormhole. It turns assembly into the exact 32-bit words the hardware runs, and turns binaries back into readable instructions using the same shared instruction table.

📦 Repo
LATEST v0.1.0 2026-05-28T07:08:35Z Release notes ↗
1 previous release
v0.0.1 2026-05-27T15:19:11Z
See all releases on GitHub ↗
assembler
wormhole

tt-tutorial (Korean)

community
by changh95 · Jupyter Notebook · 4⭐ ·

Comprehensive tutorials for the Tenstorrent software stack in Korean. Jupyter notebooks covering the full developer path from hardware setup to model inference.

📦 Repo
tutorial korean jupyter getting-started
wormhole

Collective Operations on Wormhole n150 (Sapienza University of Rome)

community

Master's thesis implementing and benchmarking five allreduce algorithms (Swing, Recursive Doubling, Bandwidth Optimal, Latency Optimal, Shared Memory) on the Wormhole n150. Bandwidth Optimal achieved best performance, approaching within 2× of theoretical optimal.

📦 Repo
allreduce collective-ops wormhole mpi bandwidth
wormhole

libtt-metal-cxx

community
by Knight-Ops · Rust · 2⭐ ·

Rust crate that exposes the TT-Metal host API through a C++ bridge via cxx.rs — covering device management, program/kernel creation (from source file or inline string), circular buffers, semaphores, runtime arguments, sharded buffers, and MeshDevice workflows, with hardware-backed integration tests.

📦 Repo
rust bindings cxx tt-metal ffi host-api
wormhole blackhole

gsplat_tt

community
by Kovelja009 · Python · 1⭐ ·

Port of Gaussian Splatting (3D scene reconstruction from 2D images) to Tenstorrent hardware.

📦 Repo
gaussian-splatting computer-vision 3d-reconstruction blackhole
blackhole

A Gentle Guide: Tenstorrent Card on Arch Linux with Metalium

community
by · Jul 7, 2024

Step-by-step guide to getting a Tenstorrent card running on Arch Linux with the full Metalium stack. Practical troubleshooting from someone who did it the hard way first.

arch-linux metalium installation blog getting-started
grayskull wormhole

Thoughts and Logs After Messing with Tenstorrent Grayskull

community
by · Jun 2, 2024

Honest field notes from getting a Grayskull card running and writing first Metalium kernels. Covers setup pitfalls, processor hangs, memory protection quirks, and what makes Metalium compelling despite early rough edges.

grayskull metalium getting-started blog honest-review
grayskull

Tenstorrent Architecture — W&M CSCI654 Advanced Computer Architecture

community
by · Oct 9, 2024

Lecture 20 from William & Mary's graduate Computer Architecture course. Frames Tenstorrent in the landscape between GPUs and TPUs, draws comparisons to Cerebras and SambaNova, then dives deep into the Wormhole chip and Tensix core: the 5 RISC-V core design, SFPU, NoC, and dataflow execution model.

lecture architecture wormhole tensix risc-v sfpu noc academia
wormhole

Attention in SRAM on Tenstorrent Grayskull

community
by · Jul 18, 2024

A fused kernel for the Grayskull architecture implementing Transformer self-attention entirely within SRAM. Combines matrix multiply, attention score scaling, and Softmax without DRAM accesses, achieving significant speedups over non-fused implementations.

attention transformer sram grayskull kernel risc-v
grayskull

Exploring Fast Fourier Transforms on the Tenstorrent Wormhole

community
by · Jun 18, 2025

Ports the Cooley-Tukey FFT algorithm to the Wormhole n300 RISC-V accelerator. The Wormhole draws 8× less power and consumes 2.8× less energy than a 24-core Xeon Platinum for a 2D FFT. ISC 2025.

fft wormhole hpc risc-v energy-efficiency epcc
wormhole

Assessing Tenstorrent Grayskull RISC-V MatMul Acceleration for LLMs

community
by · May 9, 2025

Evaluates the Tenstorrent Grayskull e75 RISC-V accelerator for matrix multiplication at reduced numerical precision (BFP8 and LoFi), a fundamental kernel in LLM inference computation.

matmul grayskull risc-v bfp8 lofi llm precision
grayskull

Porting Strategies for Gravitational N-Body Simulations on Tenstorrent Wormhole

community
by · May 4, 2026

Evaluates three strategies for scaling an N-body code across multiple Tenstorrent Wormhole accelerators. Builds on the established performance of single-card N-body work to explore parallelism via the on-chip NoC and multi-accelerator configurations.

n-body astrophysics hpc wormhole risc-v multi-accelerator simulation
wormhole

Accelerating Gravitational N-Body Simulations on Tenstorrent Wormhole

community
Nov 16, 2025

Accelerates an astrophysical N-body simulation on the Wormhole n300. Achieves 2× speedup and 2× energy savings over a highly optimized CPU implementation. SC '25 Workshop.

n-body astrophysics hpc wormhole risc-v simulation
wormhole

Numerical Kernels on a Spatial Accelerator: Tenstorrent Wormhole

community
Mar 24, 2026

Implements three numerical kernels and composes them into a conjugate gradient solver on Wormhole. Demonstrates AI accelerators merit consideration for HPC workloads traditionally dominated by CPUs and GPUs. 2026.

numerical-methods hpc conjugate-gradient wormhole sparse
wormhole

Accelerating Stencils on the Tenstorrent Grayskull RISC-V Accelerator

community
Sep 27, 2024

Explores stencil computation on the Grayskull PCIe RISC-V accelerator. Early academic work examining TT hardware for HPC stencil workloads. 2024.

stencil hpc grayskull risc-v
grayskull

Stencil Computations on Tenstorrent Wormhole

community
May 8, 2026

Maps 2D 5-point stencil computations onto the Tenstorrent Wormhole RISC-V AI dataflow accelerator via two implementations: element-wise decomposition (Axpy) and matrix-multiplication reformulation (MatMul). Profiling shows the isolated Wormhole kernel is competitive with CPU execution, with PCIe transfers and initialization driving end-to-end overhead; Axpy achieves lower energy than the CPU baseline at large scales. Identifies architectural and software directions for making AI accelerators viable for HPC stencil workloads. 2025.

stencil hpc wormhole risc-v energy-efficiency benchmarks dataflow
wormhole

SwiftNPU: Scalable Shape-Flexible Allocation for Inter-Core Connected NPUs

community
Apr 27, 2026

Makes multi-tenant NPU sharing practical for Blackhole-class hardware using polynomial-time allocation algorithms. Delivers up to 1.37× higher utilization and 1.14× faster workload completion. Up to 890,000× faster than NP-hard baselines.

multi-tenant allocation blackhole npu scheduling
blackhole

TileLoom: Automatic Dataflow Planning for Spatial Dataflow Accelerators

community
by · Dec 17, 2025

Compiler system that automatically generates efficient dataflow plans for tile-based languages on spatial accelerators including Tenstorrent Wormhole. Exploits on-chip network forwarding between processing elements to reduce DRAM pressure.

compiler dataflow spatial-accelerator tile-based on-chip-network wormhole
wormhole

Rewriting TTS Inference Economics: Lightning V2 on Tenstorrent vs. NVIDIA L40S

community
by · Mar 24, 2026

Shows that Text-to-Speech inference on Tenstorrent Lightning V2 achieves 4× lower cost than NVIDIA L40S. Applies BlockFloat8 (BFP8) and low-fidelity (LoFi) precision strategies to TTS despite their greater numerical fragility compared to LLMs.

tts text-to-speech inference bfp8 lofi cost-efficiency precision
wormhole

tt-zork-and-more

affiliated ⑂ historicalsource/zork1★ featured
by tsingletaryTT · Python · 2⭐ ·
tt-zork-and-more preview

A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at least four different ways on TT hardware. The most fun you can have with an AI accelerator.

zork z-machine interactive-fiction demo fun

Local AI Agents on Tenstorrent

affiliated★ featured
by ·

Three agentic projects running fully on-device: local AI agents on QuietBox 2, a coding assistant powered by Aider against a local inference server, and the OpenClaw AI assistant on QuietBox 2. No cloud APIs — all inference runs on TT hardware.

agents local-llm aider coding-assistant quietbox on-device
wormhole blackhole quietbox

Video Generation on Tenstorrent

affiliated★ featured
by ·

Three lesson-projects covering on-device video synthesis: frame-by-frame diffusion with tt-local-generator, native AnimateDiff video animation, and video generation on QuietBox 2. All run entirely on TT hardware with no cloud dependency.

video-generation diffusion animatediff tt-local-generator quietbox on-device
wormhole blackhole quietbox

tensix-viz

affiliated★ featured
by tsingletaryTT · JavaScript ·
tensix-viz preview

Hardware topology visualizer for Tenstorrent chips — from individual chip to full cluster. Interactive JavaScript visualization of Tensix core layout and NoC connections.

LATEST v1.1.0 2026-06-09T22:19:42Z Release notes ↗
See all releases on GitHub ↗
# Changelog

All notable changes to tensix-viz are documented here.

## [1.1.0] - 2026-06-09

### Fixed

- **Heatmap: non-tensix cells no longer painted by heat overlay** (`src/chip.js` `_drawHeatmap`)
  Commit 76dca80 added `coreType !== 'tensix'` guards to the pre-built artifacts but never to
  the source. The guards are now in `src/chip.js` so the next build preserves them. Without this
  fix, DRAM (col 5 on Wormhole), ETH (row 6 on Wormhole), and PCIe (col 8 on Blackhole) cells
  were colored by the heatmap overlay and could inflate `maxVal`, compressing the visible range
  for all tensix cells.

- **Memory overlay: stale phase not rendered after `reset()` on `showMemory: true` instances**
  (`src/chip.js` `reset()` and constructor)
  After calling `viz.activate(mode)` followed by `viz.reset()` on a canvas created with
  `showMemory: true`, `_memPhase` retained the frozen `_mem` object from the animation closure.
  `reset()` calls `render()` at the end, which caused `_drawMemoryLayer()` to run with stale data,
  producing a faint DRAM glow and L1 fill bars on an otherwise blank chip. `reset()` now sets
  `this._memPhase = null`; the field is also explicitly initialized to `null` in the constructor.

- **Canvas context: `getContext('2d')` moved to after canvas sizing**
  (`src/chip.js` constructor)
  The 2D context was obtained before `canvas.width`/`canvas.height` were assigned. Assigning to
  `canvas.width` resets all context state per spec, making the early `getContext` call redundant
  and inconsistent with the intent. `this.ctx` is now assigned after the sizing block so the
  obtained context reflects the final dimensions.

### Added

- **Responsive canvas sizing** (`src/chip.js` constructor)
  If `canvas.parentElement` exists and `clientWidth` is smaller than the canvas's intrinsic
  `width` attribute, logical dimensions are capped to the container width and height is scaled
  proportionally. Applies at construction time; re-create the instance for later resizes.

- **Float label boundary clamping** (overridden `render()`)
  The floating tooltip label is now clamped so its pill box never overflows any canvas edge.
  `rawCx`/`rawCy` are constrained by `Math.max(w/2+margin, Math.min(logicalW-w/2-margin, raw*))`.

## [1.0.0] - 2026-05-18

Initial public release.
visualization topology noc hardware
wormhole blackhole
Blackhole · P100 / P150 / P300c · 140 Tensix cores
Wormhole · N150 / N300 · 64 Tensix cores
mode

Tenstorrent Cookbook: Particle Life Simulator

affiliated★ featured
by ·

Particle Life simulation on Tenstorrent hardware — an emergent-behavior N-body system where simple attraction/repulsion rules between species produce complex lifelike patterns. Cookbook recipe demonstrating parallel N-body compute on Tensix.

particle-life n-body simulation emergent cookbook demo
wormhole blackhole

CS Fundamentals on Tenstorrent Hardware

affiliated★ featured
by ·

Seven-module computer science curriculum taught on real Tenstorrent hardware. Covers RISC-V architecture, memory hierarchy, parallel computing, networks and NoC, synchronization, abstraction layers, and computational complexity — all grounded in what is physically happening on the chip.

computer-science curriculum risc-v parallelism memory noc education
wormhole blackhole

tt-lang-models

affiliated
by zoecarver · Python · 7⭐ ·
tt-lang-models preview

A growing collection of models that use tt-lang for some or all of their implementation. Reference implementations for bringing modern models to the tt-lang DSL.

📦 Repo
tt-lang models dsl reference

tt-qb-lights

affiliated
by tsingletaryTT · Rust · 2⭐ ·

Sync your Tenstorrent Quietbox's RGB lighting to accelerator utilization status. Visual feedback for hardware activity in real time.

📦 Repo
quietbox rgb hardware fun
quietbox

gemma4

affiliated
by zoecarver · Python · 1⭐ ·

Gemma 4 language model implemented in tt-lang (e4b variant) for direct execution on Tenstorrent hardware.

📦 Repo
gemma llm tt-lang inference
blackhole

open-oasis

affiliated ⑂ etched-ai/open-oasis
by zoecarver · Python · 1⭐ ·

tt-lang inference script for Oasis 500M — an interactive video world model running on Tenstorrent hardware via the tt-lang DSL.

📦 Repo
video world-model oasis tt-lang inference
blackhole

tt-model-runner

affiliated
by tsingletaryTT · Python · 1⭐ ·

Discover, load, and benchmark models with a GUI and TUI for tt-inference-server. Makes exploring available models on Tenstorrent hardware as easy as browsing a catalog.

📦 Repo
gui tui models inference benchmark
wormhole blackhole quietbox

tt-claw

affiliated
by tsingletaryTT · Shell ·

A Tenstorrent-powered claw machine that rewards players with real prizes. The QuietBox 2 runs local AI inference to act as an agent controlling the claw hardware — the OpenClaw AI assistant lesson builds directly on this project.

claw-machine agents hardware quietbox physical on-device
quietbox

dflash

affiliated ⑂ z-lab/dflash
by zoecarver · Python ·

DFlash: Block Diffusion for Flash Speculative Decoding on Tenstorrent hardware using tt-lang. Combines block diffusion with speculative decoding for faster inference.

speculative-decoding diffusion tt-lang inference

diamond

affiliated ⑂ eloialonso/diamond
by zoecarver · Python ·
diamond preview

DIAMOND: Atari game-playing agent implemented on Tenstorrent hardware via tt-lang. Diffusion-based world model for reinforcement learning.

atari reinforcement-learning world-model tt-lang

Engram

affiliated ⑂ deepseek-ai/Engram
by zoecarver · Python ·

A Tenstorrent port of the DeepSeek Engram model using tt-lang. Brings DeepSeek's memory-efficient architecture to TT hardware.

📦 Repo
deepseek engram tt-lang inference
blackhole

Stable Diffusion XL on Tenstorrent

affiliated
by ·

On-device image generation with Stable Diffusion XL running entirely on Tenstorrent hardware. Full inference pipeline with no cloud dependency.

stable-diffusion sdxl image-generation diffusion on-device
wormhole blackhole

tt-forge-compiletron

affiliated
by tsingletaryTT · Python ·
tt-forge-compiletron preview

Compile more than 100 models on tt-forge in a display format suitable for demos. Comprehensive showcase of tt-forge model compatibility.

📦 Repo
tt-forge models demo compilation

Image Classification with TT-Forge

affiliated
by ·

End-to-end image classification project using TT-Forge — compile and run a PyTorch classification model on Tenstorrent hardware with no kernel authoring required.

forge image-classification pytorch compiler inference
wormhole blackhole

tt-warp

affiliated
by tsingletaryTT · Python ·

Warp terminal plugin for Tenstorrent — integrates hardware status, model management, and developer workflows directly into the Warp terminal.

📦 Repo
warp terminal plugin developer-experience

Tensix Grid Playground

affiliated
by ·

Interactive browser-based visualizer of the Tenstorrent Tensix grid architecture. Explore the NoC, core layout, and dataflow patterns without hardware — a great companion for learning kernel programming.

visualization interactive noc tensix browser architecture

Tenstorrent Cookbook: Conway's Game of Life

affiliated
by ·

TT-Metalium implementation of Conway's Game of Life as a cookbook recipe. Each generation is a full parallel kernel dispatch over the grid — a clean introduction to stateful compute on Tensix cores.

game-of-life demo cookbook parallel metalium
wormhole blackhole

Custom Model Training on Tenstorrent

affiliated
by ·

Eight-lesson series covering the full custom training workflow on TT hardware: dataset fundamentals, configuration patterns, fine-tuning, multi-device distributed training, experiment tracking, model architecture basics, and training from scratch.

training fine-tuning multi-device distributed experiment-tracking curriculum
wormhole blackhole

Tenstorrent Cookbook: Core Recipes

affiliated
by ·

Three hands-on TT-Metalium kernel recipes: a Mandelbrot fractal explorer, real-time audio signal processing pipeline, and custom image filter stack. Each recipe is a complete kernel project with full source in the lesson.

cookbook mandelbrot audio image-processing metalium demo
wormhole blackhole

tt-bh-linux

official★ featured
C · GPL-2.0 · 55⭐ ·
tt-bh-linux preview

Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux kernel on the 16 high-performance RISC-V cores built into the Blackhole chip.

LATEST v0.11 2026-04-13T15:10:59Z Release notes ↗
4 previous releases
v0.10 2026-02-11T22:41:22Z
v0.9 2025-10-14T20:56:23Z
v0.5 2025-10-01T15:40:57Z
v0.4 2025-08-09T18:05:10Z
See all releases on GitHub ↗
linux risc-v blackhole bare-metal boot
blackhole

TT Console

official★ featured

Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, image and video generation, and browse the supported model catalog in-browser — backed by Tenstorrent accelerators. Cloud hardware access and advanced workflows (deployments, agents) available in staged rollout.

cloud console inference playground llm image-generation video-generation demo
wormhole blackhole

tt-metal

official
C++ · Apache-2.0 · 1518⭐ ·
tt-metal preview

TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK for developing on Tenstorrent hardware — from high-level tensor ops to bare-metal RISC-V kernels.

metalium ttnn sdk kernels core
grayskull wormhole blackhole ttsim

tt-buda

official
Python · Apache-2.0 · 314⭐ ·

TT-BUDA: Tenstorrent's original Python compiler and runtime for AI workloads. Legacy stack — tt-forge is the recommended successor, but tt-buda has the largest model demo library.

📦 Repo
legacy compiler pytorch buda
grayskull wormhole

tt-forge

official
Python · Apache-2.0 · 289⭐ ·
tt-forge preview

Tenstorrent's MLIR-based compiler frontend. Enables running AI workloads from PyTorch, ONNX, and other frameworks on all Tenstorrent hardware configurations through an open-source, general, and performant compiler.

LATEST 1.2.0 2026-05-28T09:59:56Z Release notes ↗
5 previous releases
1.3.0.dev20260615003539 2026-06-15T01:20:39Z
1.3.0.dev20260614003409 2026-06-14T01:53:07Z
1.3.0.dev20260613003624 2026-06-13T01:28:10Z
1.3.0.dev20260609002802 2026-06-09T01:16:05Z
1.3.0.dev20260607003211 2026-06-07T01:27:37Z
See all releases on GitHub ↗
mlir compiler pytorch onnx frontend
wormhole blackhole ttsim

tt-mlir

official
C++ · Apache-2.0 · 280⭐ ·

Tenstorrent MLIR compiler — the core compiler infrastructure shared by tt-forge and other frontends. Handles graph optimization, lowering, and code generation for Tensix hardware.

5 releases
0.9.0.dev20260221pre 2026-02-21T04:31:50Z
0.9.0.dev20260220pre 2026-02-20T04:34:35Z
0.9.0.dev20260219pre 2026-02-19T04:37:24Z
0.9.0.dev20260218pre 2026-02-18T04:38:21Z
0.9.0.dev20260217pre 2026-02-17T04:37:09Z
See all releases on GitHub ↗
mlir compiler backend optimization
wormhole blackhole

riscv-ocelot

official ⑂ riscv-boom/riscv-boom
SystemVerilog · Apache-2.0 · 255⭐ ·
riscv-ocelot preview

The Berkeley Out-of-Order Machine with V-EXT (RISC-V Vector Extension) support. Tenstorrent's research-grade out-of-order RISC-V core with vector extension.

📦 Repo
risc-v out-of-order vector-extension processor-design

ttsim

official
C++ · Apache-2.0 · 122⭐ ·

Fast full-system simulator of Tenstorrent Wormhole and Blackhole hardware. Runs TT-Metalium workloads on any Linux/x86_64 system without physical silicon. Bit-exact results relative to hardware.

LATEST v1.8.3 2026-06-13T17:31:51Z Release notes ↗
4 previous releases
v1.8.2 2026-06-11T20:20:19Z
v1.8.1 2026-06-10T17:33:34Z
v1.8.0 2026-06-09T17:23:20Z
v1.7.3 2026-06-05T22:44:41Z
See all releases on GitHub ↗
simulator no-hardware bit-exact wormhole blackhole
ttsim

whisper

official ⑂ chipsalliance/VeeR-ISS
C++ · Apache-2.0 · 88⭐ ·

RISC-V Instruction Set Simulator (ISS) used by Tenstorrent for processor verification. Powers the co-simulation architecture checker.

📦 Repo
LATEST 1.861 2026-05-11T15:44:36Z Release notes ↗
See all releases on GitHub ↗
risc-v iss simulator verification

tt-xla

official
Python · Apache-2.0 · 68⭐ ·

PJRT device plugin for Tenstorrent hardware. Enables JAX, PyTorch/XLA, and other XLA-based frameworks to target TT accelerators.

LATEST 1.2.0 2026-05-28T09:50:59Z Release notes ↗
5 previous releases
1.3.0.dev20260615003539 2026-06-15T01:13:05Z
1.3.0.dev20260614003409 2026-06-14T01:44:58Z
1.3.0.dev20260613003624 2026-06-13T01:16:13Z
1.3.0.dev20260612003634 2026-06-12T01:14:41Z
1.3.0.dev20260611003559 2026-06-11T01:13:24Z
See all releases on GitHub ↗
xla pjrt jax pytorch
wormhole blackhole

RiESCUE

official
Python · Apache-2.0 · 66⭐ ·

RISC-V Directed Test Framework and Compliance Suite. Comprehensive test infrastructure for verifying RISC-V processor implementations against the specification.

LATEST v1.7.0 2025-12-03T19:29:44Z Release notes ↗
4 previous releases
v1.5.0 2025-11-17T21:58:14Z
v1.3.0 2025-11-06T20:12:13Z
v1.1.2 2025-10-16T17:21:43Z
v0.2.5 2025-07-10T00:59:12Z
See all releases on GitHub ↗
risc-v testing compliance verification

tt-kmd

official
C · GPL-2.0 · 65⭐ ·

Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenstorrent PCIe accelerator cards.

📦 Repo
LATEST ttkmd-2.9.0 2026-06-09T13:25:19Z Release notes ↗
4 previous releases
ttkmd-2.9.99-testingpre 2026-06-12T17:39:45Z
ttkmd-2.9.0-rc1pre 2026-05-26T19:10:10Z
ttkmd-2.8.0 2026-04-06T18:58:39Z
ttkmd-2.8.0-rc1pre 2026-04-04T01:30:29Z
See all releases on GitHub ↗
kernel-module driver linux pcie
grayskull wormhole blackhole

tt-buda-demos

official
Python · Apache-2.0 · 64⭐ ·

Repository of model demos using TT-Buda. The largest collection of pre-compiled model examples for Tenstorrent hardware — BERT, ResNet, YOLO, GPT-2, Whisper, and many more.

📦 Repo
demos models bert resnet yolo gpt2
grayskull wormhole

tt-forge-onnx

official
Python · Apache-2.0 · 64⭐ ·
tt-forge-onnx preview

ONNX graph compiler for Tenstorrent hardware. Optimizes and transforms ONNX model graphs for efficient execution on Tensix accelerators. Used as a backend by tt-forge for ONNX model ingestion.

📦 Repo
LATEST 1.2.0 2026-05-28T09:57:37Z Release notes ↗
5 previous releases
1.3.0.dev20260615011951 2026-06-15T01:41:50Z
1.3.0.dev20260614012704 2026-06-14T01:45:56Z
1.3.0.dev20260613011732 2026-06-13T01:37:06Z
1.3.0.dev20260612012108 2026-06-12T01:42:56Z
1.3.0.dev20260611014638 2026-06-11T02:07:54Z
See all releases on GitHub ↗
onnx compiler graph-optimization mlir
wormhole blackhole

tt-smi

official
Python · Apache-2.0 · 61⭐ ·
tt-smi preview

Tenstorrent System Management Interface — monitor device telemetry, issue board-level resets, and inspect hardware health. The nvidia-smi equivalent for Tenstorrent hardware.

📦 Repo
LATEST v5.3.0 2026-06-12T15:35:05Z Release notes ↗
4 previous releases
v5.2.0 2026-05-14T17:26:26Z
v5.1.1 2026-05-12T22:18:05Z
v5.1.0 2026-05-11T16:23:13Z
v5.0.1 2026-04-24T11:39:48Z
See all releases on GitHub ↗
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 3.0.26 - 29/07/25
- Added single tray galaxy reset option
- Bumped luwen from 0.7.5 -> 0.7.10
  - Chip detect now doesn't wait for eth to train for the 6U galaxy's, allowing multi tray resets to happen independently
- Updated readme with the new reset option

## 3.0.25 - 29/07/25
- Added packaging

## 3.0.24 - 04/07/25
- Now users have 2 galay reset modes available
  - glx_reset: resets the galaxy, informs users if there has been an eth failure
  - glx_reset_auto: resets the galaxy upto 3 times if eth failures are detected

## 3.0.23 - 03/07/25
- Bumped luwen 0.7.3 -> 0.7.5 to fix cargo lock compatibilty issue

## 3.0.22 - 02/07/25
- Bumped tt-tools-common 1.4.16 -> 1.4.17
- Bumped luwen 0.7.2 -> 0.7.3
- Bumped smi 3.0.21 -> 3.0.22

## 3.0.21 - 26/06/25

- Added option to not re-init chips after reset
- Updated galaxy 6u reset option from --ubb_reset to -glx_reset
- Removed the a3 arc message before doing a 6u reset, meaning we can reset even when chips are not pcie accessible
- Added eth link check and return failure if any of the eth links have a LINK_INACTIVE_FAIL_DUMMY_PACKET failure

## 3.0.20 - 04/06/25

- Chore - bumped tt-tools-common version to fix driver version check for compatability with tt-kmd 2.0.0

## 3.0.19 - 30/04/25

- Fixed an issue preventing the telemetry thread from being dispatched when the user clicked tab 2

## 3.0.18 - 22/05/25

- Added BH and WH UBB board type support
- Removed the dependency on tt-tools-common for this info

## 3.0.17 - 13/05/25

- Added proper telemetry heartbeat checks for Grayskull

## 3.0.16 - 12/05/25

- Used new ResetTypes from tools-common to simplify reset code
- Added a heartbeat spinner to the telemetry pane. We expect this spinner to update about twice per second. If the spinner is not moving, this indicates new telemetry is not being fetched.

## 3.0.15 - 24/04/25

- Patch for the ubb_reset to just discover local only post reset. Looks like eth port status 2 has been re-used to mean connected and pyluwen waits for it to clear, leading to eth timeout.

## 3.0.14 - 21/04/25

- Added wh ubb reset via command line `tt-smi --ubb_reset`. Intention is that this command line option will be removed and integrated into `tt-smi -r` after we update board detection with the correct external naming.
- Removed some unused imports and code - no functional changes

## 3.0.13 - 21/03/25

- Removed get\_sw\_versions

## 3.0.12 - 21/03/25

- Chore - bumped luwen version to include eth fw version check fix

## 3.0.11 - 13/03/25

- Chore - bumped luwen version to include enable chips with external connections but no routing

## 3.0.10 - 10/03/25

- Chore - bumped luwen version to include protoc lib detection check

## 3.0.9 - 07/03/25

- Chore - bumped luwen v
monitoring telemetry smi hardware-management
grayskull wormhole blackhole

tt-inference-server

official
Python · Apache-2.0 · 58⭐ ·

Production-ready model serving for Tenstorrent hardware with OpenAI-compatible REST API. Supports continuous batching, multiple models, and all TT hardware configurations.

LATEST v0.16.0 2026-06-12T18:21:42Z Release notes ↗
4 previous releases
v0.15.0 2026-05-29T15:55:11Z
v0.14.0 2026-05-15T22:34:02Z
v0.13.0 2026-04-24T20:21:26Z
v0.10.1 2026-04-08T09:58:17Z
See all releases on GitHub ↗
serving openai-compatible production rest-api
wormhole blackhole quietbox galaxy

ttnn-visualizer

official
TypeScript · Apache-2.0 · 52⭐ ·
ttnn-visualizer preview

Comprehensive tool for visualizing and analyzing model execution on Tenstorrent hardware. Interactive graphs, memory plots, tensor details, buffer overviews, operation flow graphs, and multi-instance support.

📦 Repo
LATEST v0.89.0 2026-06-10T18:50:24Z Release notes ↗
4 previous releases
v0.88.0 2026-06-03T20:23:29Z
v0.87.0 2026-05-27T17:30:12Z
v0.86.0 2026-05-20T18:34:19Z
v0.85.0 2026-05-13T20:31:18Z
See all releases on GitHub ↗
visualization profiling memory operations graphs
wormhole blackhole

tt-llk

official
C++ · Apache-2.0 · 52⭐ · Jun 5, 2025

Tenstorrent Low-Level Kernels: the C++ library that directly programs the RISC-V cores inside each Tensix compute engine. TRISC0 (unpack), TRISC1 (math/FPU/SFPU), and TRISC2 (pack) are all programmed through this layer — it is the interface between TT-Metal kernel code and bare silicon.

tensix risc-v llk trisc brisc ncrisc low-level compute-engine
grayskull wormhole blackhole

tt-lang

official
Python · Apache-2.0 · 51⭐ ·
tt-lang preview

Python-based DSL that sits between TT-NN and TT-Metalium — expresses custom fused kernels with progressive disclosure, compiling directly to Tensix. Ships an integrated functional simulator (no hardware needed), line-by-line performance metrics, and AI-agent-friendly tooling. Two packages: tt-lang (compiler + hardware, requires ttnn) and tt-lang-sim (simulator only, works on Linux/macOS without Tenstorrent hardware).

# Changelog

All notable changes to TT-Lang will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Version 1.1.1

### Compiler

- Fix for live-interval boundary computation (issue [#536](../../issues/536))
- Fix for all-zero results in FP32 reductions (issue # [#533](../../issues/533))
- Fix for inferred `pop` and `push` (issues [#536](../../issues/536), [#554](../../issues/554))
- Fix for write pointer tracking on pipe sender accross iterations (issue [#578](../../issues/578))
- Fix to report data type mismatch error
- Fix to report DFB over allocation error (issue [#511](../../issues/511))
- Support for pipenet predicates `is_src`, `is_dst` and `is_active` (issue [#541](../../issues/541))
- Support for `ttl.math.typecast`

### Simulator

- Support for inferred `pop`, `push` and `copy`'s transfer handle `wait`
- Support for pipenet predicates `is_src`, `is_dst` and `is_active`
- Support `all_gather`
- Support `bfloat8_b`
- Improved/actionable error messages
- Improved performance by simulating math in FP32

### Infrastructure

- TT-Lang installable with `pip install tt-lang` for full installation and `pip install tt-lang-sim` for simulator only
- [Matmul benchmarks](benchmarks/matmul/README.md)

## Version 1.0.0

### Compiler

- Support `+=` syntax in conjunction with dot product (`@`) lowered to packer L1 accumulation
- Support implicit temporary compute-kernel-local DFBs
- Support `ttl.Pipenet`
- Support implicit `ttl.Block.push` and `ttl.Block.pop`
- Support implicit `ttl.Transfer.wait`
- Support for `expm1`, `exp2`, `ceil`, `sign`, `gelu`, `silu`, `hardsigmoid`, `square`, `softsign`, `signbit`, `frac`, `trunc` in `ttl.math`

### Simulator

- Support for `ttl.GroupTransfer`
- SPMD and mesh device simulation support
- Support for `ttnn.all_reduce` CCLs
- Use tracing to report statistics with `tt-lang-sim-stats`
- Remote L1 reads/writes statistics

### Examples and documentation
- Matmul tutorial

## Version 0.1.8

### Compiler

- Support for dot product operator (`@`) with lowering to [`ckernel::matmul_block`](https://docs.tenstorrent.com/tt-metal/v0.55.0/tt-metalium/tt_metal/apis/kernel_apis/compute/matmul_block.html)
- Support for fusing matmul and certain elementwise operations
- Support lowering to `pack_tile_block`
- Support for `ttl.math.fill`, `ttl.math.reduce_sum`, `ttl.math.reduce_max`, and `ttl.math.transpose`
- Support for arbitrary sub-blocking including dot product K-dimension to allow maximizing L1 usage and reuse
- Support for `sin`, `cos`, `tan`, `asin`, `acos`, `atan` in `ttl.math`
- Support for L1 sharded tensors
- Support for tensors with BF8 data type
- SPMD support (`ttnn.open_mesh_device`)

### Simulator

- Track L1 space and number of DFBs usage and warn when exceeded
- Support for tensors with row-major layout
- Support for L1 sharded tensors

### Examples and documentat
dsl python kernels tt-lang simulator kernel-fusion
wormhole blackhole ttsim

TT-Studio

official
TypeScript · Apache-2.0 · 48⭐ ·

Web-based GUI for deploying and chatting with AI models on Tenstorrent hardware. Handles all technical setup automatically — deploy models, run inference, and explore capabilities through a simple browser interface.

📦 Repo
LATEST v2.6.0 2026-05-20T17:04:32Z Release notes ↗
4 previous releases
v2.5.0 2026-04-20T17:03:48Z
v2.4.1 2026-03-24T15:09:57Z
v2.1.0 2025-10-04T01:33:59Z
v2.0.1 2025-07-21T19:53:40Z
See all releases on GitHub ↗
web-ui gui models chat deployment
wormhole blackhole quietbox

WallaBMC

official
C · Apache-2.0 · 46⭐ ·
WallaBMC preview

Lightweight BMC (Baseboard Management Controller) for STM32 and similar MCUs, with Web UI, Redfish API, and HTTPS support. Built on Zephyr RTOS. Used in Tenstorrent systems.

📦 Repo
bmc stm32 redfish zephyr embedded

tt-umd

official
C++ · Apache-2.0 · 43⭐ ·

User-mode driver for Tenstorrent hardware. The userspace layer that sits between the kernel module and higher-level SDKs.

📦 Repo
# Changelog

## [0.9.5] - 2026-05-12

### Changed

Hardware hang detection for NOC and PCIe.
Tracy profiler integration with instrumentation across TLB, PCIe and sysmem paths.
DeviceProtocol ported to TTDevice, including DMA migration.
SocDescriptor split into static (SocArchDescriptor) and runtime parts.
LITERAL coordinate system in CoreCoord.
Multicast to all TENSIX cores.
SMN support.
SWEmuleChip software emulation chip and Quasar simulation support (incl. 4GB TLB).
Unified UmdException/UMD_ASSERT/UMD_THROW error handling across the codebase.

## [0.9.4] - 2026-03-18

### Changed

TopologyDiscoveryOptions refactoring.
TopologyDiscoveryOption to retrain ETH links on 6u.
TLBs for TTsim.
DRAM retrain support.
DeviceProtocol changes.
Simulator in TTDevice changes.
ETH heartbeat check.

## [0.9.3] - 2026-02-24

### Changed

Sigbus safe read write API.
Remove 4U related code.
Implement BH SPI as well, so full SPI support.
P150 expects harvested cores.
TT_VISIBLE_DEVICES uses logical IDs.

## [0.9.2] - 2026-02-09

### Changed

SPI interface for Wormhole.
PCI BDF based sorting and filtering.
Multicast PCI DMA.
Support Blackhole loudbox.
Many code fixes and test enhancements.

## [0.9.1] - 2026-01-23

### Changed

Started publishing to pypi.

## [0.9.0] - 2026-01-23

### Changed

Warm reset notification and callback implementation.

## [0.8.6] - 2026-01-20

### Changed

Make predicting ETH FW from CMFW optional in TopologyDiscovery.

## [0.8.4] - 2026-01-16

### Changed

Use older manylinux image

## [0.8.3] - 2026-01-15

### Changed

Reverted remote discovery issue

## [0.8.2] - 2026-01-15

### Changed

Support warm reset without secondary bus reset.
Expose subsystem vendor id.

## [0.8.1] - 2026-01-15

### Changed

Support dma functions on TTDevice layer

## [0.8.0] - 2026-01-14

### Changed

Many functional fixes and minor changes.
Final fixes needed for integration into tt-smi.
Also contains adjustments needed for integration into exalens.

## [0.7.0] - 2025-11-29

### Changed

Changed to a more generic arc_msg API.

## [0.6.0] - 2025-11-24

### Changed

Change the usage of TLBs such that KMD is in control of TLB allocation instead of UMD.
TLBs are now allocated using KMD's dedicated API.

## [0.5.3] - 2025-11-14

### Changed

Added generation of .deb and .rpm packages.
Added three separate packages (runtime, development and python).

## [0.5.1] - 2025-11-12

### Changed

Manylinux builds and Pypi test publishing.
Many smaller fixes and improvements.

## [0.4.0] - 2025-10-18

### Changed

Removed old type names.

## [0.3.0] - 2025-10-17

### Changed

Many smaller fixes and improvements.
TTsim support improvements.
JTAG support improvement.
Fixing CMake install path.
Further work on integrating new KMD TLBs.

## [0.2.0] - 2025-09-15

### Changed

A couple of smaller fixes and improvements, including L2CPU harvesting, fixes for new FW. Better TTSim support. Further JTAG support.
Introduced new soft reset API.
Introduced lite fabric initial version.
user-mode-driver umd hardware-interface
grayskull wormhole blackhole

tt-system-firmware

official
C · Apache-2.0 · 39⭐ ·
tt-system-firmware preview

System firmware for Tenstorrent hardware. Low-level system initialization and control firmware that runs on-device.

firmware system embedded
wormhole blackhole

luwen

official
Rust · Apache-2.0 · 34⭐ ·

Tenstorrent system interface library written in Rust. Low-level Rust bindings for communicating with and managing TT hardware.

📦 Repo
LATEST v0.8.5 2026-03-30T21:03:56Z Release notes ↗
4 previous releases
v0.8.4 2026-03-26T19:34:59Z
v0.8.3 2026-03-26T16:02:34Z
v0.8.2 2026-03-23T18:58:20Z
v0.8.1 2025-12-17T21:16:21Z
See all releases on GitHub ↗
rust system-interface low-level bindings
grayskull wormhole blackhole

tt-tvm

official
Python · Apache-2.0 · 31⭐ ·

TVM for Tenstorrent ASICs. Brings the Apache TVM compiler stack to Tenstorrent hardware, enabling model compilation from TensorFlow, PyTorch, ONNX, and more.

📦 Repo
tvm compiler tensorflow onnx
grayskull wormhole blackhole

tensix-isa-simulator

official
C++ · Apache-2.0 · 29⭐ ·

ISA-level simulator for the Tensix compute engine. Simulates the matrix, vector, and scalar units inside each Tensix core.

📦 Repo
tensix isa simulator compute-engine
ttsim

tt-torch

official
Python · Apache-2.0 · 25⭐ ·

Frontend integration for PyTorch with tt-mlir. Compile PyTorch models directly to Tenstorrent hardware via torch.compile integration.

LATEST 0.4.0 2025-09-29T22:23:47Z Release notes ↗
5 previous releases
0.5.0.dev20251008pre 2025-10-08T05:36:07Z
0.5.0.dev20251007pre 2025-10-07T04:22:29Z
0.5.0.dev20251006pre 2025-10-06T04:21:23Z
0.5.0.dev20251005pre 2025-10-05T04:38:19Z
0.5.0.dev20251004pre 2025-10-04T04:22:15Z
See all releases on GitHub ↗
pytorch torch-compile frontend
wormhole blackhole

tt-firmware

official
Apache-2.0 · 24⭐ ·

Tenstorrent firmware repository. Board management and control firmware for Tenstorrent accelerator cards.

📦 Repo
LATEST v19.6.0 2026-02-20T16:53:34Z Release notes ↗
4 previous releases
v19.5.0 2026-02-04T18:22:15Z
v19.4.2 2026-01-05T23:32:14Z
v19.4.1 2025-12-19T17:06:37Z
v19.4.0 2025-12-16T05:38:23Z
See all releases on GitHub ↗
firmware bmc board-management
wormhole blackhole

tt-installer

official
Shell · Apache-2.0 · 23⭐ ·

Install the complete Tenstorrent software stack with one command. Handles drivers, firmware, Python environment, and SDK setup automatically.

LATEST v2.2.1 2026-03-16T18:54:29Z Release notes ↗
4 previous releases
v2.2.0 2026-03-10T19:52:29Z
v2.1.0 2026-01-14T19:34:46Z
v2.0.0 2025-12-05T20:38:41Z
v1.11.0 2025-12-02T20:02:43Z
See all releases on GitHub ↗
installation setup one-command getting-started
wormhole blackhole

tt-exalens

official
Python · Apache-2.0 · 21⭐ ·

Low-level hardware debugger for Tenstorrent devices. Inspect register state, memory contents, and kernel execution at the hardware level.

📦 Repo
debugger low-level hardware registers
wormhole blackhole

tt-topology

official
Python · Apache-2.0 · 16⭐ ·
tt-topology preview

Configure Ethernet routing on multi-card Tenstorrent systems. Flash NB cards to use specific ETH routing configurations for scale-out deployments.

📦 Repo
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 1.2.11 - 17/06/2025

### Updated

- Updated mesh coord generation to be connection type agnostic
- Added failure and exit if mesh type detected, but not enough connections
- Added warning in README about lack of supoort for BH and 6U boards

## 1.2.10 - 05/06/2025

### Updated

- Bumped tt-tools-common version to fix driver version check for compatability with tt-kmd 2.0.0

## 1.2.9 - 30/05/2025

### Updated

- Bug fix for https://github.com/tenstorrent/tt-topology/issues/39. Now the tool will use a DFS longest path to determine a linear layout if its not a fully connected graph.
- Updated initial device detection - now it needs full noc access for octopus and list options

## 1.2.8 - 08/05/2025

### Updated

- Fixed issue where tool would fail when PCI interfaces don't start from ID 0
- Now using actual PCI interface IDs from devices instead of assuming sequential numbering

## 1.2.7 - 07/05/2025

### Updated

- Use tools-common 1.4.15
- Use type checking in octopus reset

## 1.2.6 - 05/05/2025

### Updated

- Bug fix: added "ignore-eth" flag to first chip detect to avoid eth training loops forever and truly detect pcie only chips
- Chore: bumped luwen

## 1.2.5 - 15/04/2025

### Updated

- When flashing to isolated mode, we now flash the WH ethernet ports to a disabled state,
  in order to prevent their use.

## 1.2.4 - 02/04/2025

### Updated

- You can now run `tt-topology -l isolated` to flash cards to the default (non-connected) state
- Users are now warned about missing or loose cables

## 1.2.3 - 21/03/2025

### Fixed

- Bumped luwen (0.6.2 -> 0.6.3) to include eth version check bug for TG setup

## 1.2.2 - 13/03/2025

### Fixed

- Bumped luwen version to make it more robust against eth fw updates

## 1.2.1 - 13/03/2025

### Fixed

- Moved the spi reads after the reset to increase stability during M3 L2R copy
- Bumped luwen version

## 1.2.0 - 06/03/2025

### Fixed

- Updated how local eth board info is calculated to make it agnostic to eth fw version
- bumped tt-tools-common version
- Added traceback printing when catching exceptions in main.

## 1.1.5 - 14/05/2024

### Updated

- Bumped luwen (0.3.8) and tt_tools_common (1.4.3) lib versions
- Removed unused python libraries

## 1.1.4 - 25/03/2024

### Fixed
- Changed detect_chips with detect_chips_with_callback to enable detailed debug info.

## 1.1.3 - 22/03/2024

### Fixed
- Bumped tt-tools-common version to avoid pip discrepancy.

## 1.1.2 - 22/03/2024

### Fixed
- Fixed command line bug when no args are provided.

## 1.1.1 - 21/03/2024

### Fixed
- Fixed reference to pyluwen lib

## 1.1.0 - 12/03/2024

### Added
- Octopus Configuration (4 n150s connected to 1 galaxy)


## 1.0.2 - 12/03/2024

### Fixed
- Dependency bug with tt_tools
topology ethernet multi-card routing
wormhole blackhole

tt-npe

official
C++ · Apache-2.0 · 14⭐ ·
tt-npe preview

Network-on-chip Performance Estimator for Tenstorrent Tensix-based devices. Model and estimate NoC utilization before running kernels on hardware.

📦 Repo
noc performance estimator profiling
wormhole blackhole

tt-blacksmith

official
Python · Apache-2.0 · 13⭐ ·

Optimized training recipes for a variety of ML models on Tenstorrent hardware, powered by the TT-Forge compiler stack. Reference implementations for fine-tuning and training from scratch.

training fine-tuning recipes pytorch
wormhole blackhole

tt-example-apps

official
Jupyter Notebook · Apache-2.0 · 13⭐ ·

End-to-end AI applications running on Tenstorrent AI accelerators. Complete application examples from retrieval-augmented generation to image generation pipelines.

📦 Repo
rag applications end-to-end examples
wormhole blackhole

tt-flash

official
Python · Apache-2.0 · 13⭐ ·

Tenstorrent firmware update utility. Flash new firmware onto Tenstorrent accelerator cards from the command line.

📦 Repo
LATEST v3.8.0 2026-06-01T18:04:27Z Release notes ↗
4 previous releases
v3.7.0 2026-05-15T19:32:29Z
v3.6.5 2026-04-16T19:43:11Z
v3.6.4 2026-04-10T14:44:44Z
v3.6.3 2026-04-08T15:38:59Z
See all releases on GitHub ↗
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 3.4.0 - 30/07/25

- Bump pyyaml 6.0.1 -> 6.0.2
- Improve error message formatting
- No longer have to use --force for flashing BH cards

## 3.3.5 - 03/07/25

- Bump luwen 0.7.3 -> 0.7.5

## 3.3.4 - 02/07/25

- Bump tt-tools-common 1.4.16 -> 1.4.17
- Bump luwen 0.6.4 -> 0.7.3

## 3.3.3 - 05/06/2025

- Bumped tt-tools-common version to fix driver version check for compatability with tt-kmd 2.0.0

## 3.3.2 - 14/05/2025

- Bump tt-tools-common version to latest

## 3.2.0 - 12/03/2025

### Updated

- luwen version bump to bring inline with tt-smi; provides stability fixes

## 3.1.3 - 06/03/2025

### Added

- luwen version bump to include bh arc init checks

## 3.1.2 - 28/02/2025

### Added

- Support for more BH cards: p100a, p150, and p150c

## 3.1.1 - 06/01/2025

### Updated

- Bumped luwen version to accomodate Maturin updates

## 3.1.0 - 29/10/2024

### Added

- Support for flashing the BH tt-boot-fs file format
- Bumped luwen version to 0.4.6 to allow resets when chip is inaccessible

## 3.0.2 - 17/10/2024

### Fixed
- Unbound variable when exception is thrown when getting current fw-version

## 3.0.1 - 16/10/2024

### Changed
- Bumped luwen version to 0.4.5 to resolve false positives on bad chip detection

## 3.0.0 - 23/08/2024

- NO BREAKING CHANGES! Major version bump to signify new generation of product.
- Added support for p100

## 2.2.0 - 19/07/2024

### Updated
- Added support for an alternative spi flash configuration via a new version of luwen

## 2.0.8 - 14/05/2024

### Updated
- Bumped luwen (0.3.8) and tt_tools_common (1.4.3) lib versions

## 2.0.1 - 2.0.7
- Dependency updates

## 2.0.0
- WH flash release

## 1.0.0

- GS flash release
firmware-update flash utility
grayskull wormhole blackhole

tt-vscode-toolkit

official
TypeScript · Apache-2.0 · 7⭐ · Dec 18, 2025
tt-vscode-toolkit preview

48 interactive lessons covering the full Tenstorrent developer path — from hardware detection to custom training — with click-to-run commands and hardware auto-detection. Available in VSCode and code-server.

LATEST v0.0.465 2026-06-09T20:32:59Z Release notes ↗
4 previous releases
v0.0.454 2026-06-05T18:44:37Z
v0.0.453 2026-05-29T17:21:23Z
v0.0.447 2026-05-18T22:27:56Z
v0.0.438 2026-05-11T16:43:12Z
See all releases on GitHub ↗
# Changelog

All notable changes to the TT-VSCode-Toolkit will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---

## [0.0.503] - 2026-06-12
### Fixed
- **Self-review fixes** — 29 issues confirmed by automated adversarial review: corrected MESH_DEVICE enum values in code-fence comments (T3000→T3K, n150→N150, Galaxy→GALAXY throughout vllm-production, image-generation, bounty-program, version-compatibility, step-zero, README); removed `<sup>™/</sup>` HTML injected into yaml/bash code fences (ct3-configuration-patterns, step-zero); fixed api-server print string back to `tt-metal ready`; corrected `p100`→`P100` in hardware-detection and QB_follows prose; completed T3K→T3000 normalization in ttsim/cookbook-particle-life prose callouts; fixed FAQ duplicate stale QB2 paragraph and TTNN/tt-metal prose table cells; fixed bare `tt-metal` prose in image-generation (→ TT-Metalium); updated link display text in cookbook-overview and tt-inference-server (github URL→ product name); clarified Vale config comments (ProductNames.yml T3000 exception, Terminology.yml link-text caveat).

## [0.0.502] - 2026-06-11
### Fixed
- **QB2 → TT-QuietBox 2 in llms.txt** — the LLM context file (consumed by the content website) had 11 prose `QB2` references; all replaced with `TT-QuietBox 2`; URL slugs (`qb2-*`) left untouched.

## [0.0.501] - 2026-06-11
### Fixed
- **QB2 → TT-QuietBox 2 prose normalization** — replaced all `QB2` shorthand in prose with the full `TT-QuietBox 2` product name across `ttsim-twenty-and-ten.md`, `cookbook-particle-life.md`, and `FAQ.md`; lesson title slugs (`qb2-*`) and command IDs left untouched.

## [0.0.500] - 2026-06-11

### Changed

- **Version bump** — increment to 0.0.500 after merging copyedit branch with origin/main; consolidates copyedit normalization (hardware IDs, TT-Metalium™/TT-NN™ trademarks, TT-QuietBox naming) with main's ttsim, AnimateDiff Phase 2.5, and mobile improvements.

---

## [0.0.477] - 2026-05-27

### Changed

- **Prose copyedit pass** — fixed TT-Forge<sup>™</sup> trademark placement in `tt-xla-jax.md`; updated `STYLE_GUIDE.md` hardware casing rules (`n150`/`n300`/`T3000`/`p300c`, capitalized `Galaxy`); normalized hardware IDs and `TTNN`→`TT-NN` in prose and sample output; renamed `TT Metal`→`TT-Metalium` in `tt-inference-server.md`. Extended `normalize-hardware-copy.js` and `normalize-ttnn-copy.js`; added `normalize-tt-metal-copy.js`. Polished `STYLE_GUIDE.md` trademark examples; fixed `normalize-open-source-copy.js` to skip inline code; added `plans/vscode-toolkit-copyedit-pr.md` PR summary.

---

## [0.0.476] - 2026-05-27

### Changed

- **TT-Metalium<sup>™</sup> and TT-NN<sup>™</sup> trademarks** — first prose mention per page now uses `TT-Metalium` and `TT-NN` (trademark, not registered). Updated `scripts/add-tt-product-trademarks.js` and `STYLE_GUIDE.md`; migrated pri
vscode lessons interactive getting-started code-server
wormhole blackhole quietbox ttsim

tt-toplike

official
Rust · Apache-2.0 · 2⭐ ·
tt-toplike preview

A vibrant htop-style visualizer for Tenstorrent hardware written in Rust. Real-time process and utilization view for TT accelerators.

LATEST v0.6.2 2026-06-08T19:42:59Z Release notes ↗
4 previous releases
v0.6.1 2026-06-02T23:46:12Z
v0.6.0 2026-05-26T23:49:21Z
v0.5.0 2026-04-29T17:39:43Z
v0.4.3 2026-04-25T20:02:37Z
See all releases on GitHub ↗
monitoring htop rust real-time
wormhole blackhole

tt-local-generator

official
Python · Apache-2.0 · 1⭐ ·
tt-local-generator preview

Generate infinite videos and images (and imaginative prompts to inspire them) on Tenstorrent's Quietbox 2. Fully local generative media pipeline.

video-generation image-generation quietbox generative
quietbox

tt-animatediff

official
Python · Apache-2.0 ·
tt-animatediff preview

Generates short, temporally coherent animated GIFs using the AnimateDiff model on Tenstorrent hardware. Phase 1 runs the correct SD 1.4 + MotionAdapter architecture on CPU; Phase 2 accelerates spatial denoising on Blackhole using the TTNN UNet. Produces vibrant 8-frame animations in ~15 s/frame on a P300C.

LATEST v0.6.0 2026-06-10T22:16:43Z Release notes ↗
1 previous release
v0.1.0 2026-06-04T22:31:14Z
See all releases on GitHub ↗
animatediff video-generation stable-diffusion diffusion gif blackhole
blackhole