Community · Open Source · Tenstorrent Ecosystem

A hidden dimension of Tenstorrent awesomeness

A curated directory of projects, tools, models, and research for Tenstorrent hardware — contributed by the community and our team. Browse by category or search across all entries.

106 Projects
12 Categories
Open Source
🚀 Getting Started
The essential first steps — installer, core SDKs, and guided onboarding
tt-metal
TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK for devel…
🤖 AI & Models
Running, serving, and experimenting with AI models
tt-boltz
Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-card and mul…
🕵️ AI Agents
Agentic systems and AI assistants running on TT hardware
dstack
Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA, AMD, TPU…
⚙️ Custom Kernels & Low-Level
Metalium/tt-lang kernel authoring; anything sub-compiler
tt-tiny
Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George Hotz's ex…
🔨 Compilers & Frontends
Getting PyTorch/JAX/ONNX/CUDA models onto TT hardware
BarraCUDA
Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Compiles .cu f…
🛠 Dev Tools & Debugging
Profiling, visualization, and debugging workloads
nvtop
htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Intel, NVIDIA,…
🖥 Hardware & System
Drivers, firmware, monitoring, and hardware management
tt-kmd
Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenstorrent PCI…
☁️ Cloud & Orchestration
Kubernetes, cloud deployment, and multi-node infrastructure
TT Console
Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, image and v…
🔩 RISC-V & Architecture
ISA, simulation, and running Linux on TT silicon
tt-bh-linux
Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux kernel on th…
🔬 Research & Papers
Academic papers, theses, and HPC experiments
tt-tutorial (HPC)
Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Edinburgh/EP…
🎮 Games & Demos
Creative, playful, and proof-of-concept projects
tt-zork-and-more
A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at least four di…
📚 Guides, Tutorials & Education
Getting-started content, blog posts, lessons, courses
Programming Tenstorrent Processors
Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buffers, kerne…
🚀 Getting Started
nvtop community10722⭐
htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Inte…
dstack community2155⭐
Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA…
BarraCUDA community1695⭐
Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Comp…
tt-tiny community65⭐
Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George…
tt-sim community13⭐
Community-built Tenstorrent architecture simulator written in Python. Runs without hardwar…
tt-iree community12⭐
IREE (Intermediate Representation Execution Environment) ML compiler ported to Tenstorrent…
triton-tenstorrent community11⭐
OpenAI Triton compiler plugin for Tenstorrent hardware. Write Triton kernels and target Te…
bhx community4⭐
Boot stock Linux cloud images on the SiFive X280 RISC-V cores inside Tenstorrent Blackhole…
tt-boltz community
Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-ca…
· Jan 31, 2026
Programming Tenstorrent Processors community
Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buff…
· Apr 21, 2025
Tenstorrent SFPU Kernel Series — Jason Davies community
Sponsored series of deep technical articles on implementing optimal SFPU kernels for the T…
· Nov 12, 2025
Tenstorrent Blackhole Architecture Guide community
A 6,500-word community deep dive into the Blackhole p100a architecture: the tile model (Te…
· Feb 28, 2026
grayskull-attention community38⭐
FlashAttention-style attention kernel implemented entirely in on-chip SRAM on the Tenstorr…
tt-twitch community28⭐
A Tenstorrent Grayskull kernel written live on Twitch by George Hotz. 120-core grid demons…
koyeb/tenstorrent-examples community18⭐
Example applications and deployment configurations for running AI workloads on Tenstorrent…
blackhole-py community14⭐
Pure Python driver for Tenstorrent Blackhole cards providing direct low-level hardware acc…
tenstorrent-tiny-examples community14⭐
Simple C++ kernel experiments on a GraySkull e75 chip. Hands-on examples for learning the …
ttnn-helloworld-cpp community14⭐
Minimal working example of using Tenstorrent TTNN in C++. The simplest possible starting p…
TT-GoL community12⭐
Conway's Game of Life implemented on Tenstorrent hardware using TT-Metal kernels.
ttMandelbrot community7⭐
Mandelbrot Set fractal renderer running on Tenstorrent hardware. A classic demo showcasing…
TT-Metal Mini Template community7⭐
Minimal working CMake project template for starting a new TT-Metal project from scratch. G…
tt-tutorial (HPC) community7⭐
Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Ed…
ttPEAK community6⭐
clpeak-style peak-performance benchmark for Tenstorrent devices using TT-Metalium. Measure…
tenstorrent.nix community6⭐
Nix flake packaging the Tenstorrent software stack for NixOS and Nix users. Reproducible, …
current community5⭐
High-level parallel programming framework for Tenstorrent accelerators, abstracting TT-Met…
ttVecAdd community5⭐
Minimal vector-addition example on Tenstorrent devices using TT-Metalium. A clean hello-wo…
ttas community4⭐
ttas is a hacker-friendly assembler/disassembler for Tensix on Wormhole. It turns assembly…
tt-tutorial (Korean) community4⭐
Comprehensive tutorials for the Tenstorrent software stack in Korean. Jupyter notebooks co…
Collective Operations on Wormhole n150 (Sapienza University of Rome) community4⭐
Master's thesis implementing and benchmarking five allreduce algorithms (Swing, Recursive …
libtt-metal-cxx community2⭐
Rust crate that exposes the TT-Metal host API through a C++ bridge via cxx.rs — covering d…
gsplat_tt community
Port of Gaussian Splatting (3D scene reconstruction from 2D images) to Tenstorrent hardwar…
A Gentle Guide: Tenstorrent Card on Arch Linux with Metalium community
Step-by-step guide to getting a Tenstorrent card running on Arch Linux with the full Metal…
· Jul 7, 2024
Thoughts and Logs After Messing with Tenstorrent Grayskull community
Honest field notes from getting a Grayskull card running and writing first Metalium kernel…
· Jun 2, 2024
Tenstorrent Architecture — W&M CSCI654 Advanced Computer Architecture community
Lecture 20 from William & Mary's graduate Computer Architecture course. Frames Tenstorrent…
· Oct 9, 2024
Attention in SRAM on Tenstorrent Grayskull community
A fused kernel for the Grayskull architecture implementing Transformer self-attention enti…
· Jul 18, 2024
Exploring Fast Fourier Transforms on the Tenstorrent Wormhole community
Ports the Cooley-Tukey FFT algorithm to the Wormhole n300 RISC-V accelerator. The Wormhole…
· Jun 18, 2025
Assessing Tenstorrent Grayskull RISC-V MatMul Acceleration for LLMs community
Evaluates the Tenstorrent Grayskull e75 RISC-V accelerator for matrix multiplication at re…
· May 9, 2025
Porting Strategies for Gravitational N-Body Simulations on Tenstorrent Wormhole community
Evaluates three strategies for scaling an N-body code across multiple Tenstorrent Wormhole…
· May 4, 2026
Accelerating Gravitational N-Body Simulations on Tenstorrent Wormhole community
Accelerates an astrophysical N-body simulation on the Wormhole n300. Achieves 2× speedup a…
Nov 16, 2025
Numerical Kernels on a Spatial Accelerator: Tenstorrent Wormhole community
Implements three numerical kernels and composes them into a conjugate gradient solver on W…
Mar 24, 2026
Accelerating Stencils on the Tenstorrent Grayskull RISC-V Accelerator community
Explores stencil computation on the Grayskull PCIe RISC-V accelerator. Early academic work…
Sep 27, 2024
Stencil Computations on Tenstorrent Wormhole community
Maps 2D 5-point stencil computations onto the Tenstorrent Wormhole RISC-V AI dataflow acce…
May 8, 2026
SwiftNPU: Scalable Shape-Flexible Allocation for Inter-Core Connected NPUs community
Makes multi-tenant NPU sharing practical for Blackhole-class hardware using polynomial-tim…
Apr 27, 2026
TileLoom: Automatic Dataflow Planning for Spatial Dataflow Accelerators community
Compiler system that automatically generates efficient dataflow plans for tile-based langu…
· Dec 17, 2025
Rewriting TTS Inference Economics: Lightning V2 on Tenstorrent vs. NVIDIA L40S community
Shows that Text-to-Speech inference on Tenstorrent Lightning V2 achieves 4× lower cost tha…
· Mar 24, 2026
tt-zork-and-more affiliated2⭐
A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at lea…
Local AI Agents on Tenstorrent affiliated
Three agentic projects running fully on-device: local AI agents on QuietBox 2, a coding as…
Video Generation on Tenstorrent affiliated
Three lesson-projects covering on-device video synthesis: frame-by-frame diffusion with tt…
tensix-viz affiliated
Hardware topology visualizer for Tenstorrent chips — from individual chip to full cluster.…
Tenstorrent Cookbook: Particle Life Simulator affiliated
Particle Life simulation on Tenstorrent hardware — an emergent-behavior N-body system wher…
CS Fundamentals on Tenstorrent Hardware affiliated
Seven-module computer science curriculum taught on real Tenstorrent hardware. Covers RISC-…
tt-lang-models affiliated7⭐
A growing collection of models that use tt-lang for some or all of their implementation. R…
tt-qb-lights affiliated2⭐
Sync your Tenstorrent Quietbox's RGB lighting to accelerator utilization status. Visual fe…
gemma4 affiliated1⭐
Gemma 4 language model implemented in tt-lang (e4b variant) for direct execution on Tensto…
open-oasis affiliated1⭐
tt-lang inference script for Oasis 500M — an interactive video world model running on Tens…
tt-model-runner affiliated1⭐
Discover, load, and benchmark models with a GUI and TUI for tt-inference-server. Makes exp…
tt-claw affiliated
A Tenstorrent-powered claw machine that rewards players with real prizes. The QuietBox 2 r…
dflash affiliated
DFlash: Block Diffusion for Flash Speculative Decoding on Tenstorrent hardware using tt-la…
diamond affiliated
DIAMOND: Atari game-playing agent implemented on Tenstorrent hardware via tt-lang. Diffusi…
Engram affiliated
A Tenstorrent port of the DeepSeek Engram model using tt-lang. Brings DeepSeek's memory-ef…
Stable Diffusion XL on Tenstorrent affiliated
On-device image generation with Stable Diffusion XL running entirely on Tenstorrent hardwa…
tt-forge-compiletron affiliated
Compile more than 100 models on tt-forge in a display format suitable for demos. Comprehen…
Image Classification with TT-Forge affiliated
End-to-end image classification project using TT-Forge — compile and run a PyTorch classif…
tt-warp affiliated
Warp terminal plugin for Tenstorrent — integrates hardware status, model management, and d…
Tensix Grid Playground affiliated
Interactive browser-based visualizer of the Tenstorrent Tensix grid architecture. Explore …
Tenstorrent Cookbook: Conway's Game of Life affiliated
TT-Metalium implementation of Conway's Game of Life as a cookbook recipe. Each generation …
Custom Model Training on Tenstorrent affiliated
Eight-lesson series covering the full custom training workflow on TT hardware: dataset fun…
Tenstorrent Cookbook: Core Recipes affiliated
Three hands-on TT-Metalium kernel recipes: a Mandelbrot fractal explorer, real-time audio …
tt-bh-linux official53⭐
Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux ke…
TT Console official
Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, i…
tt-metal official1505⭐
TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK…
tt-buda official314⭐
TT-BUDA: Tenstorrent's original Python compiler and runtime for AI workloads. Legacy stack…
tt-mlir official278⭐
Tenstorrent MLIR compiler — the core compiler infrastructure shared by tt-forge and other …
tt-forge official268⭐
Tenstorrent's MLIR-based compiler frontend. Enables running AI workloads from PyTorch, ONN…
riscv-ocelot official253⭐
The Berkeley Out-of-Order Machine with V-EXT (RISC-V Vector Extension) support. Tenstorren…
ttsim official119⭐
Fast full-system simulator of Tenstorrent Wormhole and Blackhole hardware. Runs TT-Metaliu…
whisper official88⭐
RISC-V Instruction Set Simulator (ISS) used by Tenstorrent for processor verification. Pow…
tt-xla official67⭐
PJRT device plugin for Tenstorrent hardware. Enables JAX, PyTorch/XLA, and other XLA-based…
tt-kmd official65⭐
Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenst…
RiESCUE official65⭐
RISC-V Directed Test Framework and Compliance Suite. Comprehensive test infrastructure for…
tt-buda-demos official64⭐
Repository of model demos using TT-Buda. The largest collection of pre-compiled model exam…
tt-smi official61⭐
Tenstorrent System Management Interface — monitor device telemetry, issue board-level rese…
tt-inference-server official57⭐
Production-ready model serving for Tenstorrent hardware with OpenAI-compatible REST API. S…
ttnn-visualizer official52⭐
Comprehensive tool for visualizing and analyzing model execution on Tenstorrent hardware. …
tt-lang official51⭐
Python-based DSL that sits between TT-NN and TT-Metalium — expresses custom fused kernels …
tt-llk official51⭐
Tenstorrent Low-Level Kernels: the C++ library that directly programs the RISC-V cores ins…
Jun 5, 2025
TT-Studio official48⭐
Web-based GUI for deploying and chatting with AI models on Tenstorrent hardware. Handles a…
WallaBMC official46⭐
Lightweight BMC (Baseboard Management Controller) for STM32 and similar MCUs, with Web UI,…
tt-umd official42⭐
User-mode driver for Tenstorrent hardware. The userspace layer that sits between the kerne…
tt-system-firmware official39⭐
System firmware for Tenstorrent hardware. Low-level system initialization and control firm…
luwen official34⭐
Tenstorrent system interface library written in Rust. Low-level Rust bindings for communic…
tt-tvm official31⭐
TVM for Tenstorrent ASICs. Brings the Apache TVM compiler stack to Tenstorrent hardware, e…
tensix-isa-simulator official29⭐
ISA-level simulator for the Tensix compute engine. Simulates the matrix, vector, and scala…
tt-torch official25⭐
Frontend integration for PyTorch with tt-mlir. Compile PyTorch models directly to Tenstorr…
tt-firmware official24⭐
Tenstorrent firmware repository. Board management and control firmware for Tenstorrent acc…
tt-installer official23⭐
Install the complete Tenstorrent software stack with one command. Handles drivers, firmwar…
tt-exalens official21⭐
Low-level hardware debugger for Tenstorrent devices. Inspect register state, memory conten…
tt-topology official16⭐
Configure Ethernet routing on multi-card Tenstorrent systems. Flash NB cards to use specif…
tt-npe official14⭐
Network-on-chip Performance Estimator for Tenstorrent Tensix-based devices. Model and esti…
tt-blacksmith official13⭐
Optimized training recipes for a variety of ML models on Tenstorrent hardware, powered by …
tt-example-apps official13⭐
End-to-end AI applications running on Tenstorrent AI accelerators. Complete application ex…
tt-flash official13⭐
Tenstorrent firmware update utility. Flash new firmware onto Tenstorrent accelerator cards…
tt-vscode-toolkit official7⭐
48 interactive lessons covering the full Tenstorrent developer path — from hardware detect…
Dec 18, 2025
tt-toplike official2⭐
A vibrant htop-style visualizer for Tenstorrent hardware written in Rust. Real-time proces…
tt-local-generator official1⭐
Generate infinite videos and images (and imaginative prompts to inspire them) on Tenstorre…
tt-animatediff official
Generates short, temporally coherent animated GIFs using the AnimateDiff model on Tenstorr…
🏷 Recent Releases
28 releases
ttsim official v1.8.0
2026-06-09T17:23:20Z
tt-kmd official ttkmd-2.9.0
2026-06-09T13:25:19Z
tt-metal official v0.72.0
2026-06-09T01:30:48Z
tt-forge official 1.3.0.dev20260609002802
2026-06-09T01:16:05Z
tt-xla official 1.3.0.dev20260609002802
2026-06-09T01:07:31Z
tt-toplike official v0.6.2
2026-06-08T19:42:59Z
tt-vscode-toolkit official v0.0.454
2026-06-05T18:44:37Z
tt-animatediff official v0.1.0
2026-06-04T22:31:14Z
dstack community 0.20.23
2026-06-04T10:20:34Z
ttnn-visualizer official v0.88.0
2026-06-03T20:23:29Z
tt-flash official v3.8.0
2026-06-01T18:04:27Z
tt-system-firmware official v19.10.0
2026-06-01T13:21:59Z
tt-inference-server official v0.15.0
2026-05-29T15:55:11Z
BarraCUDA community v0.5.0
2026-05-29T04:30:28Z
ttas community v0.1.0
2026-05-28T07:08:35Z
tt-local-generator official v0.3.4
2026-05-26T23:52:15Z
TT-Studio official v2.6.0
2026-05-20T17:04:32Z
tt-smi official v5.2.0
2026-05-14T17:26:26Z
whisper official 1.861
2026-05-11T15:44:36Z
tt-sim community v1.0
2026-05-11T13:07:42Z
tt-bh-linux official v0.11
2026-04-13T15:10:59Z
luwen official v0.8.5
2026-03-30T21:03:56Z
tt-installer official v2.2.1
2026-03-16T18:54:29Z
tt-topology official v1.2.19
2026-02-26T21:14:41Z
tt-firmware official v19.6.0
2026-02-20T16:53:34Z
nvtop community 3.3.2
2026-02-08T17:57:16Z
RiESCUE official v1.7.0
2025-12-03T19:29:44Z
tt-buda official v0.19.3
2024-09-24T21:01:08Z

Select an entry to see details

nvtop

community★ featured
by Syllo · C · GPL-3.0 · 10722⭐ ·
nvtop preview

htop-style process monitor for GPUs and AI accelerators. Supports AMD, Apple, Huawei, Intel, NVIDIA, Qualcomm — and Tenstorrent. Real-time utilization, memory, and process info in a terminal UI.

📦 Repo
LATEST 3.3.2 2026-02-08T17:57:16Z Release notes ↗
3.3.1 2026-01-18T13:12:34Z
3.3.0 2026-01-16T13:28:09Z
3.2.0 2025-03-29T11:26:44Z
3.1.0 2024-02-23T15:04:44Z
See all releases on GitHub ↗
monitoring tui htop process-monitor terminal
wormhole blackhole

dstack

community★ featured
by dstackai · Python · MPL-2.0 · 2155⭐ ·
dstack preview

Vendor-agnostic orchestration for training, inference, and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.

LATEST 0.20.23 2026-06-04T10:20:34Z Release notes ↗
0.20.22 2026-05-28T10:24:19Z
0.20.21 2026-05-21T12:43:40Z
0.20.20 2026-05-15T11:45:23Z
0.20.19 2026-04-30T11:01:36Z
See all releases on GitHub ↗
orchestration kubernetes cloud multi-vendor

BarraCUDA

community★ featured
by Zaneham · C · 1695⭐ ·

Open-source CUDA compiler targeting multiple GPU architectures including Tenstorrent. Compiles .cu files to run on AMD and Tenstorrent hardware without modification.

📦 Repo
LATEST v0.5.0 2026-05-29T04:30:28Z Release notes ↗
See all releases on GitHub ↗
cuda compiler cross-platform blackhole
blackhole

tt-tiny

community★ featured
by geohot · Python · 65⭐ ·

Minimal Python code to access and program the Tenstorrent Blackhole chip directly — George Hotz's exploration of TT hardware programmability with pointed commentary on the architecture.

📦 Repo
blackhole low-level exploration
blackhole

tt-sim

community★ featured
by mesham · Python · 13⭐ ·

Community-built Tenstorrent architecture simulator written in Python. Runs without hardware — useful for researchers and developers exploring the Tensix architecture offline.

📦 Repo
LATEST v1.0 2026-05-11T13:07:42Z Release notes ↗
See all releases on GitHub ↗
simulator architecture no-hardware research

tt-iree

community★ featured
by swote-git · C++ · Apache-2.0 · 12⭐ ·

IREE (Intermediate Representation Execution Environment) ML compiler ported to Tenstorrent AI accelerators. Brings the IREE compiler ecosystem to TT hardware.

📦 Repo
iree compiler mlir inference
wormhole blackhole

triton-tenstorrent

community★ featured
by kernelize-ai · C++ · 11⭐ ·

OpenAI Triton compiler plugin for Tenstorrent hardware. Write Triton kernels and target Tensix cores — brings the Triton ML kernel ecosystem to TT devices.

📦 Repo
triton openai-triton compiler kernels
wormhole blackhole

bhx

community★ featured
by olofj · Rust · 4⭐ ·

Boot stock Linux cloud images on the SiFive X280 RISC-V cores inside Tenstorrent Blackhole AI accelerators. Per-card Rust daemon with virtio-mmio block/net/console and U-Boot/EFI support.

📦 Repo
blackhole risc-v linux boot virtio
blackhole

tt-boltz

community★ featured
by moritztng · Python · MIT · Jan 31, 2026

Boltz-2 biomolecular model for drug discovery on Tenstorrent Blackhole. Supports single-card and multi-card configurations — QuietBox (4×) and Galaxy (32×). Approaches physics-based FEP accuracy at 1000× the speed.

drug-discovery blackhole inference biology multi-card
blackhole quietbox galaxy

Programming Tenstorrent Processors

community★ featured
by · Apr 21, 2025

Deep-dive into the Tenstorrent architecture and Metalium programming model — circular buffers, kernel synchronization, NoC routing, and where the footguns are. The honest guide to thinking in Tensix.

metalium programming-model tensix noc circular-buffers blog
wormhole blackhole

Tenstorrent SFPU Kernel Series — Jason Davies

community★ featured
by jasondavies · Nov 12, 2025

Sponsored series of deep technical articles on implementing optimal SFPU kernels for the Tenstorrent Wormhole and Blackhole vector units. Covers where, typecasting, 16/32-bit integer multiplication, cube root, and accurate sin/cos/tan — with cycle counts, assembly walkthroughs, and Blackhole vs Wormhole comparisons throughout.

sfpu assembly vector-unit cycle-counting wormhole blackhole optimization sponsored
wormhole blackhole

Tenstorrent Blackhole Architecture Guide

community★ featured
by · Feb 28, 2026

A 6,500-word community deep dive into the Blackhole p100a architecture: the tile model (Tensix, DRAM, SiFive x280 L2CPU, Ethernet, PCIe, NoC arc), firmware startup sequence, MOP micro-op processor, replay buffer, FPU/SFPU sync, and the anatomy of a kernel. From the author of blackhole-py.

blackhole architecture tensix noc sifive-x280 firmware mop sfpu deep-dive blog
blackhole

grayskull-attention

community
by moritztng · TeX · MIT · 38⭐ ·

FlashAttention-style attention kernel implemented entirely in on-chip SRAM on the Tenstorrent Grayskull chip using TT-Metalium. Pioneering work in low-level attention on TT hardware.

📦 Repo
attention grayskull metalium sram kernel
grayskull

tt-twitch

community
by geohot · C++ · 28⭐ ·

A Tenstorrent Grayskull kernel written live on Twitch by George Hotz. 120-core grid demonstration of live kernel programming.

📦 Repo
grayskull kernel live-coding demo
grayskull

koyeb/tenstorrent-examples

community
by koyeb · Dockerfile · 18⭐ ·

Example applications and deployment configurations for running AI workloads on Tenstorrent hardware via Koyeb's cloud platform.

cloud koyeb deployment examples

blackhole-py

community
by boopdotpng · Python · MIT · 14⭐ ·

Pure Python driver for Tenstorrent Blackhole cards providing direct low-level hardware access without going through the full TT-Metal stack.

📦 Repo
driver python blackhole low-level hardware-access
blackhole

tenstorrent-tiny-examples

community
by jaebaek · C++ · 14⭐ ·

Simple C++ kernel experiments on a GraySkull e75 chip. Hands-on examples for learning the TT-Metal programming model at the metal level.

📦 Repo
examples grayskull cpp learning
grayskull

ttnn-helloworld-cpp

community
by marty1885 · C++ · 14⭐ ·

Minimal working example of using Tenstorrent TTNN in C++. The simplest possible starting point for C++ developers targeting TT hardware with TTNN.

📦 Repo
c++ ttnn hello-world template
wormhole blackhole

TT-GoL

community
by JushBJJ · C++ · 12⭐ ·
TT-GoL preview

Conway's Game of Life implemented on Tenstorrent hardware using TT-Metal kernels.

📦 Repo
game-of-life demo kernels

ttMandelbrot

community
by marty1885 · C · 0BSD · 7⭐ ·

Mandelbrot Set fractal renderer running on Tenstorrent hardware. A classic demo showcasing parallel compute on Tensix cores.

📦 Repo
mandelbrot demo fractals parallel

TT-Metal Mini Template

community
by JushBJJ · C++ · 7⭐ ·

Minimal working CMake project template for starting a new TT-Metal project from scratch. Good starting point for community kernel development.

📦 Repo
template cmake starter boilerplate

tt-tutorial (HPC)

community
by RISCVtestbed · C++ · BSD-3-Clause · 7⭐ ·

Tutorial on Tenstorrent hardware for HPC researchers from the RISC-V Testbed project at Edinburgh/EPCC. Covers Wormhole from an HPC parallel-computing perspective.

📦 Repo
tutorial hpc epcc edinburgh wormhole
wormhole

ttPEAK

community
by TT-Bounty-Hunters · C++ · ISC · 6⭐ ·

clpeak-style peak-performance benchmark for Tenstorrent devices using TT-Metalium. Measures theoretical peak throughput across operations — useful for hardware characterization.

📦 Repo
benchmark performance clpeak metalium
wormhole blackhole

tenstorrent.nix

community
by RossComputerGuy · Nix · LGPL-2.1 · 6⭐ ·

Nix flake packaging the Tenstorrent software stack for NixOS and Nix users. Reproducible, declarative installation of TT drivers and tools.

📦 Repo
nix nixos packaging flake reproducible

current

community
by seansiddens · C++ · 5⭐ ·

High-level parallel programming framework for Tenstorrent accelerators, abstracting TT-Metal into a research-oriented programming model for parallel computation.

📦 Repo
framework parallel abstraction research
wormhole blackhole

ttVecAdd

community
by marty1885 · C++ · ISC · 5⭐ ·

Minimal vector-addition example on Tenstorrent devices using TT-Metalium. A clean hello-world for the TT-Metal kernel programming model in C++.

📦 Repo
vector-add example metalium hello-world

ttas

community
by Zaneham · C · Apache-2.0 · 4⭐ ·

ttas is a hacker-friendly assembler/disassembler for Tensix on Wormhole. It turns assembly into the exact 32-bit words the hardware runs, and turns binaries back into readable instructions using the same shared instruction table.

📦 Repo
LATEST v0.1.0 2026-05-28T07:08:35Z Release notes ↗
v0.0.1 2026-05-27T15:19:11Z
See all releases on GitHub ↗
assembler
wormhole

tt-tutorial (Korean)

community
by changh95 · Jupyter Notebook · 4⭐ ·

Comprehensive tutorials for the Tenstorrent software stack in Korean. Jupyter notebooks covering the full developer path from hardware setup to model inference.

📦 Repo
tutorial korean jupyter getting-started
wormhole

Collective Operations on Wormhole n150 (Sapienza University of Rome)

community

Master's thesis implementing and benchmarking five allreduce algorithms (Swing, Recursive Doubling, Bandwidth Optimal, Latency Optimal, Shared Memory) on the Wormhole n150. Bandwidth Optimal achieved best performance, approaching within 2× of theoretical optimal.

📦 Repo
allreduce collective-ops wormhole mpi bandwidth
wormhole

libtt-metal-cxx

community
by Knight-Ops · Rust · 2⭐ ·

Rust crate that exposes the TT-Metal host API through a C++ bridge via cxx.rs — covering device management, program/kernel creation (from source file or inline string), circular buffers, semaphores, runtime arguments, sharded buffers, and MeshDevice workflows, with hardware-backed integration tests.

📦 Repo
rust bindings cxx tt-metal ffi host-api
wormhole blackhole

gsplat_tt

community
by Kovelja009 · Python ·

Port of Gaussian Splatting (3D scene reconstruction from 2D images) to Tenstorrent hardware.

📦 Repo
gaussian-splatting computer-vision 3d-reconstruction blackhole
blackhole

A Gentle Guide: Tenstorrent Card on Arch Linux with Metalium

community
by · Jul 7, 2024

Step-by-step guide to getting a Tenstorrent card running on Arch Linux with the full Metalium stack. Practical troubleshooting from someone who did it the hard way first.

arch-linux metalium installation blog getting-started
grayskull wormhole

Thoughts and Logs After Messing with Tenstorrent Grayskull

community
by · Jun 2, 2024

Honest field notes from getting a Grayskull card running and writing first Metalium kernels. Covers setup pitfalls, processor hangs, memory protection quirks, and what makes Metalium compelling despite early rough edges.

grayskull metalium getting-started blog honest-review
grayskull

Tenstorrent Architecture — W&M CSCI654 Advanced Computer Architecture

community
by · Oct 9, 2024

Lecture 20 from William & Mary's graduate Computer Architecture course. Frames Tenstorrent in the landscape between GPUs and TPUs, draws comparisons to Cerebras and SambaNova, then dives deep into the Wormhole chip and Tensix core: the 5 RISC-V core design, SFPU, NoC, and dataflow execution model.

lecture architecture wormhole tensix risc-v sfpu noc academia
wormhole

Attention in SRAM on Tenstorrent Grayskull

community
by · Jul 18, 2024

A fused kernel for the Grayskull architecture implementing Transformer self-attention entirely within SRAM. Combines matrix multiply, attention score scaling, and Softmax without DRAM accesses, achieving significant speedups over non-fused implementations.

attention transformer sram grayskull kernel risc-v
grayskull

Exploring Fast Fourier Transforms on the Tenstorrent Wormhole

community
by · Jun 18, 2025

Ports the Cooley-Tukey FFT algorithm to the Wormhole n300 RISC-V accelerator. The Wormhole draws 8× less power and consumes 2.8× less energy than a 24-core Xeon Platinum for a 2D FFT. ISC 2025.

fft wormhole hpc risc-v energy-efficiency epcc
wormhole

Assessing Tenstorrent Grayskull RISC-V MatMul Acceleration for LLMs

community
by · May 9, 2025

Evaluates the Tenstorrent Grayskull e75 RISC-V accelerator for matrix multiplication at reduced numerical precision (BFP8 and LoFi), a fundamental kernel in LLM inference computation.

matmul grayskull risc-v bfp8 lofi llm precision
grayskull

Porting Strategies for Gravitational N-Body Simulations on Tenstorrent Wormhole

community
by · May 4, 2026

Evaluates three strategies for scaling an N-body code across multiple Tenstorrent Wormhole accelerators. Builds on the established performance of single-card N-body work to explore parallelism via the on-chip NoC and multi-accelerator configurations.

n-body astrophysics hpc wormhole risc-v multi-accelerator simulation
wormhole

Accelerating Gravitational N-Body Simulations on Tenstorrent Wormhole

community
Nov 16, 2025

Accelerates an astrophysical N-body simulation on the Wormhole n300. Achieves 2× speedup and 2× energy savings over a highly optimized CPU implementation. SC '25 Workshop.

n-body astrophysics hpc wormhole risc-v simulation
wormhole

Numerical Kernels on a Spatial Accelerator: Tenstorrent Wormhole

community
Mar 24, 2026

Implements three numerical kernels and composes them into a conjugate gradient solver on Wormhole. Demonstrates AI accelerators merit consideration for HPC workloads traditionally dominated by CPUs and GPUs. 2026.

numerical-methods hpc conjugate-gradient wormhole sparse
wormhole

Accelerating Stencils on the Tenstorrent Grayskull RISC-V Accelerator

community
Sep 27, 2024

Explores stencil computation on the Grayskull PCIe RISC-V accelerator. Early academic work examining TT hardware for HPC stencil workloads. 2024.

stencil hpc grayskull risc-v
grayskull

Stencil Computations on Tenstorrent Wormhole

community
May 8, 2026

Maps 2D 5-point stencil computations onto the Tenstorrent Wormhole RISC-V AI dataflow accelerator via two implementations: element-wise decomposition (Axpy) and matrix-multiplication reformulation (MatMul). Profiling shows the isolated Wormhole kernel is competitive with CPU execution, with PCIe transfers and initialization driving end-to-end overhead; Axpy achieves lower energy than the CPU baseline at large scales. Identifies architectural and software directions for making AI accelerators viable for HPC stencil workloads. 2025.

stencil hpc wormhole risc-v energy-efficiency benchmarks dataflow
wormhole

SwiftNPU: Scalable Shape-Flexible Allocation for Inter-Core Connected NPUs

community
Apr 27, 2026

Makes multi-tenant NPU sharing practical for Blackhole-class hardware using polynomial-time allocation algorithms. Delivers up to 1.37× higher utilization and 1.14× faster workload completion. Up to 890,000× faster than NP-hard baselines.

multi-tenant allocation blackhole npu scheduling
blackhole

TileLoom: Automatic Dataflow Planning for Spatial Dataflow Accelerators

community
by · Dec 17, 2025

Compiler system that automatically generates efficient dataflow plans for tile-based languages on spatial accelerators including Tenstorrent Wormhole. Exploits on-chip network forwarding between processing elements to reduce DRAM pressure.

compiler dataflow spatial-accelerator tile-based on-chip-network wormhole
wormhole

Rewriting TTS Inference Economics: Lightning V2 on Tenstorrent vs. NVIDIA L40S

community
by · Mar 24, 2026

Shows that Text-to-Speech inference on Tenstorrent Lightning V2 achieves 4× lower cost than NVIDIA L40S. Applies BlockFloat8 (BFP8) and low-fidelity (LoFi) precision strategies to TTS despite their greater numerical fragility compared to LLMs.

tts text-to-speech inference bfp8 lofi cost-efficiency precision
wormhole

tt-zork-and-more

affiliated ⑂ historicalsource/zork1★ featured
by tsingletaryTT · Python · 2⭐ ·
tt-zork-and-more preview

A Tenstorrent fork of Infocom's Zork I (and more!), running a Z-machine interpreter at least four different ways on TT hardware. The most fun you can have with an AI accelerator.

zork z-machine interactive-fiction demo fun

Local AI Agents on Tenstorrent

affiliated★ featured
by ·

Three agentic projects running fully on-device: local AI agents on QuietBox 2, a coding assistant powered by Aider against a local inference server, and the OpenClaw AI assistant on QuietBox 2. No cloud APIs — all inference runs on TT hardware.

agents local-llm aider coding-assistant quietbox on-device
wormhole blackhole quietbox

Video Generation on Tenstorrent

affiliated★ featured
by ·

Three lesson-projects covering on-device video synthesis: frame-by-frame diffusion with tt-local-generator, native AnimateDiff video animation, and video generation on QuietBox 2. All run entirely on TT hardware with no cloud dependency.

video-generation diffusion animatediff tt-local-generator quietbox on-device
wormhole blackhole quietbox

tensix-viz

affiliated★ featured
by tsingletaryTT · JavaScript ·
tensix-viz preview

Hardware topology visualizer for Tenstorrent chips — from individual chip to full cluster. Interactive JavaScript visualization of Tensix core layout and NoC connections.

visualization topology noc hardware
wormhole blackhole
Blackhole · P100 / P150 / P300c · 140 Tensix cores
Wormhole · N150 / N300 · 64 Tensix cores
mode

Tenstorrent Cookbook: Particle Life Simulator

affiliated★ featured
by ·

Particle Life simulation on Tenstorrent hardware — an emergent-behavior N-body system where simple attraction/repulsion rules between species produce complex lifelike patterns. Cookbook recipe demonstrating parallel N-body compute on Tensix.

particle-life n-body simulation emergent cookbook demo
wormhole blackhole

CS Fundamentals on Tenstorrent Hardware

affiliated★ featured
by ·

Seven-module computer science curriculum taught on real Tenstorrent hardware. Covers RISC-V architecture, memory hierarchy, parallel computing, networks and NoC, synchronization, abstraction layers, and computational complexity — all grounded in what is physically happening on the chip.

computer-science curriculum risc-v parallelism memory noc education
wormhole blackhole

tt-lang-models

affiliated
by zoecarver · Python · 7⭐ ·
tt-lang-models preview

A growing collection of models that use tt-lang for some or all of their implementation. Reference implementations for bringing modern models to the tt-lang DSL.

📦 Repo
tt-lang models dsl reference

tt-qb-lights

affiliated
by tsingletaryTT · Rust · 2⭐ ·

Sync your Tenstorrent Quietbox's RGB lighting to accelerator utilization status. Visual feedback for hardware activity in real time.

📦 Repo
quietbox rgb hardware fun
quietbox

gemma4

affiliated
by zoecarver · Python · 1⭐ ·

Gemma 4 language model implemented in tt-lang (e4b variant) for direct execution on Tenstorrent hardware.

📦 Repo
gemma llm tt-lang inference
blackhole

open-oasis

affiliated ⑂ etched-ai/open-oasis
by zoecarver · Python · 1⭐ ·

tt-lang inference script for Oasis 500M — an interactive video world model running on Tenstorrent hardware via the tt-lang DSL.

📦 Repo
video world-model oasis tt-lang inference
blackhole

tt-model-runner

affiliated
by tsingletaryTT · Python · 1⭐ ·

Discover, load, and benchmark models with a GUI and TUI for tt-inference-server. Makes exploring available models on Tenstorrent hardware as easy as browsing a catalog.

📦 Repo
gui tui models inference benchmark
wormhole blackhole quietbox

tt-claw

affiliated
by tsingletaryTT · Shell ·

A Tenstorrent-powered claw machine that rewards players with real prizes. The QuietBox 2 runs local AI inference to act as an agent controlling the claw hardware — the OpenClaw AI assistant lesson builds directly on this project.

claw-machine agents hardware quietbox physical on-device
quietbox

dflash

affiliated ⑂ z-lab/dflash
by zoecarver · Python ·

DFlash: Block Diffusion for Flash Speculative Decoding on Tenstorrent hardware using tt-lang. Combines block diffusion with speculative decoding for faster inference.

speculative-decoding diffusion tt-lang inference

diamond

affiliated ⑂ eloialonso/diamond
by zoecarver · Python ·
diamond preview

DIAMOND: Atari game-playing agent implemented on Tenstorrent hardware via tt-lang. Diffusion-based world model for reinforcement learning.

atari reinforcement-learning world-model tt-lang

Engram

affiliated ⑂ deepseek-ai/Engram
by zoecarver · Python ·

A Tenstorrent port of the DeepSeek Engram model using tt-lang. Brings DeepSeek's memory-efficient architecture to TT hardware.

📦 Repo
deepseek engram tt-lang inference
blackhole

Stable Diffusion XL on Tenstorrent

affiliated
by ·

On-device image generation with Stable Diffusion XL running entirely on Tenstorrent hardware. Full inference pipeline with no cloud dependency.

stable-diffusion sdxl image-generation diffusion on-device
wormhole blackhole

tt-forge-compiletron

affiliated
by tsingletaryTT · Python ·
tt-forge-compiletron preview

Compile more than 100 models on tt-forge in a display format suitable for demos. Comprehensive showcase of tt-forge model compatibility.

📦 Repo
tt-forge models demo compilation

Image Classification with TT-Forge

affiliated
by ·

End-to-end image classification project using TT-Forge — compile and run a PyTorch classification model on Tenstorrent hardware with no kernel authoring required.

forge image-classification pytorch compiler inference
wormhole blackhole

tt-warp

affiliated
by tsingletaryTT · Python ·

Warp terminal plugin for Tenstorrent — integrates hardware status, model management, and developer workflows directly into the Warp terminal.

📦 Repo
warp terminal plugin developer-experience

Tensix Grid Playground

affiliated
by ·

Interactive browser-based visualizer of the Tenstorrent Tensix grid architecture. Explore the NoC, core layout, and dataflow patterns without hardware — a great companion for learning kernel programming.

visualization interactive noc tensix browser architecture

Tenstorrent Cookbook: Conway's Game of Life

affiliated
by ·

TT-Metalium implementation of Conway's Game of Life as a cookbook recipe. Each generation is a full parallel kernel dispatch over the grid — a clean introduction to stateful compute on Tensix cores.

game-of-life demo cookbook parallel metalium
wormhole blackhole

Custom Model Training on Tenstorrent

affiliated
by ·

Eight-lesson series covering the full custom training workflow on TT hardware: dataset fundamentals, configuration patterns, fine-tuning, multi-device distributed training, experiment tracking, model architecture basics, and training from scratch.

training fine-tuning multi-device distributed experiment-tracking curriculum
wormhole blackhole

Tenstorrent Cookbook: Core Recipes

affiliated
by ·

Three hands-on TT-Metalium kernel recipes: a Mandelbrot fractal explorer, real-time audio signal processing pipeline, and custom image filter stack. Each recipe is a complete kernel project with full source in the lesson.

cookbook mandelbrot audio image-processing metalium demo
wormhole blackhole

tt-bh-linux

official★ featured
C · GPL-2.0 · 53⭐ ·
tt-bh-linux preview

Linux demo for the Tenstorrent Blackhole P100/P150 card RISC-V cores. Boot a real Linux kernel on the 16 high-performance RISC-V cores built into the Blackhole chip.

LATEST v0.11 2026-04-13T15:10:59Z Release notes ↗
v0.10 2026-02-11T22:41:22Z
v0.9 2025-10-14T20:56:23Z
v0.5 2025-10-01T15:40:57Z
v0.4 2025-08-09T18:05:10Z
See all releases on GitHub ↗
linux risc-v blackhole bare-metal boot
blackhole

TT Console

official★ featured

Browser-based cloud console for exploring AI on Tenstorrent hardware. Run LLM inference, image and video generation, and browse the supported model catalog in-browser — backed by Tenstorrent accelerators. Cloud hardware access and advanced workflows (deployments, agents) available in staged rollout.

cloud console inference playground llm image-generation video-generation demo
wormhole blackhole

tt-metal

official
C++ · Apache-2.0 · 1505⭐ ·
tt-metal preview

TT-NN operator library and TT-Metalium low-level kernel programming model. The primary SDK for developing on Tenstorrent hardware — from high-level tensor ops to bare-metal RISC-V kernels.

LATEST v0.72.0 2026-06-09T01:30:48Z Release notes ↗
v0.73.0-dev20260609pre 2026-06-09T03:11:04Z
v0.73.0-dev20260608pre 2026-06-08T02:48:16Z
v0.72.0-rc4pre 2026-06-08T21:04:04Z
v0.72.0-rc3pre 2026-06-08T10:14:35Z
See all releases on GitHub ↗
metalium ttnn sdk kernels core
grayskull wormhole blackhole ttsim

tt-buda

official
Python · Apache-2.0 · 314⭐ ·

TT-BUDA: Tenstorrent's original Python compiler and runtime for AI workloads. Legacy stack — tt-forge is the recommended successor, but tt-buda has the largest model demo library.

📦 Repo
LATEST v0.19.3 2024-09-24T21:01:08Z Release notes ↗
v0.18.2 2024-07-18T15:58:39Z
v0.17.0-alpha 2024-06-05T20:07:29Z
v0.15.0-alpha 2024-05-23T19:53:00Z
v0.12.3 2024-05-10T22:25:40Z
See all releases on GitHub ↗
legacy compiler pytorch buda
grayskull wormhole

tt-mlir

official
C++ · Apache-2.0 · 278⭐ ·

Tenstorrent MLIR compiler — the core compiler infrastructure shared by tt-forge and other frontends. Handles graph optimization, lowering, and code generation for Tensix hardware.

0.9.0.dev20260221pre 2026-02-21T04:31:50Z
0.9.0.dev20260220pre 2026-02-20T04:34:35Z
0.9.0.dev20260219pre 2026-02-19T04:37:24Z
0.9.0.dev20260218pre 2026-02-18T04:38:21Z
0.9.0.dev20260217pre 2026-02-17T04:37:09Z
See all releases on GitHub ↗
mlir compiler backend optimization
wormhole blackhole

tt-forge

official
Python · Apache-2.0 · 268⭐ ·
tt-forge preview

Tenstorrent's MLIR-based compiler frontend. Enables running AI workloads from PyTorch, ONNX, and other frameworks on all Tenstorrent hardware configurations through an open-source, general, and performant compiler.

LATEST 1.3.0.dev20260609002802 2026-06-09T01:16:05Z Release notes ↗
1.3.0.dev20260607003211 2026-06-07T01:27:37Z
1.3.0.dev20260606003110 2026-06-06T01:31:42Z
1.3.0.dev20260605003323 2026-06-05T02:44:37Z
1.3.0.dev20260604011223 2026-06-04T01:55:30Z
See all releases on GitHub ↗
mlir compiler pytorch onnx frontend
wormhole blackhole ttsim

riscv-ocelot

official ⑂ riscv-boom/riscv-boom
SystemVerilog · Apache-2.0 · 253⭐ ·
riscv-ocelot preview

The Berkeley Out-of-Order Machine with V-EXT (RISC-V Vector Extension) support. Tenstorrent's research-grade out-of-order RISC-V core with vector extension.

📦 Repo
risc-v out-of-order vector-extension processor-design

ttsim

official
C++ · Apache-2.0 · 119⭐ ·

Fast full-system simulator of Tenstorrent Wormhole and Blackhole hardware. Runs TT-Metalium workloads on any Linux/x86_64 system without physical silicon. Bit-exact results relative to hardware.

📦 Repo
LATEST v1.8.0 2026-06-09T17:23:20Z Release notes ↗
v1.7.3 2026-06-05T22:44:41Z
v1.7.2 2026-06-03T22:51:02Z
v1.7.1 2026-06-02T19:31:30Z
v1.7.0 2026-05-26T21:52:15Z
See all releases on GitHub ↗
simulator no-hardware bit-exact wormhole blackhole
ttsim

whisper

official ⑂ chipsalliance/VeeR-ISS
C++ · Apache-2.0 · 88⭐ ·

RISC-V Instruction Set Simulator (ISS) used by Tenstorrent for processor verification. Powers the co-simulation architecture checker.

📦 Repo
LATEST 1.861 2026-05-11T15:44:36Z Release notes ↗
See all releases on GitHub ↗
risc-v iss simulator verification

tt-xla

official
Python · Apache-2.0 · 67⭐ ·

PJRT device plugin for Tenstorrent hardware. Enables JAX, PyTorch/XLA, and other XLA-based frameworks to target TT accelerators.

LATEST 1.3.0.dev20260609002802 2026-06-09T01:07:31Z Release notes ↗
1.3.0.dev20260608003328 2026-06-08T01:42:18Z
1.3.0.dev20260607003211 2026-06-07T01:19:51Z
1.3.0.dev20260606003110 2026-06-06T01:23:30Z
1.3.0.dev20260605003323 2026-06-05T02:36:11Z
See all releases on GitHub ↗
xla pjrt jax pytorch
wormhole blackhole

tt-kmd

official
C · GPL-2.0 · 65⭐ ·

Tenstorrent kernel module driver. The Linux kernel module required to interface with Tenstorrent PCIe accelerator cards.

📦 Repo
LATEST ttkmd-2.9.0 2026-06-09T13:25:19Z Release notes ↗
ttkmd-2.9.0-rc1pre 2026-05-26T19:10:10Z
ttkmd-2.8.0 2026-04-06T18:58:39Z
ttkmd-2.8.0-rc1pre 2026-04-04T01:30:29Z
ttkmd-2.7.0 2026-02-09T20:23:07Z
See all releases on GitHub ↗
kernel-module driver linux pcie
grayskull wormhole blackhole

RiESCUE

official
Python · Apache-2.0 · 65⭐ ·

RISC-V Directed Test Framework and Compliance Suite. Comprehensive test infrastructure for verifying RISC-V processor implementations against the specification.

LATEST v1.7.0 2025-12-03T19:29:44Z Release notes ↗
v1.5.0 2025-11-17T21:58:14Z
v1.3.0 2025-11-06T20:12:13Z
v1.1.2 2025-10-16T17:21:43Z
v0.2.5 2025-07-10T00:59:12Z
See all releases on GitHub ↗
risc-v testing compliance verification

tt-buda-demos

official
Python · Apache-2.0 · 64⭐ ·

Repository of model demos using TT-Buda. The largest collection of pre-compiled model examples for Tenstorrent hardware — BERT, ResNet, YOLO, GPT-2, Whisper, and many more.

📦 Repo
demos models bert resnet yolo gpt2
grayskull wormhole

tt-smi

official
Python · Apache-2.0 · 61⭐ ·
tt-smi preview

Tenstorrent System Management Interface — monitor device telemetry, issue board-level resets, and inspect hardware health. The nvidia-smi equivalent for Tenstorrent hardware.

📦 Repo
LATEST v5.2.0 2026-05-14T17:26:26Z Release notes ↗
v5.1.1 2026-05-12T22:18:05Z
v5.1.0 2026-05-11T16:23:13Z
v5.0.1 2026-04-24T11:39:48Z
v5.0.0 2026-04-01T21:13:31Z
See all releases on GitHub ↗
monitoring telemetry smi hardware-management
grayskull wormhole blackhole

tt-inference-server

official
Python · Apache-2.0 · 57⭐ ·

Production-ready model serving for Tenstorrent hardware with OpenAI-compatible REST API. Supports continuous batching, multiple models, and all TT hardware configurations.

LATEST v0.15.0 2026-05-29T15:55:11Z Release notes ↗
v0.14.0 2026-05-15T22:34:02Z
v0.13.0 2026-04-24T20:21:26Z
v0.10.1 2026-04-08T09:58:17Z
v0.12.0 2026-04-02T21:50:57Z
See all releases on GitHub ↗
serving openai-compatible production rest-api
wormhole blackhole quietbox galaxy

ttnn-visualizer

official
TypeScript · Apache-2.0 · 52⭐ ·
ttnn-visualizer preview

Comprehensive tool for visualizing and analyzing model execution on Tenstorrent hardware. Interactive graphs, memory plots, tensor details, buffer overviews, operation flow graphs, and multi-instance support.

📦 Repo
LATEST v0.88.0 2026-06-03T20:23:29Z Release notes ↗
v0.87.0 2026-05-27T17:30:12Z
v0.86.0 2026-05-20T18:34:19Z
v0.85.0 2026-05-13T20:31:18Z
v0.84.1 2026-05-07T00:44:46Z
See all releases on GitHub ↗
visualization profiling memory operations graphs
wormhole blackhole

tt-lang

official
Python · Apache-2.0 · 51⭐ ·
tt-lang preview

Python-based DSL that sits between TT-NN and TT-Metalium — expresses custom fused kernels with progressive disclosure, compiling directly to Tensix. Ships an integrated functional simulator (no hardware needed), line-by-line performance metrics, and AI-agent-friendly tooling. Two packages: tt-lang (compiler + hardware, requires ttnn) and tt-lang-sim (simulator only, works on Linux/macOS without Tenstorrent hardware).

dsl python kernels tt-lang simulator kernel-fusion
wormhole blackhole ttsim

tt-llk

official
C++ · Apache-2.0 · 51⭐ · Jun 5, 2025

Tenstorrent Low-Level Kernels: the C++ library that directly programs the RISC-V cores inside each Tensix compute engine. TRISC0 (unpack), TRISC1 (math/FPU/SFPU), and TRISC2 (pack) are all programmed through this layer — it is the interface between TT-Metal kernel code and bare silicon.

tensix risc-v llk trisc brisc ncrisc low-level compute-engine
grayskull wormhole blackhole

TT-Studio

official
TypeScript · Apache-2.0 · 48⭐ ·

Web-based GUI for deploying and chatting with AI models on Tenstorrent hardware. Handles all technical setup automatically — deploy models, run inference, and explore capabilities through a simple browser interface.

📦 Repo
LATEST v2.6.0 2026-05-20T17:04:32Z Release notes ↗
v2.5.0 2026-04-20T17:03:48Z
v2.4.1 2026-03-24T15:09:57Z
v2.1.0 2025-10-04T01:33:59Z
v2.0.1 2025-07-21T19:53:40Z
See all releases on GitHub ↗
web-ui gui models chat deployment
wormhole blackhole quietbox

WallaBMC

official
C · Apache-2.0 · 46⭐ ·
WallaBMC preview

Lightweight BMC (Baseboard Management Controller) for STM32 and similar MCUs, with Web UI, Redfish API, and HTTPS support. Built on Zephyr RTOS. Used in Tenstorrent systems.

📦 Repo
bmc stm32 redfish zephyr embedded

tt-umd

official
C++ · Apache-2.0 · 42⭐ ·

User-mode driver for Tenstorrent hardware. The userspace layer that sits between the kernel module and higher-level SDKs.

📦 Repo
v0.9.6pre 2026-06-03T10:59:12Z
v0.9.5-dev.260424pre 2026-04-30T10:50:47Z
v0.9.4pre 2026-03-18T21:42:11Z
v0.9.3pre 2026-02-24T18:29:21Z
v0.9.1pre 2026-01-23T22:54:19Z
See all releases on GitHub ↗
user-mode-driver umd hardware-interface
grayskull wormhole blackhole

tt-system-firmware

official
C · Apache-2.0 · 39⭐ ·
tt-system-firmware preview

System firmware for Tenstorrent hardware. Low-level system initialization and control firmware that runs on-device.

LATEST v19.10.0 2026-06-01T13:21:59Z Release notes ↗
v19.11.0-rc1pre 2026-06-05T19:29:28Z
v19.10.0-rc2pre 2026-05-27T12:36:30Z
v19.10.0-rc1pre 2026-05-08T19:27:36Z
v19.9.0 2026-04-22T14:36:09Z
See all releases on GitHub ↗
firmware system embedded
wormhole blackhole

luwen

official
Rust · Apache-2.0 · 34⭐ ·

Tenstorrent system interface library written in Rust. Low-level Rust bindings for communicating with and managing TT hardware.

📦 Repo
LATEST v0.8.5 2026-03-30T21:03:56Z Release notes ↗
v0.8.4 2026-03-26T19:34:59Z
v0.8.3 2026-03-26T16:02:34Z
v0.8.2 2026-03-23T18:58:20Z
v0.8.1 2025-12-17T21:16:21Z
See all releases on GitHub ↗
rust system-interface low-level bindings
grayskull wormhole blackhole

tt-tvm

official
Python · Apache-2.0 · 31⭐ ·

TVM for Tenstorrent ASICs. Brings the Apache TVM compiler stack to Tenstorrent hardware, enabling model compilation from TensorFlow, PyTorch, ONNX, and more.

📦 Repo
tvm compiler tensorflow onnx
grayskull wormhole blackhole

tensix-isa-simulator

official
C++ · Apache-2.0 · 29⭐ ·

ISA-level simulator for the Tensix compute engine. Simulates the matrix, vector, and scalar units inside each Tensix core.

📦 Repo
tensix isa simulator compute-engine
ttsim

tt-torch

official
Python · Apache-2.0 · 25⭐ ·

Frontend integration for PyTorch with tt-mlir. Compile PyTorch models directly to Tenstorrent hardware via torch.compile integration.

0.5.0.dev20251008pre 2025-10-08T05:36:07Z
0.5.0.dev20251007pre 2025-10-07T04:22:29Z
0.5.0.dev20251006pre 2025-10-06T04:21:23Z
0.5.0.dev20251005pre 2025-10-05T04:38:19Z
0.5.0.dev20251004pre 2025-10-04T04:22:15Z
See all releases on GitHub ↗
pytorch torch-compile frontend
wormhole blackhole

tt-firmware

official
Apache-2.0 · 24⭐ ·

Tenstorrent firmware repository. Board management and control firmware for Tenstorrent accelerator cards.

📦 Repo
LATEST v19.6.0 2026-02-20T16:53:34Z Release notes ↗
v19.5.0 2026-02-04T18:22:15Z
v19.4.2 2026-01-05T23:32:14Z
v19.4.1 2025-12-19T17:06:37Z
v19.4.0 2025-12-16T05:38:23Z
See all releases on GitHub ↗
firmware bmc board-management
wormhole blackhole

tt-installer

official
Shell · Apache-2.0 · 23⭐ ·

Install the complete Tenstorrent software stack with one command. Handles drivers, firmware, Python environment, and SDK setup automatically.

LATEST v2.2.1 2026-03-16T18:54:29Z Release notes ↗
v2.2.0 2026-03-10T19:52:29Z
v2.1.0 2026-01-14T19:34:46Z
v2.0.0 2025-12-05T20:38:41Z
v1.11.0 2025-12-02T20:02:43Z
See all releases on GitHub ↗
installation setup one-command getting-started
wormhole blackhole

tt-exalens

official
Python · Apache-2.0 · 21⭐ ·

Low-level hardware debugger for Tenstorrent devices. Inspect register state, memory contents, and kernel execution at the hardware level.

📦 Repo
v0.3.21pre 2026-06-05T13:13:06Z
v0.3.20pre 2026-05-30T10:14:29Z
v0.3.19pre 2026-05-13T09:25:52Z
v0.3.18pre 2026-05-11T19:41:11Z
v0.3.17pre 2026-04-24T12:28:18Z
See all releases on GitHub ↗
debugger low-level hardware registers
wormhole blackhole

tt-topology

official
Python · Apache-2.0 · 16⭐ ·
tt-topology preview

Configure Ethernet routing on multi-card Tenstorrent systems. Flash NB cards to use specific ETH routing configurations for scale-out deployments.

📦 Repo
LATEST v1.2.19 2026-02-26T21:14:41Z Release notes ↗
v1.2.18 2026-01-30T22:07:17Z
v1.2.17 2026-01-29T19:20:46Z
v1.2.16 2025-12-08T16:43:42Z
v1.2.15 2025-11-04T16:03:26Z
See all releases on GitHub ↗
topology ethernet multi-card routing
wormhole blackhole

tt-npe

official
C++ · Apache-2.0 · 14⭐ ·
tt-npe preview

Network-on-chip Performance Estimator for Tenstorrent Tensix-based devices. Model and estimate NoC utilization before running kernels on hardware.

📦 Repo
noc performance estimator profiling
wormhole blackhole

tt-blacksmith

official
Python · Apache-2.0 · 13⭐ ·

Optimized training recipes for a variety of ML models on Tenstorrent hardware, powered by the TT-Forge compiler stack. Reference implementations for fine-tuning and training from scratch.

training fine-tuning recipes pytorch
wormhole blackhole

tt-example-apps

official
Jupyter Notebook · Apache-2.0 · 13⭐ ·

End-to-end AI applications running on Tenstorrent AI accelerators. Complete application examples from retrieval-augmented generation to image generation pipelines.

📦 Repo
rag applications end-to-end examples
wormhole blackhole

tt-flash

official
Python · Apache-2.0 · 13⭐ ·

Tenstorrent firmware update utility. Flash new firmware onto Tenstorrent accelerator cards from the command line.

📦 Repo
LATEST v3.8.0 2026-06-01T18:04:27Z Release notes ↗
v3.7.0 2026-05-15T19:32:29Z
v3.6.5 2026-04-16T19:43:11Z
v3.6.4 2026-04-10T14:44:44Z
v3.6.3 2026-04-08T15:38:59Z
See all releases on GitHub ↗
firmware-update flash utility
grayskull wormhole blackhole

tt-vscode-toolkit

official
TypeScript · Apache-2.0 · 7⭐ · Dec 18, 2025
tt-vscode-toolkit preview

48 interactive lessons covering the full Tenstorrent developer path — from hardware detection to custom training — with click-to-run commands and hardware auto-detection. Available in VSCode and code-server.

LATEST v0.0.454 2026-06-05T18:44:37Z Release notes ↗
v0.0.453 2026-05-29T17:21:23Z
v0.0.447 2026-05-18T22:27:56Z
v0.0.438 2026-05-11T16:43:12Z
v0.0.428 2026-05-01T15:18:19Z
See all releases on GitHub ↗
vscode lessons interactive getting-started code-server
wormhole blackhole quietbox ttsim

tt-toplike

official
Rust · Apache-2.0 · 2⭐ ·
tt-toplike preview

A vibrant htop-style visualizer for Tenstorrent hardware written in Rust. Real-time process and utilization view for TT accelerators.

LATEST v0.6.2 2026-06-08T19:42:59Z Release notes ↗
v0.6.1 2026-06-02T23:46:12Z
v0.6.0 2026-05-26T23:49:21Z
v0.5.0 2026-04-29T17:39:43Z
v0.4.3 2026-04-25T20:02:37Z
See all releases on GitHub ↗
monitoring htop rust real-time
wormhole blackhole

tt-local-generator

official
Python · Apache-2.0 · 1⭐ ·
tt-local-generator preview

Generate infinite videos and images (and imaginative prompts to inspire them) on Tenstorrent's Quietbox 2. Fully local generative media pipeline.

LATEST v0.3.4 2026-05-26T23:52:15Z Release notes ↗
v0.3.3 2026-05-26T17:04:35Z
v0.2.6 2026-05-07T18:20:34Z
v0.2.2 2026-04-27T16:26:06Z
v0.2.1 2026-04-25T21:46:17Z
See all releases on GitHub ↗
video-generation image-generation quietbox generative
quietbox

tt-animatediff

official
Python · Apache-2.0 ·
tt-animatediff preview

Generates short, temporally coherent animated GIFs using the AnimateDiff model on Tenstorrent hardware. Phase 1 runs the correct SD 1.4 + MotionAdapter architecture on CPU; Phase 2 accelerates spatial denoising on Blackhole using the TTNN UNet. Produces vibrant 8-frame animations in ~15 s/frame on a P300C.

LATEST v0.1.0 2026-06-04T22:31:14Z Release notes ↗
See all releases on GitHub ↗
animatediff video-generation stable-diffusion diffusion gif blackhole
blackhole