·∘○◉●◉○∘· ·∘○◉●◉○∘· ·∘○◉● ◉●◉○∘· ·∘○◉●◉○∘· ·∘○◉ ●◉○∘·∘○◉● ◉●∘· ·∘○◉●◉○∘· ·∘○◉

Tenstorrent Hardware Monitor · v0.5.0 · Rust

Watch inference happen.

A psychedelic, ASCII-native terminal monitor for Tenstorrent Blackhole and Wormhole silicon. Tensix cores pulse with real power consumption. DRAM channels reveal their training state. NoC traffic flows as living particle streams. Every pixel earns its place.

Rust 2021 · MIT/Apache-2.0 · Debian package available · works on N150, N300, P150, P300, QB2

See it live

Hardware monitoring as performance art

Four Blackhole chips, idle and under inference load, rendered simultaneously. Watch the Tensix cores brighten as power climbs, DRAM channels report training state, and the NoC particle stream shifts with real traffic.

Watch on YouTube →

You put four Blackhole chips in a rack.
You fire up an LLM inference job.
And then — nothing. A blinking cursor. A log file.

We wanted to see it. Watch the DRAM channels train one by one at boot. Watch Tensix cores light up when the attention heads fire. See the NoC traffic as a river of particles moving between chiplets. Watch the temperature blush orange as the workload grows.

Not numbers in a table. Not a dashboard with gauges. A living picture of inference in motion.

What you see

Every pixel earns its place

There are no decorative animations. Every visual element is driven by real telemetry from the hardware — if it's moving, something changed on the chip.

Tensix Core Grid

Each star in the starfield is a real Tensix core. Its brightness tracks power consumption relative to the learned idle baseline. Its color shifts from cool cyan through yellow to hot red as temperature rises. The twinkle rate follows current draw. 224 cores on Blackhole, every one individually represented.
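As a rough illustration of the cyan-through-yellow-to-red ramp described above, here is a minimal sketch. The function names and the temperature endpoints (`cool_c`, `hot_c`) are assumptions made for this example, not values taken from tt-toplike.

```rust
/// Linearly interpolate between two 8-bit color channels; `t` is clamped to [0, 1].
fn lerp_u8(a: u8, b: u8, t: f32) -> u8 {
    let t = t.clamp(0.0, 1.0);
    (a as f32 + (b as f32 - a as f32) * t).round() as u8
}

/// Blend two RGB triples channel by channel.
fn blend(from: (u8, u8, u8), to: (u8, u8, u8), t: f32) -> (u8, u8, u8) {
    (
        lerp_u8(from.0, to.0, t),
        lerp_u8(from.1, to.1, t),
        lerp_u8(from.2, to.2, t),
    )
}

/// Map a temperature to a color on a cyan -> yellow -> red ramp.
/// `cool_c` and `hot_c` are hypothetical endpoints chosen by the caller
/// (assumes hot_c > cool_c); they are not constants from tt-toplike.
fn temp_to_rgb(temp_c: f32, cool_c: f32, hot_c: f32) -> (u8, u8, u8) {
    const CYAN: (u8, u8, u8) = (0, 255, 255);
    const YELLOW: (u8, u8, u8) = (255, 255, 0);
    const RED: (u8, u8, u8) = (255, 0, 0);

    // Normalize the temperature into [0, 1] across the chosen range.
    let t = ((temp_c - cool_c) / (hot_c - cool_c)).clamp(0.0, 1.0);
    if t < 0.5 {
        blend(CYAN, YELLOW, t * 2.0)
    } else {
        blend(YELLOW, RED, (t - 0.5) * 2.0)
    }
}
```

With a 40 to 90 °C range, a core at or below 40 °C renders cyan, one at 65 °C renders yellow, and anything at or above 90 °C renders red.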

DRAM Channel Status

The DDR training bitmask from SMBUS telemetry is parsed per-channel and rendered in real time: ○ untrained, ◐ training (animated), ● trained, ✗ error. On Blackhole that's 12 channels — you can watch them come up one by one during the boot sequence.
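To make the per-channel decoding concrete, here is a small sketch. The real bit layout of the telemetry field is not described on this page, so the encoding below (two bits per channel, lowest channel in the lowest bits) is purely an assumption for illustration, as are the type and function names. Only the four glyphs come from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChannelState {
    Untrained,
    Training,
    Trained,
    Error,
}

impl ChannelState {
    /// Glyphs as listed in the text: untrained, training, trained, error.
    fn glyph(self) -> char {
        match self {
            ChannelState::Untrained => '○',
            ChannelState::Training => '◐',
            ChannelState::Trained => '●',
            ChannelState::Error => '✗',
        }
    }
}

/// ASSUMED layout (hypothetical): channel `i` occupies bits [2i, 2i+1],
/// with 0 = untrained, 1 = training, 2 = trained, 3 = error.
fn decode_channels(mask: u32, channels: usize) -> Vec<ChannelState> {
    // A u32 holds at most 16 two-bit fields, so cap the channel count.
    (0..channels.min(16))
        .map(|i| match (mask >> (2 * i)) & 0b11 {
            0 => ChannelState::Untrained,
            1 => ChannelState::Training,
            2 => ChannelState::Trained,
            _ => ChannelState::Error,
        })
        .collect()
}

/// Render one glyph per channel, channel 0 first.
fn render_channels(mask: u32, channels: usize) -> String {
    decode_channels(mask, channels)
        .into_iter()
        .map(ChannelState::glyph)
        .collect()
}
```

Under this assumed encoding, an all-zero mask on a 12-channel Blackhole renders twelve ○ glyphs, and each channel flips to ◐ and then ● as its two bits advance.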

Memory Hierarchy

The Memory Castle view, a roguelike dungeon, renders all four levels simultaneously: DDR at the base, L2 cache banks in the middle, L1 SRAM cores above, Tensix compute at the top. Four distinct particle types (Read ○◉, Write □■, CacheHit ◇◆, Miss ●⬤) flow upward as real transactions complete.

NoC Particle Flow

The Memory Flow visualization maps the Network-on-Chip as a 2D field. Particles stream between the DDR perimeter and the Tensix core array, their density proportional to traffic, their color to temperature. Watch the flow change as inference shifts attention layers.

Inter-Chip Topology

Hardware topology is auto-detected from board_type. Dual-chip carrier boards (p300, n300, QB2) show board groupings with ═══ / ←→ inter-chip links. Independent PCIe cards (p150a, n150) display as equal-weight columns with no spurious board labels — the board concept is suppressed entirely. Scales from 2 to 32+ chips: side-by-side columns up to terminal width, then a compact fleet grid with one power bar per chip.
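The grouping and layout rules above could be sketched roughly as follows. The board names come from the text, but the exact `board_type` strings the hardware reports, the `MIN_COLUMN_WIDTH` threshold, and all function names are assumptions for this example.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Layout {
    /// One side-by-side column per chip.
    Columns { per_chip_width: usize },
    /// Compact grid with one power bar per chip.
    FleetGrid,
}

/// Hypothetical minimum width (in terminal cells) for a readable column.
const MIN_COLUMN_WIDTH: usize = 24;

/// Dual-chip carrier boards get a board grouping; independent cards do not.
/// The string values are assumed to match the names used in the text.
fn is_dual_chip_board(board_type: &str) -> bool {
    matches!(
        board_type.to_ascii_lowercase().as_str(),
        "p300" | "n300" | "qb2"
    )
}

/// Use columns while every chip still gets a readable width, else the grid.
fn choose_layout(chips: usize, term_width: usize) -> Layout {
    if chips > 0 && term_width / chips >= MIN_COLUMN_WIDTH {
        Layout::Columns { per_chip_width: term_width / chips }
    } else {
        Layout::FleetGrid
    }
}
```

With these assumed numbers, four chips on a 120-column terminal get 30-cell columns, while 32 chips on the same terminal fall back to the fleet grid.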

@

Arcade Hero

Arcade mode places a roguelike hero character at the intersection of power (vertical) and current (horizontal). Its color is temperature. It leaves a trail of recent positions. It's your avatar inside the chip — moving through DDR, L2, L1, and Tensix as the workload climbs.

Visualization modes

Press v to cycle

Each mode is a different lens on the same live hardware data. All modes support multi-device layouts that scale with your terminal width and chip count.

normal

Telemetry Table

Classical htop-style table: power, temperature, current, voltage, AICLK, ARC firmware health per device. Color-coded with progressive gradients. Backend switcher (b) cycles Sysfs → JSON → Luwen live.

starfield

Tensix Starfield

The original visualization — Tensix cores as stars, memory hierarchy as orbiting planets, data flow streams between devices. Learns each device's idle baseline over the first 20 frames; all activity shown relative to that learned state so any hardware looks alive, from 5W idle to 150W inference.
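One way to picture the learned idle baseline is the sketch below. It assumes the baseline is the plain mean of the warm-up samples and that activity is expressed as a fraction above that mean. The page does not say how tt-toplike computes it, and the type and method names here are hypothetical.

```rust
/// Learns an idle baseline from the first `warmup` samples, then reports
/// activity relative to it. Averaging is an assumption of this sketch.
struct AdaptiveBaseline {
    warmup: usize,
    seen: usize,
    sum: f32,
    baseline: Option<f32>,
}

impl AdaptiveBaseline {
    fn new(warmup: usize) -> Self {
        Self { warmup, seen: 0, sum: 0.0, baseline: None }
    }

    /// Feed one power sample in watts. Returns `None` while still learning,
    /// then the fractional increase over the learned baseline (never negative).
    fn update(&mut self, power_w: f32) -> Option<f32> {
        if let Some(base) = self.baseline {
            let safe_base = base.max(f32::EPSILON); // avoid dividing by zero
            return Some(((power_w - base) / safe_base).max(0.0));
        }
        self.seen += 1;
        self.sum += power_w;
        if self.seen >= self.warmup {
            self.baseline = Some(self.sum / self.seen as f32);
        }
        None
    }
}
```

Because the output is relative, a card idling at 10 W and drawing 15 W reports the same 0.5 activity as a card idling at 100 W and drawing 150 W, which is what lets very different power envelopes look equally alive.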

castle

Memory Castle

A roguelike dungeon of memory hierarchy. 600 particles fill the screen, flowing from DDR at the bottom up through L2, L1, and into Tensix compute. 30 environmental glyphs (⚡ ※ ☼ ◊) populate the dungeon atmosphere. Adaptive spawning surges with power activity.

flow

Memory Flow

The NoC as a particle system. DDR channels ring the perimeter; the Tensix core grid fills the center. Particle density tracks traffic, color tracks temperature, speed tracks current draw. The most meditative mode: watch the inference rhythm emerge.

arcade

Arcade Mode

All three visualizations stacked vertically with animated separator bands. Starfield on top, Memory Castle in the middle, Memory Flow below. A hero character (@) navigates the chip in real time. Maximum information density — the full picture at once.

topology

Board Topology

On multi-chip boards, renders the physical chip layout with Ethernet and PCIe connections. Link status, firmware health, and per-chip temperature shown as a live graph. Immediately shows which chip is hot, which link is down, or which die is carrying the load.

Lineage & inspiration

Standing on the shoulders of beautiful tools

Every design decision in tt-toplike traces back to something that came before it — a game, a tool, a film, an art installation. Here's the direct lineage.

It started with a question: why does every hardware monitoring tool look like a 1995 spreadsheet? The hardware underneath is doing something genuinely spectacular — billions of SRAM operations per second, twelve DDR channels training in parallel, a mesh network-on-chip routing inference tokens across 224 Tensix cores — and the best we offer the operator is a table of numbers that refreshes every five seconds.

The Python ancestor of this tool (tt-top) proved the concept: a terminal can be a window into the chip. tt-toplike reimplements that vision in Rust, tightening the frame interval to 100 ms (about 10 frames per second), adding the adaptive baseline system that makes every machine look alive regardless of its power envelope, and extending the visualization language to cover things tt-top never attempted: DRAM training state, inter-chip topology, and the full memory hierarchy rendered as a living dungeon.

The design vocabulary draws from a very specific set of predecessors. Not dashboards. Not Grafana panels. Games, art tools, and the terminal programs that dared to be beautiful.

Technical foundation

Built on safe defaults and live telemetry

tt-toplike is a Rust library with multiple frontends — a TUI binary, a PTY-hosted native window, and a hybrid backend that streams telemetry from tt-smi without ever touching PCI BAR0 while a workload is running.

🛡

Non-Invasive by Default

The default backend chain is Sysfs → JSON (tt-smi subprocess) → Mock. Direct PCI access via Luwen is only attempted when --backend luwen is passed explicitly. You can monitor an LLM inference job without any risk of disturbing the hardware state.
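The fallback order can be pictured as a first-success search over an ordered list. The trait, the stub type, and the function names below are hypothetical stand-ins for this sketch. Real backends would read sysfs or spawn tt-smi instead of checking a flag.

```rust
/// A telemetry source that can report whether it is usable on this machine.
trait Backend {
    fn name(&self) -> &'static str;
    fn probe(&self) -> Result<(), String>;
}

/// Stand-in backend for the sketch; availability is just a flag here.
struct StubBackend {
    name: &'static str,
    available: bool,
}

impl Backend for StubBackend {
    fn name(&self) -> &'static str {
        self.name
    }

    fn probe(&self) -> Result<(), String> {
        if self.available {
            Ok(())
        } else {
            Err(format!("{} backend unavailable", self.name))
        }
    }
}

/// Helper that boxes a stub so the chain can hold mixed backend types.
fn stub(name: &'static str, available: bool) -> Box<dyn Backend> {
    Box::new(StubBackend { name, available })
}

/// Walk the chain in order and return the first backend whose probe succeeds.
fn select_backend(chain: Vec<Box<dyn Backend>>) -> Option<Box<dyn Backend>> {
    chain.into_iter().find(|backend| backend.probe().is_ok())
}
```

Keeping the invasive option out of the default list, and adding it only on explicit request, is what makes the chain safe to run next to a live workload.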

Streaming SMBUS at 1.5s

The HybridBackend spawns a persistent tt-smi subprocess that loops every 1.5 seconds, writing RS-delimited JSON records to a pipe. A reader thread parses each record and deposits it via an Arc-swap pointer — zero mutex contention on the render path, no 5-second polling surge.
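The record framing on the reader side might look roughly like this. The sketch uses only the standard library and stops at splitting the stream on the ASCII record separator (0x1E). JSON parsing and the lock-free handoff to the render thread are left out, and the function name is an assumption.

```rust
use std::io::{BufRead, BufReader, Read};

/// ASCII record separator used to delimit JSON records in the stream.
const RS: u8 = 0x1E;

/// Split a byte stream into RS-delimited text records.
/// Empty chunks (for example, before a leading RS) are skipped, so this
/// works whether the separator precedes or follows each record.
fn read_records<R: Read>(source: R) -> Vec<String> {
    BufReader::new(source)
        .split(RS)
        // Stop at the first I/O error instead of retrying forever.
        .map_while(Result::ok)
        .map(|bytes| String::from_utf8_lossy(&bytes).trim().to_string())
        .filter(|record| !record.is_empty())
        .collect()
}
```

In a long-running reader thread the same `split(RS)` iterator would be consumed one record at a time rather than collected, since the subprocess pipe never reaches end of file while the monitor is running.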

EMA Smoothing

Numeric SMBUS fields (temperature, power, clock frequencies) are blended through an exponential moving average (α=0.25) as each new snapshot arrives. A hard jump in a value is spread over the following snapshots rather than applied at once (about 68% of the step is absorbed after four updates), eliminating the visual discontinuity that makes monitoring tools feel choppy.
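A minimal sketch of that smoothing step, using the standard exponential moving average update. The `Ema` type and its method names are hypothetical. Only the α=0.25 value comes from the text.

```rust
/// Exponential moving average over a stream of samples.
struct Ema {
    alpha: f32,
    value: Option<f32>,
}

impl Ema {
    fn new(alpha: f32) -> Self {
        Self { alpha, value: None }
    }

    /// Blend in one new sample and return the smoothed value.
    /// The first sample seeds the average directly.
    fn update(&mut self, sample: f32) -> f32 {
        let next = match self.value {
            None => sample,
            Some(prev) => prev + self.alpha * (sample - prev),
        };
        self.value = Some(next);
        next
    }
}
```

For a reading that jumps from 40 to 80 and stays there, successive updates with α=0.25 give 50, 57.5, 63.125, and 67.34375: the displayed value glides toward the new level instead of snapping to it.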

🖥

tt-toplike-app

A native egui window that hosts the full TUI inside a PTY — no terminal emulator needed, no font configuration, no rendering artifacts. Launches tt-toplike-tui as a child process, implements a complete VT100/VT220 screen renderer including deferred auto-wrap, and draws each cell via egui with full color fidelity.

🔀

Live Backend Switching

Press b in the TUI to cycle backends without restarting. The visualization state is preserved; only the data source changes. Useful for comparing Sysfs sensor readings against full SMBUS telemetry side-by-side during the same session.

📦

Debian Package

Ships as two Debian packages: tt-toplike (the TUI, ~750 KB stripped) and tt-toplike-app (the native window, ~4 MB with egui). All crates are vendored for offline reproducible builds. Recommends tt-smi and tenstorrent-dkms from the Tenstorrent PPA.

Get started

Install

Grab the Debian package for the fastest path. Or build from source with Cargo.

1

Download the packages that match your Ubuntu version. Two variants ship per release: _noble.deb for Ubuntu 24.04+, _jammy.deb for Ubuntu 22.04+.

SUITE=$(. /etc/os-release && echo "$UBUNTU_CODENAME")
[ "$SUITE" = "noble" ] || SUITE="jammy"
gh release download --repo tenstorrent/tt-toplike --pattern "*_amd64_${SUITE}.deb"

2

Install both packages:

sudo dpkg -i tt-toplike_*_amd64_${SUITE}.deb tt-toplike-app_*_amd64_${SUITE}.deb

3

Run with mock hardware to verify — or on real Tenstorrent silicon:

tt-toplike --mock --mock-devices 4   # no hardware needed
tt-toplike                           # auto-detects real hardware
tt-toplike --mode arcade             # start in arcade mode
tt-toplike-app                       # native window (PTY-hosted TUI)

4

Or build from source (requires Rust 1.75+):

git clone https://github.com/tenstorrent/tt-toplike
cd tt-toplike
cargo build --release --bin tt-toplike-tui --features tui
./target/release/tt-toplike-tui --mock --mock-devices 4

Keyboard shortcuts:
v        cycle viz mode
b        switch backend
q / ESC  quit
r        force refresh