What Comes Next

You unboxed a machine that most people have never touched. You confirmed four Blackhole chips were alive and talking to the system. You navigated Python environments that would trip up someone who wasn’t paying attention. You ran a model on accelerator hardware and watched tokens come back. That’s not a tutorial warmup — that’s the actual thing.

The rest is up to you.

Inference stack diagram showing the path from user interfaces through tt-inference-server and vLLM down to four Blackhole chips

Tools in Your World

The QB2 ships with a full stack, but the ecosystem is bigger. Start with tt-toplike, which is like htop for your chips, except the telemetry comes alive as ASCII art:

tt-toplike insights mode — all four Blackhole chips under live inference, power and DRAM state rendered in real time

Where to Go From Here

Pick a thing you want to do and jump straight in.

Choose Your Next Track

Run & build →
Serve real models. Understand performance. Integrate with your existing ML workflow. If you're coming from CUDA, this is where the familiar parts live and where the new parts pay off.
Tinker →
Write code that runs on the chips directly — kernels, data movement, compute pipelines. The architecture goes all the way down and you can follow it.
Customize →
Customize, illuminate, break, and fix things. The LEDs, the desktop, the demos that make people stop and ask what that machine is.

The QB2 is a beginning. There’s a lot of surface area here, and you’ve only scratched it.

