Version Compatibility Matrix
Last updated: January 2026
This guide documents validated combinations of hardware, software versions, and configurations for the Tenstorrent ecosystem. Use this to troubleshoot compatibility issues or plan your development environment.
🎯 Quick Recommendations by Use Case
Just Starting Out (Lessons 1-5)
- Hardware: N150 (Wormhole single-chip)
- tt-metal: Latest from main branch
- Python: 3.10 (system default on Ubuntu 22.04)
- Model: Qwen3-0.6B (1.5GB, no HuggingFace token needed)
Production Inference (vLLM)
- Hardware: N150/N300/T3K/P100/P150
- Deployment: tt-inference-server Docker image (recommended)
- Alternative: Native installation requires careful version matching
Multi-Chip Development (TT-XLA)
- Hardware: N150/N300/T3K/Galaxy
- Python: 3.11
- Installation: Wheel-based (no source build required)
Experimental Compiler (TT-Forge)
- Hardware: N150 only (single-chip)
- Python: 3.11
- Build time: 45-60 minutes
- Requirements: clang-17
🖥️ Hardware Configurations
Wormhole Architecture
N150 (Single Chip)
- DRAM: 12GB
- Tensix Cores: 80
- Best for: Development, small models (<2B parameters)
- Recommended models:
- Qwen3-0.6B (0.6B params, 1.5GB) ✅ Primary recommendation
- Gemma 3-1B-IT (1B params, 2GB)
- Llama-3.1-8B-Instruct (8B params, 16GB) ⚠️ Tight fit, may exhaust DRAM
- Multi-chip support: No
N300 (Dual Chip)
- DRAM: 24GB (2x 12GB)
- Tensix Cores: 160
- Best for: Medium models (8B parameters)
- Recommended models:
- Llama-3.1-8B-Instruct (8B params, 16GB) ✅ Comfortable
- Qwen3-8B (8B params)
- Multi-chip support: Yes (2 chips)
T3K (8 Chips)
- DRAM: 96GB (8x 12GB)
- Tensix Cores: 640
- Best for: Large models (70B+ parameters)
- Recommended models:
- Llama-3.1-70B
- Large-scale inference workloads
- Multi-chip support: Yes (8 chips)
QuietBox (Wormhole-based)
- Architecture: Wormhole (not Blackhole)
- Configuration: Multi-chip Wormhole system
- Production validation: ✅ Validated for vLLM (Batch 32: 22.1 T/S/U, 707.2 T/S)
- Best for: Production inference deployments
- Reference: Tenstorrent QuietBox
- vLLM compatibility: Fully validated (see README.md performance benchmarks)
Blackhole Architecture
P100 (Single Chip)
- DRAM: ~32GB
- Tensix Cores: 140 (14x10 grid, 13x10 available for compute)
- Best for: Next-generation single-chip performance
- Recommended models:
- Qwen3-0.6B (0.6B params, 1.5GB) ✅ Works great
- Llama-3.1-8B-Instruct (8B params, 16GB) ✅ Comfortable fit
- Enhanced features:
- L1 data cache: 1464 KB with 4x16B cachelines (write-through)
- Enhanced NoC: 64B reads (vs 32B on Wormhole)
- Ethernet: 14 cores with 512KB L1, 2x RISC-V per core
- DRAM: 8 banks with programmable RISC-V, 128KB L1 per bank
- Requirements:
export TT_METAL_ARCH_NAME=blackhole
- Multi-chip support: Single chip only
P150 (Configurable: 1, 2, 4, or 8 chips)
- DRAM per chip: ~32GB
- Total DRAM:
- P150 x1: ~32GB (single chip)
- P150 x2: ~64GB (2 chips)
- P150 x4: ~128GB (4 chips)
- P150 x8: ~256GB (8 chips)
- Tensix Cores per chip: 140 (14x10 grid, 13x10 available for compute)
- Best for: Scalable multi-chip deployments (70B+ models)
- Recommended models:
- P150 x1: Llama-3.1-8B ✅
- P150 x2: Llama-3.1-8B (fast), medium models ✅
- P150 x4: Llama-3.1-70B ✅
- P150 x8: 70B+ models, large-scale inference ✅
- Enhanced features: Same as P100
- Requirements:
export TT_METAL_ARCH_NAME=blackhole
- Multi-chip support: Yes (2, 4, or 8 chips via mesh topology)
Galaxy (Multi-Node)
- Configuration: Multiple T3K nodes
- Best for: Massive-scale distributed training/inference
- Multi-chip support: Yes (distributed)
📦 Software Stack Versions
Core Components (Lessons 1-10)
| Component | Version | Python | Installation Method | Notes |
|---|---|---|---|---|
| tt-metal | Latest (main branch) | 3.10 | Source build | Core low-level API |
| TTNN | Bundled with tt-metal | 3.10 | Included | High-level neural network ops |
| OpenMPI ULFM | 5.0.7 | N/A | System package | Required for all hardware |
| PyTorch | 2.x | 3.10 | pip (in venv) | ML framework |
| Transformers | Latest | 3.10 | pip (in venv) | HuggingFace models |
Environment variables required:
export TT_METAL_HOME=~/tt-metal
export PYTHONPATH=$TT_METAL_HOME:$PYTHONPATH
export LD_LIBRARY_PATH=/opt/openmpi-v5.0.7-ulfm/lib:$LD_LIBRARY_PATH
export MESH_DEVICE=N150 # or N300, T3K, P100, P150, GALAXY
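Quick sanity check (a minimal sketch, assuming the tt-metal Python environment from Lessons 1-5 is active and the variables above are set):
# Open and immediately close device 0; any import or symbol error surfaces here
python -c "import ttnn; dev = ttnn.open_device(device_id=0); ttnn.close_device(dev)"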
vLLM Production Inference (Lesson 7)
| Deployment Method | Hardware | Status | Notes |
|---|---|---|---|
| tt-inference-server (Docker) | N150/N300/T3K/P100/P150 | ✅ Recommended | Pre-validated configurations |
| Native installation | N150/N300/T3K | ⚠️ Advanced | Version compatibility challenges |
Docker method (validated):
# Uses pre-built image with matched versions
# See Lesson 6 for tt-inference-server
# See Lesson 7 for manual vLLM setup
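An illustrative run command is sketched below; the image tag and mounts are assumptions, not the validated configuration (Lesson 6 has the exact command):
# Sketch only - image tag and hugepages mount are assumptions
docker run --rm -it \
  --device /dev/tenstorrent \
  -v /dev/hugepages-1G:/dev/hugepages-1G \
  ghcr.io/tenstorrent/tt-inference-server:latest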
Native installation compatibility matrix:
| Hardware | tt-metal | vLLM | Status | Notes |
|---|---|---|---|---|
| N150 | Latest (main) | Docker image | ✅ Validated | Use Docker |
| N150 | Specific commits | Native build | ⚠️ Complex | Requires model_specs_output.json matching |
| N300+ | Latest (main) | Docker image | ✅ Validated | Use Docker |
Known issues with native installation:
- PyTorch type hint incompatibilities
- vLLM version mismatches with tt-metal changes
- Complex dependency chains
- Recommendation: Use Docker unless you have specific requirements for native installation
Environment variables (vLLM):
export VLLM_TARGET_DEVICE=tt
export VLLM_CONFIGURE_LOGGING=1
export VLLM_RPC_TIMEOUT=900000
# For Blackhole (P100/P150):
export TT_METAL_ARCH_NAME=blackhole
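Once the server is running, it speaks the standard OpenAI-compatible HTTP API. A sketch, assuming vLLM's default port 8000 and an illustrative model name:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Hello, Tenstorrent!", "max_tokens": 32}'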
TT-XLA JAX Compiler (Lesson 12)
| Component | Version | Python | Installation Method | Hardware Support |
|---|---|---|---|---|
| TT-XLA | Latest wheel | 3.11 | pip (wheel) | N150/N300/T3K/Galaxy |
| JAX | 0.7.1+ | 3.11 | pip | Required dependency |
| tt-forge | Cloned for demos | 3.11 | git clone | Demo code only |
Status: ✅ Production-ready for multi-chip
Installation:
# Python 3.11 required
python3.11 -m venv ~/tt-xla-venv
source ~/tt-xla-venv/bin/activate
pip install pjrt-plugin-tt --pre --upgrade --extra-index-url https://pypi.eng.aws.tenstorrent.com/
Environment isolation (CRITICAL):
# MUST unset tt-metal variables
unset TT_METAL_HOME
unset LD_LIBRARY_PATH
export PYTHONPATH=~/tt-forge:$PYTHONPATH # For demo imports only
Use the helper script:
source ~/tt-scratchpad/setup-tt-xla.sh
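Then verify the PJRT plugin registered (sketch; exact device naming may vary):
python -c "import jax; print(jax.devices())"  # should list Tenstorrent device(s), not CPU only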
TT-Forge MLIR Compiler (Lesson 11)
| Component | Version | Python | Installation Method | Hardware Support |
|---|---|---|---|---|
| TT-Forge | Source build | 3.11 | cmake (45-60 min) | N150 only |
| clang | 17 | N/A | apt | Required compiler |
| LLVM | Built from submodule | N/A | cmake | 6719 targets (~40 min) |
| JAX | 0.7.1 | 3.11 | pip | Build dependency |
Status: ⚠️ Experimental (as of December 2025)
Build requirements:
# Install prerequisites
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install -y python3.11 python3.11-venv python3.11-dev clang-17
# Create compiler symlinks (CRITICAL)
sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-17 100
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-17 100
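Verify the symlinks before building (output formatting may vary slightly):
clang --version    # should report clang version 17.x
clang++ --version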
Build time: 45-60 minutes (LLVM compilation is slow)
Environment setup (CRITICAL):
# MUST unset tt-metal variables
unset TT_METAL_HOME
unset TT_METAL_VERSION
# MUST use absolute paths (CMake doesn't expand ~)
export TTFORGE_TOOLCHAIN_DIR=/home/$USER/ttforge-toolchain
export TTMLIR_TOOLCHAIN_DIR=/home/$USER/ttmlir-toolchain
export TTFORGE_PYTHON_VERSION=python3.11
export CC=/usr/bin/clang-17
export CXX=/usr/bin/clang++-17
Use the helper script:
source ~/tt-scratchpad/setup-tt-forge.sh
Model support: 169 validated models in tt-forge-models repository
- MobileNetV1/V2/V3 (✅ Recommended starting point)
- ResNet variants (✅ Validated)
- Some BERT models (⚠️ Check repository)
Stable Diffusion 3.5 (Lesson 9)
| Component | Hardware | Status | Notes |
|---|---|---|---|
| SD 3.5 Large | N150/N300/T3K/P100 | ✅ Validated | 1024x1024 generation |
| Generation time | N150 | ~2-3 minutes (first run) | Includes model load |
| Environment | Standard tt-metal | ✅ Works | No special setup needed |
No special version requirements - uses standard tt-metal environment.
🔧 Environment Variable Reference
Always Required (Lessons 1-10)
# Point to tt-metal installation
export TT_METAL_HOME=~/tt-metal
# Add tt-metal to Python import path
export PYTHONPATH=$TT_METAL_HOME:$PYTHONPATH
# Add OpenMPI libraries (CRITICAL - #1 most common error)
export LD_LIBRARY_PATH=/opt/openmpi-v5.0.7-ulfm/lib:$LD_LIBRARY_PATH
# Specify hardware type
export MESH_DEVICE=N150 # or N300, T3K, P100, P150, GALAXY
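Not sure which value matches your board? tt-smi (Tenstorrent's system management CLI, installed separately) lists detected devices:
tt-smi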
Hardware-Specific
For Blackhole chips (P100/P150):
export TT_METAL_ARCH_NAME=blackhole
Application-Specific
vLLM:
export VLLM_TARGET_DEVICE=tt
export VLLM_CONFIGURE_LOGGING=1
export VLLM_RPC_TIMEOUT=900000
Stable Diffusion (non-interactive):
export NO_PROMPT=1
TT-XLA (isolation required):
unset TT_METAL_HOME
unset LD_LIBRARY_PATH
export PYTHONPATH=~/tt-forge:$PYTHONPATH
TT-Forge (isolation required):
unset TT_METAL_HOME
unset TT_METAL_VERSION
export TTFORGE_TOOLCHAIN_DIR=/home/$USER/ttforge-toolchain
export TTMLIR_TOOLCHAIN_DIR=/home/$USER/ttmlir-toolchain
export TTFORGE_PYTHON_VERSION=python3.11
export CC=/usr/bin/clang-17
export CXX=/usr/bin/clang++-17
🐛 Common Compatibility Issues
Issue 1: "undefined symbol: MPIX_Comm_revoke"
Error:
ImportError: /home/user/tt-metal/build/tt_metal/libtt_metal.so: undefined symbol: MPIX_Comm_revoke
Cause: OpenMPI library path not set
Fix:
export LD_LIBRARY_PATH=/opt/openmpi-v5.0.7-ulfm/lib:$LD_LIBRARY_PATH
Prevalence: #1 most common error in cloud environments
Make permanent:
echo 'export LD_LIBRARY_PATH=/opt/openmpi-v5.0.7-ulfm/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
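To confirm the fix took effect, check that libtt_metal.so now resolves against the ULFM OpenMPI (sketch; your build path may differ):
ldd ~/tt-metal/build/tt_metal/libtt_metal.so | grep -i mpi  # should point at /opt/openmpi-v5.0.7-ulfm/lib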
Issue 2: vLLM Version Mismatch
Error:
TypeError: block_size has unsupported type list[int]
Cause: PyTorch/vLLM type hint incompatibility
Fix: Use Docker image (validated configuration)
# See Lesson 6 for tt-inference-server Docker setup
# See Lesson 7 for manual Docker approach
Alternative: Match specific tt-metal and vLLM commits via model_specs_output.json (advanced)
Issue 3: TT-Forge Import Failure
Error:
ImportError: /path/to/libTTMLIRRuntime.so: undefined symbol: _ZN4ttnn...
Cause: Environment variable pollution (TT_METAL_HOME conflicts)
Fix:
source ~/tt-scratchpad/setup-tt-forge.sh
The script automatically unsets conflicting variables.
Issue 4: CMake Build Errors (TT-Forge)
Error:
CMake Error: CMAKE_C_COMPILER: clang not found
Cause: Compiler symlinks not created
Fix:
sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-17 100
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-17 100
Issue 5: JAX 0.7.1 Not Found
Error:
ERROR: Could not find a version that satisfies the requirement jax==0.7.1
ERROR: Ignored the following versions that require a different python version: ... Requires-Python >=3.11
Cause: Python version mismatch (needs 3.11)
Fix:
# Install Python 3.11
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install -y python3.11 python3.11-venv python3.11-dev
# Create venv with 3.11
python3.11 -m venv ~/tt-xla-venv
source ~/tt-xla-venv/bin/activate
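With the 3.11 venv active, the pinned JAX version resolves (sketch):
pip install jax==0.7.1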
📊 Model Size vs Hardware Matrix
Recommended Model-Hardware Combinations
| Model | Parameters | Disk Size | N150 (12GB) | N300 (24GB) | T3K (96GB) |
|---|---|---|---|---|---|
| Qwen3-0.6B | 0.6B | 1.5GB | ✅ Perfect | ✅ Excellent | ✅ Excellent |
| Gemma 3-1B-IT | 1B | 2GB | ✅ Good | ✅ Excellent | ✅ Excellent |
| Llama-3.1-8B | 8B | 16GB | ⚠️ Tight | ✅ Good | ✅ Excellent |
| Qwen3-8B | 8B | 16GB | ⚠️ Tight | ✅ Good | ✅ Excellent |
| Llama-3.1-70B | 70B | 140GB | ❌ Too large | ❌ Too large | ✅ Good |
Legend:
- ✅ Perfect: Fast, reliable, recommended
- ✅ Good: Works well, stable
- ⚠️ Tight: May work but can exhaust DRAM under load
- ❌ Too large: Model won't fit
N150 recommendation: Start with Qwen3-0.6B (0.6B parameters, 1.5GB)
- Ultra-lightweight (13x smaller than Llama-3.1-8B)
- No HuggingFace token needed (ungated)
- Dual thinking modes (reasoning-capable)
- Perfect for learning and many production use cases
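To pre-download the model weights (sketch; Qwen3-0.6B is ungated, so no token is required):
pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen3-0.6B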
🎓 Learning Path Recommendations
Path 1: Beginner (First Time with Tenstorrent)
- Hardware: N150
- Start with: Lessons 1-5 (Direct tt-metal API)
- Model: Qwen3-0.6B
- Time to first inference: ~30 minutes
- Environment: Standard tt-metal (Python 3.10)
Path 2: Production Deployment
- Hardware: N150/N300/T3K depending on model size
- Start with: Lessons 1-5 (understand the stack)
- Then: Lesson 6 (tt-inference-server Docker)
- Model: Match to hardware capacity
- Environment: Docker (validated configurations)
Path 3: Model Developer
- Hardware: N150 (development), scale up for testing
- Start with: Lessons 1-5 (foundation)
- Then: Lesson 13 (Bounty Program contribution workflow)
- Model: Bring your own architecture
- Environment: Standard tt-metal + git workflow
Path 4: Compiler Explorer
- Hardware: N150 (single-chip)
- Start with: Lessons 1-5 (baseline understanding)
- Then: Lesson 12 (TT-XLA, production-ready)
- Optional: Lesson 11 (TT-Forge, experimental)
- Environment: Isolated (separate Python 3.11 venvs)
🔍 Validation Status by Lesson
| Lesson | Hardware Tested | Status | Notes |
|---|---|---|---|
| 1-5 | N150 | ✅ Validated | Zero issues after install_dependencies.sh |
| 6 | N150 | ✅ Validated | tt-inference-server Docker |
| 7 | N150 | ⚠️ Docker recommended | Native install has version challenges |
| 8 | N150 | ✅ Validated | VSCode chat integration |
| 9 | N150 | ✅ Validated | Stable Diffusion 3.5, ~2.5 min generation |
| 10 | N150 | ✅ Validated | Coding assistant |
| 11 | N150 | ⚠️ Experimental | TT-Forge 45-60 min build, limited model support |
| 12 | N150 | ✅ Validated | TT-XLA wheel install, GPT-2 XL working |
| 13-14 | N150 | 📋 Documentation | Bounty program, RISC-V programming |
| 15 | N150 | ✅ Validated | TT-Metalium cookbook projects |
📚 Additional Resources
Official Documentation:
- tt-metal: https://github.com/tenstorrent/tt-metal
- tt-inference-server: https://github.com/tenstorrent/tt-inference-server
- TT-XLA: https://github.com/tenstorrent/tt-xla
- TT-Forge: https://github.com/tenstorrent/tt-forge-fe
Community:
- Discord: https://discord.gg/tenstorrent
- Model contributions: Lesson 13 (Bounty Program)
Troubleshooting:
- FAQ page (in this extension)
- Step Zero guide (comprehensive tech stack explanation)
- Lesson-specific debugging sections
Remember: When in doubt, start with the recommended path for your hardware. The most reliable configurations are thoroughly documented in Lessons 1-5, which work on all hardware.