TT-Metalium
Programming Examples
DRAM Loopback
Device initialization
Program setup
Create buffers in DRAM and L1 (SRAM)
Sending real data into DRAM
Creating a data movement kernel
Setting runtime arguments for the data movement kernel
Running the program
Download the result and verify output
Validation and teardown
Eltwise binary
Program setup
Circular buffers
Data movement and compute kernels
Download the result and verify output
Validation and teardown
Eltwise SFPU
Program setup
The kernels
Set up runtime arguments
Program execution and final check
Conclusion
Matmul (Single Core)
Device Initialization & Program Setup
Data Preparation and Golden Reference
DRAM Buffer Allocation
Circular Buffer Orchestration for Pipelined MatMul
Matmul Kernel Pipeline Breakdown
The reader kernel
The compute kernel
The writer kernel
Kernel execution and result verification
Conclusion
Matmul (Multi Core)
Accessing all the cores
Splitting the work across cores
Using different kernels for reader/writer
Compute kernel args
Reader/writer kernel runtime args
Conclusion
Matmul (Multi Core Optimized)
Data Reuse in matmul_multicore_reuse
Fine-Grained Block Size Control
Intermediate Circular Buffer Configuration
Stride Kernel Arguments
Intermediate Results Handling
Conclusion
Data Multicasting in matmul_multicore_reuse_mcast
Additional Compile-Time Argument
Configuring Core Ranges for Tile Distribution
Circular Buffer Creation for CoreGrid
Multicast Reader/Writer Kernel Setup
New Compute Kernel: Fused Bias Addition and Activation Functions
Semaphores
Kernel Runtime Arguments