TT-Metalium
Programming Examples
DRAM Loopback
Silicon accelerator setup
Program pre-compilation setup
Building a data movement kernel
Create buffers in DRAM and L1
Sending real data into DRAM
Setting runtime arguments for the data movement kernel
Running the program
Launch and verify output
Validation and teardown
Eltwise SFPU
Circular buffers for data movement to/from compute engine
Compile-time compute kernel arguments
Compute kernel declaration and compile-time defines
Extra runtime arguments for reader/writer
Conclusion
Eltwise binary
New buffers
Compute kernel declaration and compile-time defines
Extra source tensor
Conclusion
Matmul (Single Core)
Host Code
Main blocks in matmul_single_core function
Create Program, Enqueue initialization, and core range definition
Create DRAM buffers & Circular buffers
Compile-time kernel arguments
Compute kernel declaration and compile-time defines
Runtime arguments and program launch
Conclusion
Matmul (Multi Core)
Accessing all the cores
Splitting the work across cores
Using different kernels for reader/writer
Compute kernel args
Reader/writer kernel runtime args
Conclusion
Matmul (Multi Core Optimized)
Data Reuse in matmul_multicore_reuse
Fine-Grained Block Size Control
Intermediate Circular Buffer Configuration
Stride Kernel Arguments
Intermediate Results Handling
Conclusion
Data Multicasting in matmul_multicore_reuse_mcast
Additional Compile-Time Argument
Configuring Core Ranges for Tile Distribution
Circular Buffer Creation for CoreGrid
Multicast Reader/Writer Kernel Setup
New Compute Kernel: Fused Bias Addition and Activation Functions
Semaphores
Kernel Runtime Arguments