Multi-Head Attention

  • Enable program cache
  • Write Multi-Head Attention using ttnn
  • Configuration
  • Initialize activations and weights using torch
  • Convert activations and weights to ttnn
  • Run the first iteration of Multi-Head Attention
  • Run a subsequent iteration of Multi-Head Attention
  • Write optimized version of Multi-Head Attention
  • Pre-process the parameters of the optimized model
  • Run the first iteration of the optimized Multi-Head Attention
  • Run a subsequent iteration of the optimized Multi-Head Attention
  • Check that the output of the optimized version matches the output of the original implementation
  • Close the device

© Copyright Tenstorrent.