Performance

Prerequisites

Ensure that you have the base TT-Metalium source and environment configured, and that all model requirements are installed. Follow these instructions.

Running a perf file

Each model that is ready for profiling comes with a perf file, typically found under models/YOUR_MODEL/perf_YOUR_MODEL.py. To profile a model, run:

pytest tests/python_api_testing/models/YOUR_MODEL/tests/perf_YOUR_MODEL.py

Perf files write their results to a CSV file named perf_YOUR_MODEL_date.csv. This file contains a table with two rows: a header row and a results row.
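Because each perf CSV has exactly one header row and one results row, it can be read with the standard csv module. A minimal sketch, using the column names from the example table below (the sample values here are illustrative, not a real run):

```python
import csv
import io

# Hypothetical two-row perf CSV, mirroring the layout described above:
# one header row followed by one results row.
sample = (
    "Model,Setting,Batch,First Run (sec),Second Run (sec)\n"
    "vit,base-patch16,1,30.51,16.05\n"
)

reader = csv.DictReader(io.StringIO(sample))
row = next(reader)  # the single results row, keyed by header names

first_run = float(row["First Run (sec)"])
second_run = float(row["Second Run (sec)"])
print(row["Model"], first_run - second_run)
```

In practice you would pass `open("perf_YOUR_MODEL_date.csv", newline="")` to `csv.DictReader` instead of the inline sample.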

| Model | Setting | Batch | First Run (sec) | Second Run (sec) | Compile Time (sec) | Inference Time GS (sec) | Throughput GS (batch*inf/sec) | Inference Time CPU (sec) | Throughput CPU (batch*inf/sec) |
|---|---|---|---|---|---|---|---|---|---|
| vit | base-patch16 | 1 | 30.51 | 16.05 | 14.46 | 16.05 | 0.0623 | 0.29 | 3.4960 |

  • First Run: Compilation time plus inference time, with no caching enabled.

  • Second Run and Inference Time GS: Inference time of the model above on Grayskull. It is called Second Run because the first run caches the compiled program, so the second run does not pay the compilation cost.

  • Compile Time: Compilation time, as the name suggests, calculated by subtracting Second Run from First Run.

  • Throughput GS: Throughput of the model on Grayskull, computed as batch/inf, where inf is the inference time on Grayskull.

  • Inference Time CPU: Inference time of the model above on CPU.

  • Throughput CPU: Throughput of the model's PyTorch implementation on CPU, computed as batch/inf, where inf is the inference time on CPU.
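The derived columns above can be checked against the example vit row with a few lines of arithmetic:

```python
# Reproducing the derived columns from the vit row in the table above.
batch = 1
first_run = 30.51   # sec: compile + inference, no caching
second_run = 16.05  # sec: inference only, program cache already populated

compile_time = first_run - second_run  # First Run minus Second Run
throughput_gs = batch / second_run     # inferences per second on Grayskull

print(f"{compile_time:.2f} {throughput_gs:.4f}")  # prints "14.46 0.0623"
```

The same formulas apply to the CPU columns, substituting the CPU inference time.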

Running all the perf files

We also maintain tests/scripts/run_performance.sh as an easy way to profile all the models. This script aims to use the fastest command for each perf file. Execute it with

./tests/scripts/run_performance.sh

This script runs all the perf files and merges the output CSV files into one perf.csv file.
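The merge step can be sketched in a few lines of Python: concatenate the per-model CSVs while keeping a single header row. This is only an illustration of the idea, not the actual logic of run_performance.sh:

```python
import csv

def merge_perf_csvs(input_paths, output_path):
    """Concatenate two-row perf CSVs into one file with a single header row.

    A sketch of the merge that run_performance.sh performs; the function
    name and exact behavior here are assumptions for illustration.
    """
    header, rows = None, []
    for path in input_paths:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if header is None:
                header = file_header  # keep the first header row only
            rows.extend(reader)       # collect all results rows
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        if header:
            writer.writerow(header)
        writer.writerows(rows)
```

Called as `merge_perf_csvs(["perf_vit_date.csv", "perf_bert_date.csv"], "perf.csv")`, this would produce one perf.csv with a shared header and one results row per model.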