tt_device_dram_used_megabytes
Name
Prometheus Metric Name
tt_device_dram_used_megabytes
Metric Path (tt-telemetry)
Schema:
{hostname}/tray{tray}/chip{chip}/dram/tt_device_dram_used_megabytes
Example path:
bh-glx-c09u02/tray1/chip2/dram/tt_device_dram_used_megabytes
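The schema above can be split back into its components with a small parser. This is a minimal sketch: the names `MetricPath` and `parse_metric_path` are hypothetical and not part of tt-telemetry, and the pattern assumes that tray and chip are decimal integers and that the hostname contains no `/`.

```python
import re
from typing import NamedTuple


class MetricPath(NamedTuple):
    """Components of a tt-telemetry DRAM metric path (hypothetical helper type)."""
    hostname: str
    tray: int
    chip: int
    metric: str


# Mirrors the schema {hostname}/tray{tray}/chip{chip}/dram/{metric}.
# Assumption: tray and chip are decimal integers.
_PATH_RE = re.compile(
    r"^(?P<hostname>[^/]+)/tray(?P<tray>\d+)/chip(?P<chip>\d+)/dram/(?P<metric>[^/]+)$"
)


def parse_metric_path(path: str) -> MetricPath:
    """Split a metric path into hostname, tray, chip and metric name."""
    m = _PATH_RE.match(path)
    if m is None:
        raise ValueError(f"not a DRAM metric path: {path!r}")
    return MetricPath(m["hostname"], int(m["tray"]), int(m["chip"]), m["metric"])


if __name__ == "__main__":
    print(parse_metric_path(
        "bh-glx-c09u02/tray1/chip2/dram/tt_device_dram_used_megabytes"
    ))
```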
Description
The amount of DRAM currently allocated on the chip by tt-metal, summed across every process attached to the chip, in mebibytes (1 MiB = 1024×1024 B), rounded to the nearest integer.
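The conversion from the raw byte count to the reported value can be sketched as follows. The function name `bytes_to_mib_rounded` is hypothetical, and rounding exact halves upward is an assumption: the description only says "rounded to the nearest integer" and does not specify tie-breaking.

```python
MIB = 1024 * 1024  # 1 MiB = 1024 x 1024 bytes


def bytes_to_mib_rounded(total_bytes: int) -> int:
    """Convert a byte count to whole mebibytes, rounded to the nearest integer.

    Assumption: exact halves round up; the source does not specify tie-breaking.
    """
    if total_bytes < 0:
        raise ValueError("byte count must be non-negative")
    # Integer arithmetic avoids floating-point error on large counts.
    return (total_bytes + MIB // 2) // MIB


if __name__ == "__main__":
    # 1.5e9 bytes is 1500 decimal megabytes but about 1430.5 MiB.
    print(bytes_to_mib_rounded(1_500_000_000))
```

Because the value is in mebibytes, it is slightly smaller than the same byte count expressed in decimal megabytes (10^6 B).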
The underlying byte count is read from the total_dram_allocated field of tt-metal’s per-device shared-memory allocator-stats region (/dev/shm/tt_device_<asic_id>_memory). tt-telemetry maps the region read-only and never writes to it.
If no tt-metal process has ever touched the chip on this host, the shared-memory (SHM) file is absent and the metric reports 0; the reader retries on every cycle. If the SHM region exists but its layout version does not match the version tt-telemetry was built against, the metric reports 0 and a warning is logged once. The layout contract is the version field of the SHM struct (currently 3).
For multi-chip mesh devices (e.g. N300, Galaxy) the shared-memory region currently aggregates allocations across the gateway chip and any remote chips reached through it, all reported under the gateway’s tray/chip labels. Per-chip breakdown for mesh devices is a planned follow-up.
Values
Type: Unsigned Integer
Units: Labeled as megabytes (MB), but the value is in mebibytes (1 MiB = 1024×1024 B), rounded to the nearest integer.
Allowable values:
A non-negative integer. 0 means either no allocations are live or the SHM region is unavailable for this chip.
Prometheus Labels
| Label Name | Value |
|---|---|
| hostname | The host from which the metric was collected. |
| hall | The datacenter hall where the host is located. Sourced from the Factory System Descriptor (FSD). |
| aisle | The datacenter aisle where the host is located. Sourced from the Factory System Descriptor (FSD). |
| rack | The rack number where the host is located. Sourced from the Factory System Descriptor (FSD). |
| shelf_u | The shelf U position in the rack where the host is located. Sourced from the Factory System Descriptor (FSD). |
| tray | The tray (UBB) that the device is located on. |
| chip | The ASIC location within the tray. |
| unit | The unit of measurement. Always |