tt_eth_firmware_signature
Name
Prometheus Metric Name
tt_eth_firmware_signature
Metric Path (tt-telemetry)
Schema:
{hostname}/tray{tray}/chip{chip}/channel{channel}/tt_eth_firmware_signature
Example path:
bh-glx-c09u02/tray1/chip2/channel0/tt_eth_firmware_signature
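As an illustration of the schema above, the following minimal sketch splits a metric path into its components. The helper name `parse_metric_path` and the regular expression are hypothetical and are not part of tt-telemetry; the sketch assumes the hostname contains no `/` and that the tray, chip, and channel indices are decimal integers.

```python
import re

# Pattern mirroring {hostname}/tray{tray}/chip{chip}/channel{channel}/<metric>.
# Assumption: hostname has no "/" and the indices are decimal integers.
PATH_RE = re.compile(
    r"^(?P<hostname>[^/]+)/tray(?P<tray>\d+)/chip(?P<chip>\d+)"
    r"/channel(?P<channel>\d+)/(?P<metric>[^/]+)$"
)


def parse_metric_path(path: str) -> dict:
    """Split a per-channel metric path into named parts (hypothetical helper)."""
    match = PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a per-channel metric path: {path!r}")
    parts = match.groupdict()
    # Convert the numeric components from strings to integers.
    for key in ("tray", "chip", "channel"):
        parts[key] = int(parts[key])
    return parts


example = parse_metric_path(
    "bh-glx-c09u02/tray1/chip2/channel0/tt_eth_firmware_signature"
)
print(example)
```

For the example path, the parsed components are hostname `bh-glx-c09u02`, tray 1, chip 2, and channel 0.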
Description
The 16-bit ERISC firmware signature for a specific Ethernet channel, read directly from the heartbeat word in the Ethernet core’s L1 memory.
The heartbeat word is a 32-bit value: the upper 16 bits carry a signature identifying which firmware is running on the core, and the lower 16 bits are a counter the firmware increments while alive. This metric reports the signature half only (already shifted into the low 16 bits of the reported value). The companion metric tt_ethernet_heartbeat reports whether the counter half is advancing.
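The split described above can be sketched as follows. The function name `split_heartbeat_word` is hypothetical, and the sample word is a made-up value chosen only to show the bit arithmetic; it is not a real firmware signature.

```python
from typing import Tuple


def split_heartbeat_word(word: int) -> Tuple[int, int]:
    """Return (signature, counter) from a 32-bit heartbeat word."""
    word &= 0xFFFFFFFF                  # keep only the low 32 bits
    signature = (word >> 16) & 0xFFFF   # upper 16 bits, shifted into the low half
    counter = word & 0xFFFF             # lower 16 bits, incremented by the firmware
    return signature, counter


# Made-up word for illustration only.
signature, counter = split_heartbeat_word(0x12340007)
print(hex(signature), counter)  # 0x1234 7
```

The `signature` value here corresponds to what this metric reports; the counter half is what tt_ethernet_heartbeat watches for progress.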
The heartbeat word is read from:
- Wormhole B0: ETH_HEARTBEAT_ADDR (0x1C) in the Ethernet core’s L1.
- Blackhole: the heartbeat[0] field of the eth_status_t struct inside BOOT_RESULTS_ADDR (0x7CC70).
The layout is identical on both architectures.
This metric is collected unconditionally on every Ethernet core; no ARC telemetry or topology information is required to populate it.
Observed values
| Signature | Meaning |
|---|---|
| | Base ERISC firmware is running on the core. |
| | AERISC software fabric has taken over the core. See |
| Other | A firmware variant not enumerated here, an uninitialized/corrupt heartbeat word, or a core that is not running ERISC firmware at all. |
The constants are mirrored in src/include/telemetry/ethernet/ethernet_helpers.hpp (AERISC_FABRIC_HEARTBEAT_SIGNATURE) and in the fabric-view frontend at src/frontend/static/js/fabric/constants.js. Note that the UMD-defined tt::umd::erisc_firmware::FABRIC_HEARTBEAT_SIGNATURE (0xAABB) does not match what is actually written to L1 in practice — the value to compare against is 0xDCBA.
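A consumer of this metric might apply the comparison described above roughly as follows. This is a sketch only: the Python constant names and the `is_fabric_signature` helper are hypothetical, and only the two numeric values (0xDCBA and 0xAABB) come from the text above.

```python
# Value stated above as what is actually written to L1 by the AERISC fabric.
AERISC_FABRIC_HEARTBEAT_SIGNATURE = 0xDCBA
# UMD-defined constant that, per the note above, does not match L1 in practice.
UMD_FABRIC_HEARTBEAT_SIGNATURE = 0xAABB


def is_fabric_signature(value: int) -> bool:
    """Return True if a reported signature matches the AERISC fabric value."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"signature out of 16-bit range: {value}")
    return value == AERISC_FABRIC_HEARTBEAT_SIGNATURE


print(is_fabric_signature(0xDCBA))                          # True
print(is_fabric_signature(UMD_FABRIC_HEARTBEAT_SIGNATURE))  # False
```

Comparing against the UMD constant instead would report fabric-owned cores as unknown, which is why the check uses 0xDCBA.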
Values
Type: Unsigned Integer
Units: None
Allowable values: A 16-bit value (0..65535) carrying the firmware signature. See the table above for known signatures.
Prometheus Labels
| Label Name | Value |
|---|---|
| hostname | The host from which the metric was collected. |
| hall | The datacenter hall where the host is located. Sourced from the Factory System Descriptor (FSD). |
| aisle | The datacenter aisle where the host is located. Sourced from the Factory System Descriptor (FSD). |
| rack | The rack number where the host is located. Sourced from the Factory System Descriptor (FSD). |
| shelf_u | The shelf U position in the rack where the host is located. Sourced from the Factory System Descriptor (FSD). |
| tray | The tray (UBB) that the device is located on. |
| chip | The ASIC location within the tray. |
| channel | The Ethernet channel number on the chip. |
| port_type | The physical port type (e.g., QSFP, backplane). Present when topology information is available. |
| port_id | The physical port ID. Present when topology information is available. |
| remote_hostname | The hostname of the remote endpoint. Present when remote endpoint information is available. |
| remote_tray | The tray number of the remote endpoint. Present when remote endpoint information is available. |
| remote_chip | The ASIC location of the remote endpoint. Present when remote endpoint information is available. |
| remote_channel | The Ethernet channel of the remote endpoint. Present when remote endpoint information is available. |
| remote_hall | The datacenter hall of the remote endpoint. Present when remote endpoint information is available. |
| remote_aisle | The datacenter aisle of the remote endpoint. Present when remote endpoint information is available. |
| remote_rack | The rack number of the remote endpoint. Present when remote endpoint information is available. |
| remote_shelf_u | The shelf U position of the remote endpoint. Present when remote endpoint information is available. |