# tt_pcie_link_alive

[<< Home](../index.md) | [<< Metrics](../metrics.md)

## Name

### Prometheus Metric Name

```
tt_pcie_link_alive
```

### Metric Path (tt-telemetry)

Schema:

```
{hostname}/tray{tray}/chip{chip}/pcie/tt_pcie_link_alive
```

Example path:

```
bh-glx-c09u02/tray1/chip2/pcie/tt_pcie_link_alive
```

## Description

Indicates whether the chip's PCIe link responds to a host read. On each collection cycle, the telemetry server asks UMD's hang detector to read a BAR register whose value is guaranteed never to legitimately be `0xFFFFFFFF`. If the read returns `0xFFFFFFFF`, the PCIe link has silently dropped and any subsequent reads will also return the all-ones fault signature. Recovery usually requires a board reset.

This metric is only created for MMIO-capable chips on Wormhole and Blackhole architectures, since those are the devices for which UMD provides a PCIe hang detector. Remote chips and other architectures are skipped (no metric is emitted).

## Values

**Type:** Boolean

**Units:** None

**Allowable values:**

- **True (1)**: The chip responded normally to the PCIe probe read.
- **False (0)**: The chip returned the `0xFFFFFFFF` fault signature; the PCIe link is hung.

## Prometheus Labels

|Label Name|Value|
|---|---|
|hostname|The host from which the metric was collected.|
|hall|The datacenter hall where the host is located. Sourced from the Factory System Descriptor (FSD).|
|aisle|The datacenter aisle where the host is located. Sourced from the Factory System Descriptor (FSD).|
|rack|The rack number where the host is located. Sourced from the Factory System Descriptor (FSD).|
|shelf_u|The shelf U position in the rack where the host is located. Sourced from the Factory System Descriptor (FSD).|
|tray|The tray (UBB) that the device is located on.|
|chip|The ASIC location within the tray.|
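
The path schema and value semantics described in this page can be illustrated with a minimal Python sketch. It is not the tt-telemetry or UMD implementation: `ChipLocation`, `metric_path`, and `link_alive_value` are hypothetical names introduced only for illustration, and the probe value is assumed to arrive as a plain 32-bit integer rather than through UMD's hang detector.

```python
from dataclasses import dataclass

# All-ones value that reads return once the PCIe link has dropped.
PCIE_FAULT_SIGNATURE = 0xFFFFFFFF


@dataclass(frozen=True)
class ChipLocation:
    """Hypothetical container for the fields used in the metric path."""
    hostname: str
    tray: int
    chip: int


def metric_path(loc: ChipLocation) -> str:
    """Build the path following the documented schema."""
    return f"{loc.hostname}/tray{loc.tray}/chip{loc.chip}/pcie/tt_pcie_link_alive"


def link_alive_value(probe_read: int) -> int:
    """Map a 32-bit probe read to the metric value: 1 = alive, 0 = hung."""
    return 0 if probe_read == PCIE_FAULT_SIGNATURE else 1


if __name__ == "__main__":
    loc = ChipLocation(hostname="bh-glx-c09u02", tray=1, chip=2)
    print(metric_path(loc))              # bh-glx-c09u02/tray1/chip2/pcie/tt_pcie_link_alive
    print(link_alive_value(0xFFFFFFFF))  # 0
    print(link_alive_value(0x00000001))  # 1
```

Because the probed register never legitimately holds `0xFFFFFFFF`, a single equality check is enough to tell a hung link from a healthy one. No masking or retry logic is implied by the description above.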