| Metric name | Display name | Description | Unit | Dimensions | Statistical granularity |
|---|---|---|---|---|---|
| CfsClientDataReadBandwidth | turocfs single-node server-side read bandwidth | Instance level: turocfs single-node server-side read bandwidth | KBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsClientDataWriteBandwidth | turocfs single-node server-side write bandwidth | Instance level: turocfs single-node server-side write bandwidth | KBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsDataReadIoBytes | CFS server-side read bandwidth | Instance level: CFS server-side read bandwidth | KBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsDataReadIoLatency | CFS read latency | Instance level: CFS read latency | ms | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsDataWriteIoBytes | CFS server-side write bandwidth | Instance level: CFS server-side write bandwidth | KBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsDataWriteIoLatency | CFS write latency | Instance level: CFS write latency | ms | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| CfsStrageUsageGb | CFS stored data capacity | Instance level: CFS stored data capacity | GBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskIoUtil | Disk ioutil | Instance level: disk ioutil | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskIoWait | Disk iowait | Instance level: disk iowait | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskReadByte | Disk read bandwidth | Instance level: disk read bandwidth | MBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskReadIops | Disk read IOPS | Instance level: disk read IOPS | Count | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskUsageRadio | System disk partition utilization | Instance level: system disk partition utilization | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskWriteByte | Disk write bandwidth | Instance level: disk write bandwidth | MBytes/s | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DiskWriteIops | Disk write IOPS | Instance level: disk write IOPS | Count | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancecpuutil | CPU utilization | Instance level: CPU utilization | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancegpumemutil | GPU memory utilization | Instance level: GPU memory utilization | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancegpumemvalue | GPU memory usage | Instance level: GPU memory usage | MBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancegpuutil | GPU utilization | Instance level: GPU utilization | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancememutil | Memory utilization | Instance level: memory utilization | % | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Instancememvalue | Memory usage | Instance level: memory usage | MBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuFp16EngineActivity | FP16 active time ratio | Instance GPU card level: FP16 active time ratio | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuFp32EngineActivity | FP32 active time ratio | Instance GPU card level: FP32 active time ratio | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuFp64EngineActivity | FP64 active time ratio | Instance GPU card level: FP64 active time ratio | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| NvlinkBandwidth | NVLink transfer rate | Instance GPU card level: NVLink transfer rate | Bytes/s | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| PcieBandwidth | PCIe bus transfer rate | Instance GPU card level: PCIe bus transfer rate | Bytes/s | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuSmActivity | SM active time ratio | Instance GPU card level: SM active time ratio | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| TensorActivity | Tensor active time ratio | Instance GPU card level: Tensor active time ratio | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| Dcgmfidevfbused | GPU memory usage | Instance GPU card level: GPU memory usage | MBytes | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DcgmFiDevGpuUtil | GPU utilization | Instance GPU card level: GPU utilization | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| DcgmFiDevMemCopyUtil | GPU memory utilization | Instance GPU card level: GPU memory utilization | % | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuMemoryClockGpu | GPU memory clock frequency | Instance GPU card level: GPU memory clock frequency | s | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuMemoryFreeGpuv | GPU free memory | Instance GPU card level: GPU free memory | MBytes | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuNvlinkRxMb | NVLink received data volume | Instance GPU card level: NVLink received data volume | Mbps | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuNvlinkTxMb | NVLink sent data volume | Instance GPU card level: NVLink sent data volume | Mbps | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuPcieRxMb | PCIe received data volume | Instance GPU card level: PCIe received data volume | Mbps | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuPcieTxMb | PCIe sent data volume | Instance GPU card level: PCIe sent data volume | Mbps | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| GpuSmClock | SM clock frequency | Instance GPU card level: SM clock frequency | s | AppId InstanceGpuNum SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| PodDiskLimit | Instance total disk capacity | Instance level: instance total disk capacity | GBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| PodDiskValue | Instance disk usage | Instance level: instance disk usage | GBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| NodeDiskLimit | Node total disk capacity | Instance level: node total disk capacity | GBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| NodeDiskValue | Node disk usage | Instance level: node disk usage | GBytes | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| RdmaInpkt | RDMA NIC inbound packets | Instance level: RDMA NIC inbound packets | pps | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| RdmaOutpkt | RDMA NIC outbound packets | Instance level: RDMA NIC outbound packets | pps | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| RdmaIntraffic | RDMA NIC receive bandwidth | Instance level: RDMA NIC receive bandwidth | Mbps | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
| RdmaOuttraffic | RDMA NIC send bandwidth | Instance level: RDMA NIC send bandwidth | Mbps | AppId InstanceId SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
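
The table implies two constraints on any metric query: the period must be one of 10s, 60s, 300s, 3600s, or 86400s, and the dimension set depends on whether the metric is reported per instance (`InstanceId`) or per GPU card (`InstanceGpuNum`). The following is a minimal Python sketch of checking both before a query is sent. The helper `build_query`, the partial `GPU_CARD_METRICS` set, and the shape of the returned dictionary are assumptions made for illustration; they are not the monitoring service's actual request format.

```python
# Statistical periods (in seconds) listed in the table; the only statistic is avg.
SUPPORTED_PERIODS = (10, 60, 300, 3600, 86400)

# Dimension keys taken from the table's dimension column.
INSTANCE_DIMENSIONS = ("AppId", "InstanceId", "SubUin")
GPU_CARD_DIMENSIONS = ("AppId", "InstanceGpuNum", "SubUin")

# Illustrative subset of the GPU-card-level metrics; extend from the table as needed.
GPU_CARD_METRICS = {"GpuSmActivity", "TensorActivity", "DcgmFiDevGpuUtil"}


def build_query(metric_name, period, dimensions):
    """Hypothetical helper: validate period and dimensions, return a plain dict."""
    if period not in SUPPORTED_PERIODS:
        raise ValueError(f"unsupported period: {period}s")
    if metric_name in GPU_CARD_METRICS:
        required = GPU_CARD_DIMENSIONS
    else:
        required = INSTANCE_DIMENSIONS
    missing = [key for key in required if key not in dimensions]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    # The dict layout below is an assumption, not a documented request schema.
    return {
        "MetricName": metric_name,
        "Period": period,
        "Statistic": "avg",
        "Dimensions": {key: dimensions[key] for key in required},
    }
```

For example, `build_query("GpuSmActivity", 60, {"AppId": "a", "InstanceGpuNum": "0", "SubUin": "u"})` passes validation, while a period of 30 or a missing `InstanceGpuNum` raises `ValueError`.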