Redfish hardware metrics

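The `redfish` hardware type supports sending hardware metrics via the notification system. The `event_type` field of a notification is set to `hardware.redfish.metrics` (for a hardware type derived from `redfish`, the `redfish` part may be replaced by that driver's own name).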

Enabling Redfish hardware metrics requires a few updates to the ironic.conf configuration file:

[oslo_messaging_notifications]
# The Driver(s) to handle sending notifications. Possible
# values are messaging, messagingv2, routing, log, test, noop,
# prometheus_exporter (multi valued)
# Example using the messagingv2 driver:
driver = messagingv2

[sensor_data]
send_sensor_data = true

[metrics]
backend = collector

A full list of the `[oslo_messaging_notifications]` configuration options can be found in the oslo.messaging documentation.
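As a quick sanity check, the three settings above can be verified before restarting the conductor. The following is a minimal sketch that uses Python's standard `configparser` as a stand-in for oslo.config, which is what Ironic actually uses to read its configuration. The `SAMPLE_CONF` text, the `REQUIRED` list, and the `missing_options` helper are hypothetical names introduced for illustration.

```python
import configparser

# Sample text mirroring the settings shown above; a real deployment
# would read its own ironic.conf instead.
SAMPLE_CONF = """
[oslo_messaging_notifications]
driver = messagingv2

[sensor_data]
send_sensor_data = true

[metrics]
backend = collector
"""

# (section, option) pairs that this page says need to be set.
REQUIRED = [
    ("oslo_messaging_notifications", "driver"),
    ("sensor_data", "send_sensor_data"),
    ("metrics", "backend"),
]


def missing_options(conf_text):
    """Return the required (section, option) pairs absent from conf_text."""
    # strict=False tolerates repeated options such as a multi-valued driver.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return [
        (section, option)
        for section, option in REQUIRED
        if not parser.has_option(section, option)
    ]


print(missing_options(SAMPLE_CONF))
```

The check only confirms that the options are present. It does not validate their values.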

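The payload of each notification is a mapping whose keys are sensor types (`Fan`, `Temperature`, `Power` or `Drive`) and whose values are themselves mappings from sensor identifiers to the sensor data.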
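The two-level mapping described above can be flattened with a small generator. This is a sketch only: `iter_sensors` and the `sample` data are hypothetical, and treating a sensor type whose value is `null` as empty is an assumption rather than documented behavior.

```python
def iter_sensors(sensor_payload):
    """Yield (sensor_type, sensor_id, data) for every sensor in the payload."""
    for sensor_type, sensors in sensor_payload.items():
        # Assumption: a missing or null sensor group is treated as empty.
        for sensor_id, data in (sensors or {}).items():
            yield sensor_type, sensor_id, data


# Hypothetical, heavily trimmed payload for illustration.
sample = {
    "Fan": {"fan-1": {"reading": 1680, "reading_units": "RPM"}},
    "Temperature": {"cpu-1": {"reading_celsius": 63}},
}

for sensor_type, sensor_id, data in iter_sensors(sample):
    print(sensor_type, sensor_id, data)
```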

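Each `Fan` payload contains the following fields:

  • max_reading_range, min_reading_range - the range of reading values.

  • reading, reading_units - the current reading and its units.

  • serial_number - the serial number of the fan sensor.

  • physical_context - the context of the sensor, such as `SystemBoard`. Can also be `null` or just `Fan`.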
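Because any of the range fields can be `null`, consumers need to treat missing bounds as unknown. The sketch below flags a fan whose reading lies outside its reported range. The helper name `fan_out_of_range` is hypothetical, and using the reading range as an alert condition is an assumption, not something the notification itself prescribes.

```python
def fan_out_of_range(fan):
    """Return True if the fan reading falls outside its reported range.

    A reading or bound that is None (JSON null) is treated as unknown
    and never triggers a flag.
    """
    reading = fan.get("reading")
    if reading is None:
        return False
    low = fan.get("min_reading_range")
    high = fan.get("max_reading_range")
    if low is not None and reading < low:
        return True
    if high is not None and reading > high:
        return True
    return False


# Values modeled on the Dell example: no upper bound is reported.
healthy_fan = {"max_reading_range": None, "min_reading_range": 720, "reading": 1680}
slow_fan = {"max_reading_range": None, "min_reading_range": 720, "reading": 600}

print(fan_out_of_range(healthy_fan), fan_out_of_range(slow_fan))
```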

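Each `Temperature` payload contains the following fields:

  • max_reading_range_temp, min_reading_range_temp - the range of reading values.

  • reading_celsius - the current reading in degrees Celsius.

  • sensor_number - the number of the temperature sensor.

  • physical_context - the context of the sensor, usually reflecting its location, such as `CPU`, `Memory`, `Intake`, `PowerSupply` or `SystemBoard`. Can also be `null`.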
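One possible use of these fields is to compute how far a sensor is from the upper end of its reported range. This is a sketch under the assumption that the upper end of the range is a useful ceiling for monitoring, and `temperature_headroom` is a hypothetical helper name.

```python
def temperature_headroom(sensor):
    """Return degrees Celsius between the reading and the top of the range.

    Returns None when either value is unknown (None / JSON null).
    """
    reading = sensor.get("reading_celsius")
    upper = sensor.get("max_reading_range_temp")
    if reading is None or upper is None:
        return None
    return upper - reading


# Values modeled on the CPU sensor in the Dell example.
cpu = {"max_reading_range_temp": 90, "min_reading_range_temp": 3, "reading_celsius": 63}
print(temperature_headroom(cpu))
```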

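Each `Power` payload contains the following fields:

  • power_capacity_watts, line_input_voltage, last_power_output_watts

  • serial_number - the serial number of the power supply.

  • state - the power supply state: `enabled`, `absent` (`null` if unknown).

  • health - the power supply health status: `ok`, `warning`, `critical` (`null` if unknown).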
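The `state` and `health` fields lend themselves to a simple filter. The sketch below lists power supplies that are not both `enabled` and `ok`. The name `psus_needing_attention` is hypothetical, and flagging `null` (unknown) values is a conservative assumption that may be too noisy for some fleets.

```python
def psus_needing_attention(power):
    """Return identifiers of power supplies that are not enabled and healthy."""
    flagged = []
    for psu_id, psu in power.items():
        # Unknown (None) state or health is flagged as well (assumption).
        if psu.get("state") != "enabled" or psu.get("health") != "ok":
            flagged.append(psu_id)
    return flagged


# Values modeled on the Dell example.
power = {
    "PSU.Slot.1:Power@System.Embedded.1": {"state": "enabled", "health": "ok"},
    "PSU.Slot.2:Power@System.Embedded.1": {"state": None, "health": "critical"},
}
print(psus_needing_attention(power))
```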

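Each `Drive` payload contains the following fields:

  • name - the drive name in the BMC (this is not a Linux device name like `/dev/sda`).

  • model - the drive model (if known).

  • capacity_bytes - the drive capacity in bytes.

  • state - the drive state: `enabled`, `absent` (`null` if unknown).

  • health - the drive health status: `ok`, `warning`, `critical` (`null` if unknown).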

Note

Drive payloads are often not available on real hardware.
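Since the `Drive` section may be absent, and entries such as backplanes can report a `null` capacity, code that consumes it should tolerate both. A minimal sketch with a hypothetical helper `total_drive_capacity`:

```python
def total_drive_capacity(sensor_payload):
    """Sum capacity_bytes over all drives, ignoring unknown capacities."""
    # A missing or null "Drive" key is treated as no drives.
    drives = sensor_payload.get("Drive") or {}
    return sum(drive.get("capacity_bytes") or 0 for drive in drives.values())


# Values modeled on the Dell example, including a backplane with no capacity.
sample_payload = {
    "Drive": {
        "ssd": {"name": "Solid State Disk 0:1:0", "capacity_bytes": 479559942144},
        "hdd": {"name": "Physical Disk 0:1:1", "capacity_bytes": 1799725514752},
        "backplane": {"name": "Backplane 1", "capacity_bytes": None},
    }
}
print(total_drive_capacity(sample_payload))
```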

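Warning

Metrics collection works by polling several Redfish endpoints on the target BMC. Some older BMC implementations may have hard rate limits or misbehave under load. If this is the case for you, reduce the metrics collection frequency or disable it completely.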

Example (Dell)

{
    "message_id": "578628d2-9967-4d33-97ca-7e7c27a76abc",
    "publisher_id": "conductor-1.example.com",
    "event_type": "hardware.redfish.metrics",
    "priority": "INFO",
    "payload": {
        "message_id": "60653d54-87aa-43b8-a4ed-96d568dd4e96",
        "instance_uuid": null,
        "node_uuid": "aea161dc-2e96-4535-b003-ca70a4a7bb6d",
        "timestamp": "2023-10-22T15:50:26.841964",
        "node_name": "dell-430",
        "event_type": "hardware.redfish.metrics.update",
        "payload": {
            "Fan": {
                "0x17||Fan.Embedded.1A@System.Embedded.1": {
                    "identity": "0x17||Fan.Embedded.1A",
                    "max_reading_range": null,
                    "min_reading_range": 720,
                    "reading": 1680,
                    "reading_units": "RPM",
                    "serial_number": null,
                    "physical_context": "SystemBoard",
                    "state": "enabled",
                    "health": "ok"
                },
                "0x17||Fan.Embedded.2A@System.Embedded.1": {
                    "identity": "0x17||Fan.Embedded.2A",
                    "max_reading_range": null,
                    "min_reading_range": 720,
                    "reading": 3120,
                    "reading_units": "RPM",
                    "serial_number": null,
                    "physical_context": "SystemBoard",
                    "state": "enabled",
                    "health": "ok"
                },
                "0x17||Fan.Embedded.2B@System.Embedded.1": {
                    "identity": "0x17||Fan.Embedded.2B",
                    "max_reading_range": null,
                    "min_reading_range": 720,
                    "reading": 3000,
                    "reading_units": "RPM",
                    "serial_number": null,
                    "physical_context": "SystemBoard",
                    "state": "enabled",
                    "health": "ok"
                }
            },
            "Temperature": {
                "iDRAC.Embedded.1#SystemBoardInletTemp@System.Embedded.1": {
                    "identity": "iDRAC.Embedded.1#SystemBoardInletTemp",
                    "max_reading_range_temp": 47,
                    "min_reading_range_temp": -7,
                    "reading_celsius": 28,
                    "physical_context": "SystemBoard",
                    "sensor_number": 4,
                    "state": "enabled",
                    "health": "ok"
                },
                "iDRAC.Embedded.1#CPU1Temp@System.Embedded.1": {
                    "identity": "iDRAC.Embedded.1#CPU1Temp",
                    "max_reading_range_temp": 90,
                    "min_reading_range_temp": 3,
                    "reading_celsius": 63,
                    "physical_context": "CPU",
                    "sensor_number": 14,
                    "state": "enabled",
                    "health": "ok"
                }
            },
            "Power": {
                "PSU.Slot.1:Power@System.Embedded.1": {
                    "power_capacity_watts": null,
                    "line_input_voltage": 206,
                    "last_power_output_watts": null,
                    "serial_number": "CNLOD0075324D7",
                    "state": "enabled",
                    "health": "ok"
                },
                "PSU.Slot.2:Power@System.Embedded.1": {
                    "power_capacity_watts": null,
                    "line_input_voltage": null,
                    "last_power_output_watts": null,
                    "serial_number": "CNLOD0075324E5",
                    "state": null,
                    "health": "critical"
                }
            },
            "Drive": {
                "Solid State Disk 0:1:0:RAID.Integrated.1-1@System.Embedded.1": {
                    "name": "Solid State Disk 0:1:0",
                    "capacity_bytes": 479559942144,
                    "state": "enabled",
                    "health": "ok"
                },
                "Physical Disk 0:1:1:RAID.Integrated.1-1@System.Embedded.1": {
                    "name": "Physical Disk 0:1:1",
                    "capacity_bytes": 1799725514752,
                    "state": "enabled",
                    "health": "ok"
                },
                "Physical Disk 0:1:2:RAID.Integrated.1-1@System.Embedded.1": {
                    "name": "Physical Disk 0:1:2",
                    "capacity_bytes": 1799725514752,
                    "state": "enabled",
                    "health": "ok"
                },
                "Backplane 1 on Connector 0 of Integrated RAID Controller 1:RAID.Integrated.1-1@System.Embedded.1": {
                    "name": "Backplane 1 on Connector 0 of Integrated RAID Controller 1",
                    "capacity_bytes": null,
                    "state": "enabled",
                    "health": "ok"
                }
            }
        }
    },
    "timestamp": "2023-10-22 15:50:36.700458"
}
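A consumer of such a notification has to unwrap two levels of `payload`: the outer one carries the node metadata, and the inner one carries the sensor mapping. The sketch below shows one way to do that for a JSON-serialized notification. The helper names and the trimmed `RAW` message are hypothetical, and the check that `event_type` has exactly three dot-separated parts assumes driver names contain no dots.

```python
import json


def is_hardware_metrics(notification):
    """Return True for event types of the form hardware.<driver>.metrics."""
    parts = notification.get("event_type", "").split(".")
    return len(parts) == 3 and parts[0] == "hardware" and parts[2] == "metrics"


def extract_metrics(notification):
    """Return (node_uuid, sensor_payload) from a decoded notification."""
    outer = notification["payload"]
    return outer["node_uuid"], outer["payload"]


# Trimmed message modeled on the example above.
RAW = """
{
    "event_type": "hardware.redfish.metrics",
    "payload": {
        "node_uuid": "aea161dc-2e96-4535-b003-ca70a4a7bb6d",
        "payload": {"Fan": {}, "Temperature": {}}
    }
}
"""

message = json.loads(RAW)
if is_hardware_metrics(message):
    node_uuid, sensors = extract_metrics(message)
    print(node_uuid, sorted(sensors))
```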