- Epic
- Resolution: Done
- Medium
[INF-B-11] Add host management
- Full life-cycle and availability management of the physical hosts
- Detects and automatically handles host failures and initiates recovery
- Monitoring and fault reporting for:
  - Cluster connectivity
  - Critical process failures
  - Resource utilization thresholds, interface states
  - H/W faults / sensors, host watchdog
  - Activity progress reporting
- Interfaces with board management (BMC) for:
  - Out-of-band reset
  - Power-on/off
  - H/W sensor monitoring
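
The failure-detection and recovery items above can be illustrated with a minimal sketch. It assumes a heartbeat-timeout model, which the epic does not specify; `HostMonitor`, `heartbeat`, `audit`, `timeout_s`, and the `bmc_reset` callback are hypothetical names chosen for illustration, not the actual implementation or any real BMC API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class HostState(Enum):
    AVAILABLE = "available"
    FAILED = "failed"


@dataclass
class Host:
    name: str
    last_heartbeat: float
    state: HostState = HostState.AVAILABLE


@dataclass
class HostMonitor:
    # Hypothetical callback standing in for an out-of-band BMC reset.
    bmc_reset: Callable[[str], None]
    # Assumed threshold: seconds of silence before a host counts as failed.
    timeout_s: float = 3.0
    hosts: Dict[str, Host] = field(default_factory=dict)
    faults: List[str] = field(default_factory=list)

    def heartbeat(self, name: str, now: float) -> None:
        """Record a heartbeat; a host that reports again becomes available."""
        host = self.hosts.setdefault(name, Host(name, now))
        host.last_heartbeat = now
        host.state = HostState.AVAILABLE

    def audit(self, now: float) -> None:
        """Mark silent hosts as failed, report a fault, and request a reset."""
        for host in self.hosts.values():
            silent_for = now - host.last_heartbeat
            if host.state is HostState.AVAILABLE and silent_for > self.timeout_s:
                host.state = HostState.FAILED
                self.faults.append(
                    f"{host.name}: no heartbeat for {silent_for:.1f}s"
                )
                self.bmc_reset(host.name)


def demo() -> Tuple[HostMonitor, List[str]]:
    resets: List[str] = []  # records which hosts would have been reset
    monitor = HostMonitor(bmc_reset=resets.append)
    monitor.heartbeat("compute-0", now=0.0)
    monitor.heartbeat("compute-1", now=0.0)
    monitor.heartbeat("compute-0", now=4.0)  # compute-1 stays silent
    monitor.audit(now=5.0)
    return monitor, resets
```

Timestamps are passed in explicitly so the sketch stays deterministic. A host is reset at most once per failure, because `audit` only acts on hosts still marked available; a later heartbeat returns the host to service.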