A pool-operator can manipulate the High Availability (HA) timeout by setting Pool.other_config:default_ha_timeout to an arbitrary integer. The value is read at xapi_ha.ml:278-279 via int_of_string with no range check. A timeout of 1 second causes spurious HA fencing: live hosts are marked dead, triggering cascading false fencing across the pool (a split-brain condition). A timeout of 999999 seconds effectively disables HA: real host failures go undetected for days, leaving HA-protected VMs without failover protection. Both outcomes affect every HA-protected VM in the pool.
Pool.other_config is the highest-scope other_config field in the XAPI data model. The default_ha_timeout key overrides the default HA heartbeat timeout used to determine whether a host is alive or dead.
pool-operator calls Pool.add_to_other_config(pool, "default_ha_timeout", "1")
-> xapi_ha.ml:278-279 reads default_ha_timeout via int_of_string
-> No range validation performed
-> HA subsystem uses 1-second timeout for heartbeat monitoring
-> Routine heartbeat delays can exceed 1 second -> live hosts marked dead
-> Cascading fencing events: hosts reboot each other (split-brain)
pool-operator calls Pool.add_to_other_config(pool, "default_ha_timeout", "999999")
-> HA subsystem uses ~11.6-day timeout (999999 s)
-> Host failures not detected for days
-> HA-protected VMs not restarted after actual host failure
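Both flows above hinge on the same unchecked parse. The following Python sketch models that behaviour; it is not xapi's OCaml code, and Python's `int()` does not accept exactly the same strings as OCaml's `int_of_string`. The names `read_ha_timeout` and `ASSUMED_DEFAULT`, and the fallback value of 60, are hypothetical and chosen only for illustration.

```python
# Python model of an unchecked read; NOT xapi's actual OCaml code.
# ASSUMED_DEFAULT is a placeholder, not the real xapi default.
ASSUMED_DEFAULT = 60


def read_ha_timeout(other_config: dict) -> int:
    """Parse default_ha_timeout with no bounds check.

    Any integer string is accepted, mirroring a bare int_of_string call.
    """
    raw = other_config.get("default_ha_timeout")
    if raw is None:
        return ASSUMED_DEFAULT
    return int(raw)  # no range check: 1 and 999999 are both accepted


# Both attacker-chosen values pass straight through to the caller.
split_brain = read_ha_timeout({"default_ha_timeout": "1"})
ha_blind = read_ha_timeout({"default_ha_timeout": "999999"})
```

In this model nothing between the key write and the consumer of the value can reject an extreme setting, which is the gap the flows exploit.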
Mode 1 - Split-brain (timeout too low): Setting default_ha_timeout=1 causes the HA daemon to declare a host dead after 1 second of missed heartbeats. Routine network jitter can exceed this threshold, triggering false fencing events. Multiple hosts fence each other simultaneously, causing a cascading reboot loop in which all HA-protected VMs restart repeatedly.
Mode 2 - HA blindness (timeout too high): Setting default_ha_timeout=999999 leaves HA unable to detect real host failures for approximately 11.6 days. During this window, a failed host's HA-protected VMs are not restarted on surviving hosts.
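The two modes can be illustrated with a deliberately simplified liveness rule. This is a sketch under the assumption that a host is declared dead once its last heartbeat is older than the timeout; the real HA daemon is more involved, and `is_declared_dead` is a hypothetical helper, not a function in xapi.

```python
SECONDS_PER_DAY = 86_400


def is_declared_dead(seconds_since_heartbeat: float, timeout_s: int) -> bool:
    """Simplified liveness rule: dead once the last heartbeat is older
    than the timeout. Assumed for illustration only."""
    return seconds_since_heartbeat > timeout_s


# Mode 1: a live host whose heartbeat is delayed by 1.5 s is declared dead.
false_positive = is_declared_dead(1.5, timeout_s=1)

# Mode 2: a host that has been silent for a full day is still treated as alive.
missed_failure = not is_declared_dead(SECONDS_PER_DAY, timeout_s=999_999)

# Length of the blind window in days (999999 s / 86400 s per day).
blind_window_days = 999_999 / SECONDS_PER_DAY
```

Under this rule both `false_positive` and `missed_failure` evaluate to true, and the blind window works out to a little over eleven and a half days.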
Missing RBAC protection. Pool.other_config has zero map_keys_roles entries for infrastructure keys. The default_ha_timeout key is writable by pool-operator.
No range validation. xapi_ha.ml uses int_of_string with no bounds check. Any integer value is accepted, including values that make HA non-functional.
Pool-wide blast radius. The HA timeout applies to the entire pool. A single key write affects the failure detection behavior for every host and HA-protected VM.
Immediate effect. The changed timeout is read on the next HA monitoring cycle, with no confirmation or cooldown period.
| Scenario | Impact | Pre-conditions | Status |
|---|---|---|---|
| Split-brain (timeout=1) | Cascading fencing, all hosts reboot repeatedly | HA enabled | Modeled (code-traced: int_of_string with no range check at xapi_ha.ml:278-279) |
| HA blindness (timeout=999999) | Host failures undetected for days | HA enabled | Modeled (code-traced) |
| Storage corruption on split-brain | Concurrent access violations on shared storage during fencing | HA + shared storage | Modeled |
| BOC-1 chain | vm-admin escalates to pool-operator via BOC-1 S3, then manipulates HA timeout | BOC-1 available | Modeled (two-step chain) |
Monitor Pool.other_config for writes to default_ha_timeout.
disclosure/vendor-detection-guidance.md
Audit Pool.other_config for unexpected default_ha_timeout values.
Restrict the pool-operator role to trusted administrators.
RBAC restriction. Add a map_keys_roles entry for default_ha_timeout in datamodel.ml requiring _R_POOL_ADMIN.
Range validation. Validate default_ha_timeout at write time. Enforce a reasonable range (e.g., 10-600 seconds) and reject values outside it.
Write-time type checking. Validate that the value is a valid integer at write time, not just at read time.
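The range and type checks above could look roughly like the following. This is a Python sketch of the logic only; an actual fix would live in xapi's OCaml code. The bounds reuse the 10-600 second example range, and `validate_ha_timeout` and `has_suspicious_timeout` are hypothetical names introduced for illustration.

```python
MIN_TIMEOUT_S = 10   # lower bound from the example range above
MAX_TIMEOUT_S = 600  # upper bound from the example range above


def validate_ha_timeout(raw: str) -> int:
    """Return the timeout as an int, or raise ValueError if it is not a
    non-negative decimal integer inside [MIN_TIMEOUT_S, MAX_TIMEOUT_S]."""
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(
            f"default_ha_timeout is not a non-negative decimal integer: {raw!r}"
        )
    value = int(text)
    if not MIN_TIMEOUT_S <= value <= MAX_TIMEOUT_S:
        raise ValueError(
            f"default_ha_timeout {value} is outside "
            f"{MIN_TIMEOUT_S}-{MAX_TIMEOUT_S} seconds"
        )
    return value


def has_suspicious_timeout(other_config: dict) -> bool:
    """True if the map carries a default_ha_timeout that fails validation."""
    raw = other_config.get("default_ha_timeout")
    if raw is None:
        return False
    try:
        validate_ha_timeout(raw)
    except ValueError:
        return True
    return False
```

The same predicate serves two purposes: rejecting a bad value when the key is written, and flagging an already-stored value when an existing Pool.other_config map is audited.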
Upstream patches exist. They are held privately pending coordinated disclosure.
Disclosure:
xapi_ha.ml:278-279 (default_ha_timeout read via int_of_string), datamodel.ml (Pool field definition)
disclosure/advisories/ploc-security-advisory.md (PLOC-2)
research/investigations/pool-other-config.md
Discovered and reported by Jakob Wolffhechel, Moksha.