MOKSHA-2026-0074: GC and Coalesce Disablement via SR.other_config

Advisory ID: MOKSHA-2026-0074
Semantic ID: SOC-5
Published: 2026-04-24
CVSS 3.1: 4.9 Medium
CVSS 3.1 Vector: AV:N/AC:L/PR:H/UI:N/S:U/C:N/I:N/A:H
CVSS 4.0: 6.9 Medium
CVSS 4.0 Vector: AV:N/AC:L/AT:N/PR:H/UI:N/VC:N/VI:L/VA:H/SC:N/SI:N/SA:N
XAPI Object: SR
XAPI Field: other_config:gc, other_config:coalesce
Entry Role: pool-operator
Researcher: Jakob Wolffhechel, Moksha

Affected Products

Vendor | Product | Versions
Citrix / Cloud Software Group | XenServer / Citrix Hypervisor | all versions (shared XAPI codebase)
Vates | XCP-ng | 8.3.0

Summary

A pool-operator in XAPI-based hypervisors (XenServer, XCP-ng) can disable garbage collection and VHD coalescing on any Storage Repository by setting gc=false and/or coalesce=false in SR.other_config. The SM garbage collector reads these keys at cleanup.py:2052 and cleanup.py:2090 respectively. When GC is disabled, orphan VDIs accumulate and consume storage space that is never reclaimed. When coalescing is disabled, VHD snapshot chains grow unbounded, degrading I/O performance and eventually causing chain-length errors. The SR.other_config field has no map_keys_roles entries for infrastructure keys, so neither key is protected beyond the field's default write role.

Vulnerability Description

SR.other_config is a Map(String, String) field defined at datamodel.ml:4930-4935 with _R_POOL_OP as the minimum write role.

The SM garbage collector checks these keys during its scan cycle:

cleanup.py:2052:
  other_config.get(VDI.DB_GC) == "false"
  # When this comparison is true (the value is "false"), GC is disabled for this SR

cleanup.py:2090:
  other_config.get(VDI.DB_COALESCE)
  # When "false", coalesce is disabled for this SR

The gc key controls whether the garbage collector reclaims orphan VDIs (VDIs with no attached VBDs and no parent references). The coalesce key controls whether VHD chain coalescing runs, that is, the process that merges child VHD images into their parent after snapshot deletion.

Both keys accept arbitrary string values with no validation. The only check is a string comparison against "false". Setting either key to "false" silently disables the corresponding operation: no alert is raised, no log message is emitted at warning level, and the setting never expires.
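
The comparison semantics can be illustrated with a minimal Python sketch. It is a model of the check as described in this advisory, not the SM source: the helper name is_disabled is hypothetical, and it assumes the comparison is an exact, case-sensitive match against "false", as the excerpt above suggests.

```python
# Illustrative model of the check described above; NOT the SM source code.
DB_GC = "gc"              # key names as given in this advisory
DB_COALESCE = "coalesce"


def is_disabled(other_config, key):
    """Hypothetical helper: True only when the key holds the exact string "false"."""
    return other_config.get(key) == "false"


samples = [
    {},                   # key absent
    {"gc": "false"},      # exact match
    {"gc": "False"},      # different case
    {"gc": "0"},          # another "falsy-looking" value
    {"gc": " false"},     # leading whitespace
]
for cfg in samples:
    state = "disabled" if is_disabled(cfg, DB_GC) else "enabled"
    print(cfg, "-> GC", state)
```

Under that assumption only the exact string "false" switches the operation off. Any other value, including typos, is accepted without validation and leaves the operation enabled.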

The effects are progressive and silent:

  1. GC disabled: Orphan VDIs accumulate. Each snapshot deletion creates an orphan that is never reclaimed. Storage consumption grows monotonically.

  2. Coalesce disabled: VHD chains grow with each snapshot cycle. I/O latency increases as each read must traverse more chain links. Eventually the VHD chain length reaches the kernel limit (typically 20-30 levels) and VDI operations fail.
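
The progressive nature of both effects can be sketched with a toy model. Everything in it is an assumption made for illustration: one orphan and one extra chain link per snapshot cycle, a starting chain depth of 1, and a chain limit of 30 (the upper end of the range quoted above). It does not model real SR behaviour or sizes.

```python
def simulate(cycles, gc_enabled, coalesce_enabled, max_chain=30):
    """Toy model: each cycle takes and deletes one snapshot.

    Returns (orphans, chain_depth, cycle_at_which_limit_was_reached_or_None).
    max_chain=30 is an illustrative assumption, not a measured limit.
    """
    orphans = 0
    depth = 1
    limit_hit_at = None
    for cycle in range(1, cycles + 1):
        orphans += 1          # snapshot deletion leaves one orphan VDI
        depth += 1            # snapshot adds one link to the VHD chain
        if gc_enabled:
            orphans = 0       # GC reclaims orphans
        if coalesce_enabled:
            depth = 1         # coalesce merges the chain back down
        if limit_hit_at is None and depth >= max_chain:
            limit_hit_at = cycle
    return orphans, depth, limit_hit_at


print("healthy: ", simulate(40, gc_enabled=True, coalesce_enabled=True))
print("disabled:", simulate(40, gc_enabled=False, coalesce_enabled=False))
```

In this model the healthy SR returns to its baseline after every cycle, while the SR with both keys set to "false" accumulates one orphan per cycle and reaches the assumed chain limit after 29 cycles.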

Root Causes

  1. Missing RBAC protection. SR.other_config has map_keys_roles entries only for UI keys. The gc and coalesce keys are writable by any pool-operator.

  2. Silent disablement. Disabling GC or coalesce produces no alert, no log message at warning level, and no XAPI event. The operator has no indication that storage maintenance has stopped.

  3. No expiration or time-bound. Once set, the keys persist indefinitely. There is no mechanism to automatically re-enable GC/coalesce after a maintenance window.

  4. set_other_config RBAC bypass. The set_other_config method replaces the entire map atomically and bypasses map_keys_roles per-key checks.
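
Root causes 1 and 4 interact, which a small Python model can make concrete. This is a sketch under stated assumptions, not XAPI code: the role ranking, the MAP_KEYS_ROLES table (which shows per-key protection that does not exist today for gc and coalesce), and both function bodies are hypothetical. Only the method name set_other_config and the pool-operator write role come from this advisory.

```python
# Simplified role ordering; real XAPI has more roles than shown here.
ROLE_RANK = {"vm-admin": 1, "pool-operator": 2, "pool-admin": 3}

FIELD_WRITE_ROLE = "pool-operator"   # _R_POOL_OP, per the advisory

# HYPOTHETICAL per-key protection of the kind map_keys_roles provides.
MAP_KEYS_ROLES = {"gc": "pool-admin", "coalesce": "pool-admin"}


def allowed(role, required):
    return ROLE_RANK[role] >= ROLE_RANK[required]


def add_to_other_config(config, role, key, value):
    """Per-key path: consults the per-key role table before writing."""
    required = MAP_KEYS_ROLES.get(key, FIELD_WRITE_ROLE)
    if not allowed(role, required):
        raise PermissionError(f"{role} may not write key {key!r}")
    config[key] = value


def set_other_config(config, role, new_map):
    """Whole-map path: checks only the field-level role, then replaces the map."""
    if not allowed(role, FIELD_WRITE_ROLE):
        raise PermissionError(f"{role} may not write other_config")
    config.clear()
    config.update(new_map)


sr_other_config = {}
try:
    add_to_other_config(sr_other_config, "pool-operator", "gc", "false")
except PermissionError as exc:
    print("per-key path blocked:", exc)

set_other_config(sr_other_config, "pool-operator", {"gc": "false"})
print("whole-map path result:", sr_other_config)
```

In this model, adding per-key entries for gc and coalesce closes only the per-key path. As long as the whole-map setter checks the field-level role alone, the same caller can still write the protected keys.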

Affected Systems

Directly Affected

Indirectly Affected

Exploitation Scenarios

Scenario | Impact | Pre-conditions | Status
Storage exhaustion | Orphan VDIs accumulate until SR is full | pool-operator, set gc=false | Source-traced
VHD chain length exceeded | VDI operations fail when chain exceeds kernel limit | pool-operator, set coalesce=false, active snapshot cycle | Source-traced
I/O performance degradation | Read latency increases linearly with chain depth | pool-operator, set coalesce=false | Source-traced
BOC-1 chain | vm-admin disables GC/coalesce across all SRs via RBAC collapse | vm-admin, BOC-1 | Source-traced

Chaining Analysis

Detection
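
Because disablement raises no alert, one way to detect it is to scan every SR's other_config for the two keys. The sketch below is an illustration, not a vetted tool. The helper names find_disabled and audit_pool are hypothetical, the URL and credentials passed to audit_pool are placeholders, and audit_pool assumes the XenAPI Python bindings are installed. The pure function find_disabled runs without them.

```python
WATCHED_KEYS = ("gc", "coalesce")


def find_disabled(sr_records):
    """Return {sr_ref: [watched keys set to "false"]} for affected SRs."""
    findings = {}
    for ref, record in sr_records.items():
        other_config = record.get("other_config", {})
        hits = [k for k in WATCHED_KEYS if other_config.get(k) == "false"]
        if hits:
            findings[ref] = hits
    return findings


def audit_pool(url, user, password):
    """Fetch all SR records from a pool and report disabled keys.

    Needs the XenAPI Python bindings; url/user/password are placeholders.
    """
    import XenAPI  # imported lazily so find_disabled works without the bindings
    session = XenAPI.Session(url)
    session.xenapi.login_with_password(user, password)
    try:
        return find_disabled(session.xenapi.SR.get_all_records())
    finally:
        session.xenapi.session.logout()


# Offline example with made-up records shaped like SR.get_all_records() output.
example = {
    "OpaqueRef:sr-1": {"name_label": "nfs-prod", "other_config": {"gc": "false"}},
    "OpaqueRef:sr-2": {"name_label": "local", "other_config": {}},
}
print(find_disabled(example))
```

Running such a scan periodically and comparing results over time would show when a key was introduced, which the SM garbage collector itself does not report.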

Remediation

Short-Term Mitigations

Long-Term Fix

Add map_keys_roles protection. Restrict gc and coalesce keys to _R_POOL_ADMIN in datamodel.ml.

Add alerting. Generate a XAPI alert when GC or coalesce is disabled, so operators are aware of the change.

Add time-bound disablement. If GC/coalesce disablement is a legitimate maintenance operation, add a timeout mechanism that automatically re-enables after a configurable duration.
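
The alerting and time-bound proposals could take roughly the following shape. This is a design sketch only and does not describe the held patches. The companion key "<key>-disabled-until", its ISO 8601 format, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

WATCHED_KEYS = ("gc", "coalesce")


def disablement_alerts(old_config, new_config):
    """Return watched keys that newly changed to "false" in an other_config update."""
    return [
        key for key in WATCHED_KEYS
        if new_config.get(key) == "false" and old_config.get(key) != "false"
    ]


def disabled_now(other_config, key, now):
    """Honour key=false only while a HYPOTHETICAL '<key>-disabled-until'
    deadline (ISO 8601 string) lies in the future."""
    if other_config.get(key) != "false":
        return False
    deadline = other_config.get(key + "-disabled-until")  # hypothetical key
    if deadline is None:
        return False          # no deadline given: refuse open-ended disablement
    try:
        expires = datetime.fromisoformat(deadline)
    except ValueError:
        return False          # unparseable deadline: fail safe, keep GC running
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)  # assume UTC if naive
    return now < expires


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
cfg = {"gc": "false", "gc-disabled-until": "2026-05-01T06:00:00+00:00"}
print(disablement_alerts({}, cfg))                        # keys to alert on
print(disabled_now(cfg, "gc", now))                       # inside the window
print(disabled_now(cfg, "gc", now + timedelta(hours=12))) # window expired
```

The sketch fails safe: a missing or malformed deadline leaves the operation enabled, so a forgotten maintenance flag cannot stop storage maintenance indefinitely.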

Upstream patches exist. They are held privately pending coordinated disclosure.

Disclosure

References

Credits

Discovered and reported by Jakob Wolffhechel, Moksha.

Jakob Wolffhechel · Moksha · Copenhagen
jakob@wolffhechel.dk · +45 3170 7337
Published 2026-04-24 08:00 CEST · cna.moksha.dk · shittrix.moksha.dk