MOKSHA-2026-0061: I/O Scheduling Downgrade to Idle Class via VBD.qos_algorithm_params

Advisory ID: MOKSHA-2026-0061
Semantic ID: BQP-3
Published: 2026-04-24
CVSS 3.1: 5.3 Medium
CVSS 3.1 Vector: AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:N
CVSS 4.0: 5.3 Medium
CVSS 4.0 Vector: AV:N/AC:L/AT:N/PR:L/UI:N/VC:N/VI:L/VA:N/SC:N/SI:N/SA:N
XAPI Object: VBD
XAPI Field: qos_algorithm_params:sched
Entry Role: vm-admin
Researcher: Jakob Wolffhechel, Moksha

Affected Products

Vendor | Product | Versions
Citrix / Cloud Software Group | XenServer / Citrix Hypervisor | all versions (shared XAPI codebase)
Vates | XCP-ng | 8.3.0

Summary

A vm-admin in XAPI-based hypervisors (XenServer, XCP-ng) can downgrade a target VM's I/O scheduling class to idle by setting sched=idle in VBD.qos_algorithm_params. In the idle class, the Linux kernel services I/O from the VBD's kernel threads only when no other I/O is pending on the host. Under any host I/O load, the target VM experiences severe I/O starvation and its disk operations become effectively unresponsive. The VBD.qos_algorithm_params field has zero map_keys_roles entries, and QoS changes take effect immediately via hot-apply, without a VBD replug. This is a targeted denial of service that requires only vm-admin access to the target VM's VBD.

Vulnerability Description

VBD.qos_algorithm_params is a Map(String, String) field that is writable by vm-admin and has no per-key RBAC restrictions. When VBD.qos_algorithm_type is set to ionice, the sched key determines the I/O scheduling class for the VBD's kernel threads.

The code path:

  1. vm-admin identifies target VBD UUIDs via xe vbd-list
  2. Sets VBD.qos_algorithm_type to ionice and qos_algorithm_params:sched to idle
  3. XAPI parses the sched key at xapi_xenops.ml:604-617:
      | "idle" -> Idle
  4. Ionice.to_class_param at ionice.ml:28 maps Idle to class 3:
      | Idle -> (3, to_param Lowest)
  5. xenopsd invokes ionice -c3 -n7 -p<kthread_pid> via execve
  6. The kernel sets the VBD kernel threads to idle scheduling class
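
The steps above can be modelled in a few lines of Python. This is a sketch of the mapping the advisory describes, not the XAPI or xenopsd source. Only the idle row (class 3, level 7) comes from the text. The other key names and levels, the function name ionice_argv and the example PID are illustrative assumptions.

```python
# Toy model of the sched -> ionice translation described above.
# Only the "idle" row is taken from the advisory. The other rows use the
# standard Linux class numbers (1 = real-time, 2 = best-effort) with
# hypothetical key names and levels.
IONICE_CLASSES = {
    "rt": (1, 4),           # assumed key and level
    "best-effort": (2, 4),  # assumed key and level
    "idle": (3, 7),         # from the advisory: Idle -> (3, to_param Lowest)
}


def ionice_argv(sched, kthread_pid):
    """Build the argv that would be handed to execve for one kernel thread."""
    if sched not in IONICE_CLASSES:
        raise ValueError("unknown scheduling class: %r" % (sched,))
    cls, level = IONICE_CLASSES[sched]
    return ["ionice", "-c%d" % cls, "-n%d" % level, "-p%d" % kthread_pid]


if __name__ == "__main__":
    # 4242 is a made-up PID standing in for a VBD kernel thread.
    print(" ".join(ionice_argv("idle", 4242)))
```

The point of the model is that a single string value supplied by the caller fully determines the class passed to ionice; nothing between the parser and the execve call questions the choice.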

The idle scheduling class (class 3) is the lowest-priority class in the Linux I/O priority model. Processes in this class are served only when no I/O requests from any other class (real-time or best-effort) are pending. On a host with any concurrent I/O activity, which is the norm in multi-VM environments, the target VM's disk I/O stalls indefinitely.
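
The starvation effect can be illustrated with a toy strict-priority dispatcher. This is a deliberately simplified model, not the kernel's scheduler: it assumes one request is served per tick and that other VMs submit a fixed number of requests per tick. The function name simulate and all parameters are hypothetical.

```python
from collections import deque


def simulate(ticks, busy_arrivals_per_tick, idle_requests):
    """Serve one request per tick. Idle-class requests are served only when
    no other request is pending. Returns how many idle requests were served."""
    busy = deque()
    idle = deque(range(idle_requests))
    served_idle = 0
    for _ in range(ticks):
        # I/O arriving from other VMs (real-time or best-effort class).
        busy.extend([None] * busy_arrivals_per_tick)
        if busy:
            busy.popleft()
        elif idle:
            idle.popleft()
            served_idle += 1
    return served_idle


if __name__ == "__main__":
    print("quiet host:", simulate(1000, 0, 10))
    print("busy host: ", simulate(1000, 1, 10))
```

In this model a steady stream of even one competing request per tick is enough to keep the idle queue from ever draining, which matches the behaviour described above for a loaded multi-VM host.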

QoS changes are applied hot via the Needs_set_qos action request mechanism (xenops_server_xen.ml:4003-4017). When the current ionice settings differ from the target, xenopsd re-invokes ionice on running kernel threads without requiring VBD replug or VM reboot.
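
The hot-apply behaviour amounts to a compare-and-reapply loop. The following is a minimal sketch of that idea under stated assumptions. It is not the xenopsd implementation, and the names needs_set_qos, reconcile and the apply callback are hypothetical stand-ins.

```python
def needs_set_qos(current, target):
    """True when a thread's current (class, level) differs from the target."""
    return current != target


def reconcile(kthreads, target, apply):
    """Re-apply the target setting to every kernel thread that is stale.

    kthreads maps pid -> current (class, level). apply is a callback that
    stands in for the ionice invocation. Returns the pids that were changed.
    """
    changed = []
    for pid, current in sorted(kthreads.items()):
        if needs_set_qos(current, target):
            apply(pid, target)
            kthreads[pid] = target
            changed.append(pid)
    return changed
```

Because the comparison runs against live threads, a parameter change reaches a running VBD as soon as the request is processed; no replug or reboot is involved at any point in the loop.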

Root Causes

  1. Missing RBAC protection. VBD.qos_algorithm_params has zero map_keys_roles entries. The sched key inherits the class default _R_VM_ADMIN, allowing any vm-admin to set any scheduling class.

  2. No validation of scheduling class appropriateness. XAPI accepts idle as a valid scheduling class without checking whether it is appropriate for the target VM. A vm-admin managing multiple VMs can set a destructive scheduling class on VMs they do not own.

  3. Hot apply without authorization check. QoS changes take effect immediately on running VBDs. There is no confirmation step, no secondary authorization check, and no rate limiting on scheduling class changes.

  4. Insufficient logging. Scheduling class changes produce only debug-level log messages. No security alert is generated when a VBD's scheduling class is changed to idle.
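
The first root cause can be sketched as a role lookup with a fall-through default. This is a simplified model of per-key RBAC, assuming exact key matching and the standard XAPI role ordering. The helper names required_role and may_write are hypothetical, and the real XAPI lookup may differ in detail.

```python
# Roles ordered from least to most privileged (standard XAPI role names).
ROLES = ["read-only", "vm-operator", "vm-admin", "vm-power-admin",
         "pool-operator", "pool-admin"]


def required_role(key, map_keys_roles, class_default="vm-admin"):
    """Per-key role if one is declared, otherwise the class default."""
    return map_keys_roles.get(key, class_default)


def may_write(caller_role, key, map_keys_roles):
    """True when the caller's role is at least the role required for key."""
    needed = required_role(key, map_keys_roles)
    return ROLES.index(caller_role) >= ROLES.index(needed)


if __name__ == "__main__":
    # Empty map: the sched key falls through to the vm-admin default.
    print(may_write("vm-admin", "sched", {}))
    # With a per-key entry, the same call is refused.
    print(may_write("vm-admin", "sched", {"sched": "pool-operator"}))
```

With an empty map every key inherits the class default, so the only gate on sched is the one that already admits vm-admin.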

Affected Systems

Directly Affected

Indirectly Affected

Exploitation Scenarios

Scenario | Impact | Pre-conditions | Status
Targeted VM I/O starvation | Target VM disk becomes unresponsive under any host I/O load | vm-admin access to target VBD | Confirmed (live host)
Multi-VBD downgrade | Set idle class on all VBDs of a target VM for complete I/O denial | vm-admin, multiple VBDs on target VM | Source-traced
Storage timeout masquerading | I/O starvation causes storage protocol timeouts that appear as infrastructure issues | vm-admin, shared storage (iSCSI/NFS) | Source-traced
BOC-1 chain | Root access via BOC-1 S3 enables setting idle class on all VBDs across the pool, causing pool-wide I/O degradation | vm-admin, BOC-1 | Source-traced

Chaining Analysis

Detection

Remediation

Short-Term Mitigations

Long-Term Fix

Add map_keys_roles protection. Restrict the sched key to _R_POOL_OP or _R_POOL_ADMIN. I/O scheduling class selection is a host-level resource allocation decision that should not be delegated to tenant administrators.

Restrict idle class. Either remove the idle class option entirely from the parser, or require an elevated role (_R_POOL_ADMIN) to set it. There is no legitimate use case for a delegated administrator to downgrade another VM's I/O scheduling to idle.

Add security-level logging. Log all scheduling class changes at security level, not debug level. Include the session, role, source VBD, and target scheduling class.
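
One possible shape for such a log record is sketched below. It is an illustration only: the logger name, the field names and the choice of WARNING for idle downgrades are assumptions, and Python's logging module has no dedicated security level, so standard levels stand in for it.

```python
import json
import logging

security_log = logging.getLogger("security")  # hypothetical logger name


def audit_record(session, role, vbd_uuid, old_class, new_class):
    """Build a structured record for one scheduling class change."""
    return {
        "event": "vbd_sched_class_change",
        "session": session,
        "role": role,
        "vbd": vbd_uuid,
        "old_class": old_class,
        "new_class": new_class,
    }


def log_sched_change(session, role, vbd_uuid, old_class, new_class):
    """Emit the record and return the level it was logged at."""
    record = audit_record(session, role, vbd_uuid, old_class, new_class)
    # Downgrades to idle are logged at WARNING, other changes at INFO.
    level = logging.WARNING if new_class == "idle" else logging.INFO
    security_log.log(level, json.dumps(record, sort_keys=True))
    return level
```

Emitting the record as a single JSON line keeps the session, role, VBD and class together, so a change to idle can be matched against the role that requested it.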

Upstream patches exist. They are held privately pending coordinated disclosure.

Disclosure

References

Credits

Discovered and reported by Jakob Wolffhechel, Moksha.

Jakob Wolffhechel · Moksha · Copenhagen
jakob@wolffhechel.dk · +45 3170 7337
Published 2026-04-24 08:00 CEST · cna.moksha.dk · shittrix.moksha.dk