A pool-operator can cause complete data loss for all VDIs on an LVHD storage repository by removing the use_vhd key from SR.sm_config. This key tells the LVHDSR driver whether VDIs are stored in VHD format or raw LV format. Removing it on a VHD-mode SR causes the driver to operate in legacy raw LV mode on the next PBD plug cycle. VHD-format VDIs read as raw LVs produce garbage data. Raw LVs read as VHD fail header validation. The result is complete data loss for every VDI on the affected SR.
SR.sm_config is a Map(String, String) field writable by pool-operator with zero map_keys_roles entries. The use_vhd key is set by the LVHDSR driver during SR.create and is consumed during LVHDSR.load() to determine the storage format for all VDIs on the SR.
pool-operator calls SR.remove_from_sm_config(sr, "use_vhd")
-> sm_config updated in XAPI database (key removed)
-> Next PBD plug (reboot, HA failover, manual unplug/plug):
-> LVHDSR.load() reads sm_config
-> use_vhd key missing -> driver falls back to raw LV mode
-> VHD-format VDIs accessed as raw LVs
-> VHD headers misinterpreted as data -> garbage output
-> Raw-mode writes corrupt VHD metadata -> unrecoverable
The effect is not immediate. The LVHDSR driver reads sm_config only during LVHDSR.load(), which runs at PBD plug time, not continuously. The modified configuration lies dormant until the next PBD cycle: host reboot, HA failover, storage maintenance, or manual PBD unplug/plug. In production, this dormancy period can span days to months.
Once the driver operates in raw LV mode on VHD-format VDIs:
Restoring use_vhd=true after writes have occurred does not repair the corrupted VHD headers.

Missing key immutability. SR.sm_config keys written by the SM driver during SR.create remain user-writable at runtime. No mechanism enforces immutability of driver-set configuration.
Zero map_keys_roles entries. The use_vhd key has no per-key RBAC protection. Any pool-operator can remove it.
Missing write-time validation. No check prevents removal of use_vhd on a VHD-mode SR. The key should be either immutable or validated against the actual SR format.
Silent fallback. The driver silently falls back to raw LV mode when use_vhd is missing, rather than failing safely with an error.
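
The difference between the silent fallback described above and a fail-safe load can be illustrated with a simplified model. This is a sketch, not the LVHDSR.py code: the function names and `SRLoadError` are hypothetical, and it assumes a raw-mode SR would carry an explicit `use_vhd=false` so that a missing key is always an error.

```python
from typing import Dict


class SRLoadError(Exception):
    """Hypothetical error raised when the SR format cannot be determined."""


def select_mode_silent(sm_config: Dict[str, str]) -> str:
    # Models the fallback: anything other than use_vhd == "true",
    # including a missing key, is treated as raw LV mode.
    return "vhd" if sm_config.get("use_vhd") == "true" else "raw"


def select_mode_failsafe(sm_config: Dict[str, str]) -> str:
    # Assumption: the key is mandatory and must be "true" or "false".
    value = sm_config.get("use_vhd")
    if value not in ("true", "false"):
        raise SRLoadError(
            "use_vhd missing or invalid (%r); refusing to load SR" % (value,)
        )
    return "vhd" if value == "true" else "raw"
```

With the key removed, `select_mode_silent({})` returns `"raw"` while `select_mode_failsafe({})` raises, so under the second model the SR would stay unplugged instead of being opened in the wrong format.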
use_vhd controls VHD vs. raw mode

| Scenario | Impact | Pre-conditions | Status |
|---|---|---|---|
| VHD format flag removal | Complete data loss for all SR VDIs on next PBD cycle | VHD-mode LVHD SR | Code-traced (LVHDSR.py format switch path confirmed) |
| Delayed detonation | Corruption dormant until reboot/failover | Standard production operation | Code-traced (load-time-only read confirmed) |
| BOC-1 chain | vm-admin escalates to pool-operator via BOC-1 S3, then removes use_vhd | BOC-1 available | Modeled (two-step chain) |
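
Because the change stays dormant until the next PBD cycle, comparing SR.sm_config against a baseline taken after SR creation is one way to catch it before it takes effect. The following is a minimal sketch under stated assumptions: `sm_config_drift` and `WATCHED_KEYS` are hypothetical names, and both inputs are plain mappings from an SR reference to its sm_config map (for example, extracted from the `sm_config` field of the records returned by the XenAPI call `SR.get_all_records`).

```python
from typing import Dict, List, Tuple

# Assumption: only driver-written keys of interest are watched.
WATCHED_KEYS = frozenset({"use_vhd"})

SmConfigs = Dict[str, Dict[str, str]]  # SR reference -> sm_config map


def sm_config_drift(
    baseline: SmConfigs,
    current: SmConfigs,
    watched=WATCHED_KEYS,
) -> List[Tuple[str, str, str]]:
    """Return (sr_ref, key, 'removed' | 'changed') for watched keys."""
    findings = []
    for sr_ref, base_cfg in baseline.items():
        cur_cfg = current.get(sr_ref)
        if cur_cfg is None:
            continue  # SR no longer present; not handled in this sketch
        for key in sorted(watched):
            if key not in base_cfg:
                continue  # key was never set on this SR
            if key not in cur_cfg:
                findings.append((sr_ref, key, "removed"))
            elif cur_cfg[key] != base_cfg[key]:
                findings.append((sr_ref, key, "changed"))
    return findings


baseline = {"OpaqueRef:sr1": {"use_vhd": "true", "allocation": "thick"}}
current = {"OpaqueRef:sr1": {"allocation": "thick"}}
print(sm_config_drift(baseline, current))
```

Any finding for use_vhd should be treated as urgent, since the key has to be restored before the next PBD plug.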
Monitor SR.sm_config for removal of the use_vhd key on LVHD SRs.
Compare SR.sm_config snapshots between scans for drift in driver-written keys.
Baseline SR.sm_config after SR creation.
See disclosure/vendor-detection-guidance.md.

Audit SR.sm_config records for expected use_vhd presence on VHD-mode SRs.
Record SR.sm_config values after SR creation and alert on changes.
Restrict the pool-operator role to trusted storage administrators.

Key immutability. Enforce immutability for driver-written SR.sm_config keys. Once set by the SM driver during SR.create, keys should not be modifiable via the API.
Add map_keys_roles. Protect use_vhd at _R_LOCAL_ROOT_ONLY to prevent any API user from modifying it.
Validate use_vhd as boolean. Reject removal on VHD-mode SRs. The driver should refuse to load if use_vhd is absent on an SR that was created in VHD mode.
Fail-safe behavior. When use_vhd is missing, the driver should refuse to load rather than silently falling back to raw LV mode.
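
The immutability and write-time validation recommendations above can be sketched as a guard in front of the map-removal operation. This is an illustrative Python model, not XAPI code: `guarded_remove`, `PROTECTED_KEYS`, `ProtectedKeyError`, and the `caller_is_driver` flag are all hypothetical names introduced for the example.

```python
from typing import Dict

# Assumption: keys the SM driver writes at SR.create and that must not
# be changed through the API afterwards.
PROTECTED_KEYS = frozenset({"use_vhd"})


class ProtectedKeyError(Exception):
    """Hypothetical error for attempts to modify a driver-owned key."""


def guarded_remove(
    sm_config: Dict[str, str], key: str, caller_is_driver: bool = False
) -> Dict[str, str]:
    """Return a copy of sm_config without `key`, unless the key is protected."""
    if key in PROTECTED_KEYS and not caller_is_driver:
        raise ProtectedKeyError("%s is driver-owned and cannot be removed" % key)
    updated = dict(sm_config)  # leave the caller's map untouched
    updated.pop(key, None)
    return updated
```

Rejecting the write at the API boundary keeps the stored configuration consistent with the on-disk format, so the load-time check becomes a second line of defense, not the only one.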
Upstream patches exist. They are held privately pending coordinated disclosure.
Disclosure:
LVHDSR.py:load() (use_vhd read), LVHDSR.py:create() (use_vhd set at creation)
disclosure/advisories/ssmc-security-advisory.md (SSMC-2)
research/investigations/sr-sm-config.md
Discovered and reported by Jakob Wolffhechel, Moksha.