A snapshot cannot be deleted or mounted
Original Publishing Date:
2022-04-13
Symptoms
-
Deleting a ploop snapshot fails with the following error message:
~# prlctl snapshot-delete 333 -i f941b9bd-bc78-4b8f-a846-c91ec61499d2
Delete the snapshot...
Failed to delete snapshot: Operation failed. Failed to delete snapshot: Error in ioctl(PLOOP_IOC_MERGE): Device or resource busy [3]
Failed to delete snapshot {f941b9bd-bc78-4b8f-a846-c91ec61499d2}
-
It is not possible to create a new backup of the affected ploop container:
~# vzabackup -F localhost -e 333
Starting backup operation for node 'pcs.container.org'...
* Operation with the Container pcs.container.org is started
* Preparing for backup operation
* Operation with the Container pcs.container.org is finished successfully.
Backup operation for node 'pcs.container.org' failed:
Backup failed
Cause
A previous backup of the container did not finish properly: it was terminated or it crashed.
As a result, the container's image is mounted twice.
Resolution
-
Find out which ploop devices hold root.hds:
~# grep /333/ /sys/block/ploop*/pdelta/*/image
/sys/block/ploop32064/pdelta/0/image:/vz/private/333/root.hdd/root.hds
/sys/block/ploop32064/pdelta/1/image:/vz/private/333/root.hdd/root.hds.{f37e6b00-7fcb-49b8-8942-58179ba3900d}
/sys/block/ploop43803/pdelta/0/image:/vz/private/333/root.hdd/root.hds
The output shows that root.hds is attached to two ploop devices (ploop32064 and ploop43803), that is, it is mounted twice.
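On a node with many containers the same check can be scripted. The following is a minimal sketch, not part of the product: find_shared_images is a hypothetical helper name, and it assumes the input has the "sysfs_path:image_path" form shown in the grep output above.

```shell
# Hypothetical helper: print every image that is attached to more than one
# ploop device. Reads "sysfs_path:image_path" lines on stdin.
find_shared_images() {
    awk '
        {
            sep = index($0, ":")              # first colon ends the sysfs path
            if (sep == 0) next                # skip lines in another format
            img = substr($0, sep + 1)
            # Assumes the /sys/block/<device>/pdelta/<n>/image layout,
            # so the device name is the 4th slash-separated field.
            split(substr($0, 1, sep - 1), parts, "/")
            dev = parts[4]
            if (!((dev, img) in seen)) {      # count each device once per image
                seen[dev, img] = 1
                count[img]++
                devs[img] = devs[img] " " dev
            }
        }
        END {
            for (img in count)
                if (count[img] > 1)
                    print img ":" devs[img]
        }
    '
}
```

Usage: grep /333/ /sys/block/ploop*/pdelta/*/image | find_shared_images
For the output above this prints the root.hds path followed by ploop32064 and ploop43803.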
-
Check which mount is the regular one and which is left over from the terminated backup:
~# cat /proc/mounts | grep 333 | grep ploop
/dev/ploop32064p1 /vz/root/333 ext4 rw,relatime,barrier=1,data=ordered,balloon_ino=12,pfcache_csum,pfcache=/vz/pfcache,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group 0 0
/dev/ploop43803p1 /vz/backup/333/tmpidyHKK/fs ext4 ro,relatime,barrier=1,data=ordered,balloon_ino=12,pfcache_csum 0 0
The second mount, under /vz/backup, most probably belongs to a backup. Verify this assumption:
~# cat /sys/block/ploop43803/pstate/cookie
vzbackup
The vzbackup cookie confirms that this is a backup-related mount, which should be unmounted.
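To find all such devices at once, the cookie check can be looped over every ploop device. This is a minimal sketch: ploop_devices_with_cookie is a hypothetical helper name, and the only cookie value taken from this article is "vzbackup".

```shell
# Hypothetical helper: list ploop devices whose pstate/cookie equals a value.
# $1: cookie to look for (this article observed "vzbackup")
# $2: sysfs block directory; defaults to /sys/block
ploop_devices_with_cookie() {
    cookie=$1
    sysfs=${2:-/sys/block}
    for f in "$sysfs"/ploop*/pstate/cookie; do
        [ -r "$f" ] || continue           # unmatched glob or unreadable file
        if [ "$(cat "$f")" = "$cookie" ]; then
            dev=${f#"$sysfs"/}            # strip the sysfs prefix
            echo "/dev/${dev%%/*}"        # keep only the device name
        fi
    done
}
```

Usage: ploop_devices_with_cookie vzbackup
The function only reports devices. Whether a reported device is a leftover mount still has to be confirmed with the steps in this article.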
-
Confirm that the mount is not used by any process:
~# lsof 2> /dev/null | grep /vz/backup/333/tmpidyHKK/fs
~#
If lsof shows that the mount is still in use, look for running backup (vzlpl) processes:
~# ps aux | grep vzlpl
root 429273 0.0 0.0 124800 7552 ? S 2013 0:00 /opt/pva/agent/bin/vzlpl /var/opt/pva/agent/tmp.JJbakK
root 520460 0.0 0.0 103256 848 pts/6 S+ 07:08 0:00 grep vzlpl
root 568003 0.0 0.0 123968 4168 ? S 2013 0:00 /opt/pva/agent/bin/vzlpl /var/opt/pva/agent/tmp.QxeOqd
Check whether each process is actually operational and not stuck. For example, confirm that the temporary files in /var/opt/pva/agent/ still exist. If /var/opt/pva/agent/tmp.JJbakK and /var/opt/pva/agent/tmp.QxeOqd do not exist, kill both vzlpl processes and run lsof again to confirm that the mount is no longer in use.
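The temporary file check can be sketched as follows. stale_vzlpl_pids is a hypothetical helper name, and the sketch assumes that the first argument of vzlpl is its temporary file, as in the ps output above. It only prints PIDs and does not kill anything.

```shell
# Hypothetical helper: read "PID COMMAND ARG" lines on stdin and print the
# PIDs of vzlpl processes whose temporary file argument no longer exists.
stale_vzlpl_pids() {
    while read -r pid cmd arg rest; do
        case $cmd in
            */vzlpl)
                [ -n "$arg" ] || continue      # no argument, nothing to check
                [ -e "$arg" ] || echo "$pid"   # temporary file is missing
                ;;
        esac
    done
}
```

Usage: ps -e -o pid= -o args= | stale_vzlpl_pids
Review the printed PIDs before killing any process. A missing temporary file is only the indicator suggested above, not proof that the process is stuck.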
-
Unmount the device:
~# umount /vz/backup/333/tmpidyHKK/fs
~# ploop umount -d /dev/ploop43803
Unmounting device /dev/ploop43803
Internal content
Creating a new backup should automatically unmount leftover mounts (PVA-33402). However, it can still fail to unmount them properly, and the backup operation might fail as a result.
Temporary snapshots can also be checked with:
# ploop list | grep "101" | grep VZABasicFunctionalityLocal
ploop35802 /pcs/101/root.hdd/root.hds VZABasicFunctionalityLocal
ploop44017 /pcs/101/root.hdd/root.hds VZABasicFunctionalityLocal
ploop24305 /pcs/101/root.hdd/root.hds VZABasicFunctionalityLocal