OLVM – Disk state stuck in finalizing state

 

Intro

It’s essential to have a proper backup mechanism for virtualization
infrastructure. Test the backup and recovery method at least once every three
months to validate that everything works as expected, and document the
recovery procedure so there are no surprises in a real recovery scenario.
Organizations should be ready to address unexpected failures at any time.

For Oracle Linux Virtualization Manager (OLVM) 4+ environments, you can use API
v4 to invoke all backup-related tasks. The import/export mode defines how
backups and restores are done. OLVM (with API v4) supports 3 modes:

1. Disk attachment:

Exports VM metadata (in OVF format) with separate disk files (in RAW
format) via a Proxy VM with the Node installed.

  • Supports OLVM 4.0+
  • No incremental backup
  • Proxy VM required in each cluster – used for the disk attachment process

2. Disk image transfer:

Exports VM metadata (in OVF format) with disk snapshot chains as
separate files (QCOW2 format):

  • Supports OLVM 4.2+/oVirt 4.2.3+
  • Supports incremental backup
  • Disk images are transferred directly from the API (no Proxy VM required)

3. SSH transfer:

Assumes that all data transfers go directly from the hypervisor over the SSH
protocol.

The following URL lets you filter the backup tools that support Oracle Linux
Virtualization Manager.

Supported third-party backup tools:

https://apexapps.oracle.com/pls/apex/f?p=10263:17::::::

Figure 1: Third-party backup tools

In some cases, when the network connection is disrupted during a disk image
transfer, the disk gets stuck in the finalizing state.

Note:
If disks are in the finalizing state, you cannot put the KVM host into
maintenance mode.

Figure 2: Attempting to put the KVM host into maintenance mode.

For a clear explanation of disk image transfer, see:
https://storware.gitbook.io/backup-and-recovery/protecting-virtual-machines/virtual-machines/oracle-linux-virtualization-manager

Disk image transfer API:

This API allows the export of individual snapshots directly from the OLVM
manager. So instead of installing multiple Proxy VMs, you can have a
single external Node installation, which invokes APIs via the OLVM
manager.

In this article, I will cover how to recover when a disk is stuck in the
finalizing state. The relevant My Oracle Support (MetaLink) note is: OLVM:
Unable to put KVM host to maintenance mode due to Image transfer in progress
(Doc ID 2915392.1).

Figure 3 shows how disks stuck in the finalizing state appear.

Figure 3: Disk stuck in finalizing state.

The best approach is to query the image transfer state in the OLVM engine database. This shows which disks are stuck in the finalizing state.

Note: All the commands should be executed from the OLVM engine server.

Identify the issue



[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |     7 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |     7 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)

[root@local-olvm-engine ~]#
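
On a busy engine the image_transfers table may hold more rows than the stuck ones, so it can help to filter on the phase. The following is a minimal sketch, assuming phase 7 is the finalizing phase as in the output above; the function name stuck_transfers_sql is my own and not part of OLVM. It prints the SQL on any host and only runs it where the engine-psql.sh wrapper exists.

```shell
# Sketch: list only the transfers that are still in phase 7 (finalizing).
# ENGINE_PSQL is the wrapper script used throughout this article.
ENGINE_PSQL="/usr/share/ovirt-engine/dbscripts/engine-psql.sh"

# Hypothetical helper: build the query so it can be reviewed before it runs.
stuck_transfers_sql() {
    printf '%s\n' "select command_id, phase, disk_id, last_updated from image_transfers where phase = 7 order by last_updated;"
}

# Run the query only where the wrapper exists (the engine server);
# on any other host, just print the SQL for review.
if [ -x "$ENGINE_PSQL" ]; then
    "$ENGINE_PSQL" -c "$(stuck_transfers_sql)"
else
    stuck_transfers_sql
fi
```

The where clause is the only difference from the query shown above, so the output columns stay the same.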

Solution

As per the MetaLink note, you can update an image transfer that is stuck in phase 7 to either 9 (failed) or 10 (completed), depending on the situation.



[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = 'dcc47178-ebb1-47c1-900b-bc9753e12378'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = '77a820ab-c580-4b46-9c0c-22102a0ce706'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |    10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |    10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)

[root@local-olvm-engine ~]#
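
Typing UPDATE statements by hand against the engine database is easy to get wrong. The following is a minimal sketch of a helper that prints the statement instead of running it. The function name build_phase_update, the UUID check, and the extra "AND phase = 7" guard are my own assumptions and are not taken from the MetaLink note.

```shell
# Sketch: print a guarded UPDATE for one stuck transfer.
# build_phase_update is a hypothetical helper, not part of OLVM.
build_phase_update() {
    command_id="$1"
    new_phase="$2"

    # Accept only the two target phases mentioned in this article.
    case "$new_phase" in
        9|10) ;;
        *) echo "error: phase must be 9 or 10" >&2; return 1 ;;
    esac

    # Accept only a UUID-shaped command_id (8-4-4-4-12 hex digits).
    if ! printf '%s\n' "$command_id" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
        echo "error: command_id is not a UUID" >&2
        return 1
    fi

    # "AND phase = 7" leaves rows alone that are no longer in phase 7.
    printf "UPDATE image_transfers SET phase = %s WHERE command_id = '%s' AND phase = 7;\n" \
        "$new_phase" "$command_id"
}

# Example with the first command_id from the output above.
build_phase_update dcc47178-ebb1-47c1-900b-bc9753e12378 10
```

After reviewing the printed statement, it can be passed to the wrapper on the engine server with engine-psql.sh -c "$(build_phase_update <command_id> 10)".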

Validate

Run the following command to validate the disk status. The disk should also change to the OK state in the OLVM web portal.


[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |    10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |    10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
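
If there are many transfers, reading the table by eye gets tedious. The following is a minimal sketch that counts the rows still in phase 7 from the table output shown above. The function name count_finalizing is hypothetical, and the parsing assumes the default four-column, "|"-separated layout of that output.

```shell
# Sketch: count rows still in phase 7 in the select output.
# count_finalizing is a hypothetical helper; it reads the table on stdin.
count_finalizing() {
    # Data rows have four "|"-separated columns; the phase is the second one.
    # The header row has "phase" in that column and the dashed line has no "|".
    awk -F'|' 'NF == 4 { gsub(/ /, "", $2); if ($2 == "7") n++ } END { print n + 0 }'
}

# Example with the validation output above: both rows are in phase 10.
count_finalizing <<'EOF'
              command_id              | phase |               disk_id                |        last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
 dcc47178-ebb1-47c1-900b-bc9753e12378 |    10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
 77a820ab-c580-4b46-9c0c-22102a0ce706 |    10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
EOF
# prints 0
```

On the engine server, the select command can be piped straight into count_finalizing; a result of 0 means no transfer is left in the finalizing phase.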

Conclusion

When an organization hosts critical VMs in an OLVM virtualization environment, it needs to plan its backup method. There can be situations where you have to recover a VM from backup, so backup and recovery need to be tested and documented.

To resolve disk state errors we need to update the PostgreSQL database, so I recommend backing up the OLVM engine before making any changes. It is also better to consult an Oracle engineer to get a more precise understanding before
changing the image_transfers phase.
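
For the engine backup mentioned above, the following is a minimal sketch that composes an engine-backup command with a timestamped file name. The helper name backup_cmd, the directory /root/engine-backups, and the file naming are my own assumptions; the sketch only prints the command so it can be reviewed before running it on the engine server.

```shell
# Sketch: compose an engine-backup command with a timestamped file name.
# backup_cmd is a hypothetical helper; the default directory is an assumption.
backup_cmd() {
    backup_dir="${1:-/root/engine-backups}"
    stamp="$(date +%Y%m%d-%H%M%S)"
    # --scope=all covers the engine database and the configuration files.
    printf 'engine-backup --mode=backup --scope=all --file=%s/engine-%s.backup --log=%s/engine-%s.log\n' \
        "$backup_dir" "$stamp" "$backup_dir" "$stamp"
}

# Print the command for review; run it manually once it looks right.
backup_cmd
```

Keeping the backup file and its log side by side with the same timestamp makes it easier to match them later if a restore is needed.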