Recently, we ran into a problem where the “VMware VirtualCenter Service” on one of our vCenter servers crashed after every restart. In my last post here, I documented the whole incident and its solution. If you read that post, you’ll know that I had to delete a considerable number of accumulated snapshots from the affected VM to bring it back into operation. As these snapshots had piled up because of a misbehaving VMware Data Recovery (VDR) appliance, I shut the appliance down before deleting them, so that it couldn’t make things worse while I fixed the problem.
As mentioned in that post, my actions fixed the problem, but deleting all those snapshots introduced another one: the VMware Data Recovery appliance would not come up and failed with the error “Unable to access file <unspecified filename> since it is locked”. At first, it didn’t make any sense. Thinking about it, though, made me realise what had happened: the appliance had not been cleanly removing the snapshots it took to back up the VM (which is how they accumulated), so in its last state, just before I shut it down, it still had one of the affected VM’s disk snapshots mounted. That snapshot, if you remember the post, wasn’t there anymore, as I had deleted them all to recover the VM!
If you see the same issue for whatever reason, here is what you’ll have to do to fix it:
- Right-click on the VDR appliance in vCenter Console and go to “Edit Settings”.
- Look at the Hard Disks. Chances are that the first one is the VDR appliance itself. You might have more backup datastores mounted. Hopefully, you’ll be able to recognize them by their location and size.
- For any hard disk you can’t recognise (hopefully, just one of them), check the “Disk File” field. If it points to the affected VM, that’s your culprit! Also check whether the “Independent” check box under “Mode” is ticked. That’s a pretty reliable clue too.
- Once identified, remove this hard disk by clicking “Remove”. Make sure you keep the default option, “Remove from virtual machine”, before clicking OK. Don’t select the option that also deletes the files from disk!
- Your VDR appliance should be able to start now.
- Once it’s up and running, look at its “Backup” tab. Run the job already there to verify that it can now back up the VM reliably and gets rid of the snapshot at the end of the process.
- Once confirmed, go to the “Restore” tab. You’ll find that a new backup chain has been started for this machine. That will be your reliable backup chain for the future. Mark the old backup chain for deletion, as it will no longer be reliable.
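If you have a lot of disks attached and would rather apply the identification rule from the steps above mechanically, here is a rough Python sketch of it. It is only an illustration: it doesn’t talk to vCenter at all, the `VirtualDisk` class and `find_stale_disks` function are my own made-up helpers, and the datastore names, folder names and file paths are hypothetical stand-ins for whatever you copy out of the “Edit Settings” dialog. It also assumes that the VM’s folder on the datastore is named after the VM, which is the usual layout but not guaranteed.

```python
from dataclasses import dataclass


@dataclass
class VirtualDisk:
    """One row from the 'Edit Settings' dialog (hypothetical helper type)."""
    label: str         # e.g. "Hard disk 3"
    disk_file: str     # the "Disk File" field, e.g. "[datastore1] vm/vm.vmdk"
    independent: bool  # True if the "Independent" box under "Mode" is ticked


def find_stale_disks(disks, affected_vm_folder):
    """Return the disks whose "Disk File" lives in the affected VM's folder."""
    suspects = []
    for disk in disks:
        # Drop the "[datastore] " prefix, then keep the first path component.
        path = disk.disk_file.split("] ", 1)[-1]
        folder = path.split("/", 1)[0]
        if folder == affected_vm_folder:
            suspects.append(disk)
    return suspects


# Hypothetical example values, not taken from a real environment.
disks = [
    VirtualDisk("Hard disk 1", "[datastore1] vdr-appliance/vdr-appliance.vmdk", False),
    VirtualDisk("Hard disk 2", "[backup-store] vdr-appliance/vdr-appliance_1.vmdk", False),
    VirtualDisk("Hard disk 3", "[datastore1] vcenter01/vcenter01-000042.vmdk", True),
]

for disk in find_stale_disks(disks, "vcenter01"):
    # "Independent" is only supporting evidence, so it is reported, not required.
    clue = "independent mode, strong clue" if disk.independent else "not independent"
    print(f"{disk.label}: {disk.disk_file} ({clue})")
```

The match is made on the “Disk File” location alone, and the “Independent” flag is only reported alongside it, since the location is what actually ties the disk to the affected VM. Whatever the sketch flags, you’d still remove the disk by hand as described above, keeping “Remove from virtual machine” selected.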
Keep an eye on the backups for a while, just in case, but hopefully they’ll keep running smoothly from now on.