BRC - Troubleshoot AutoVerify

Written By Tami Sutcliffe (Super Administrator)

Updated at August 31st, 2023

[Source: https://portal.axcient.com/6688/configuring-the-autoverify-feature-data-integrity-checks/ ]

AutoVerify validates and tests the integrity and recoverability of your protected services. This feature performs a series of deep system integrity checks within the running virtual machine to ensure that the protected device backup is both healthy and ready to recover in the event of a disaster.

You might encounter one or more of the following errors when working with the AutoVerify feature:

  • The appliance may be unable to start the Virtual Machine. 
  • Communications with the VM may fail. 
  • Specific AutoVerify test cases may fail.
  • The protected device may have corrupted data.

 

The following ten examples give you an overview of ways to identify and resolve some of these issues.

 

1. Validation VM failed with error 123: No response from Boot VM and validation service timed out

#-----------------------------------#
#       SYSTEM VALIDATION           #
#-----------------------------------#

Recovery point used: <RP name/time>

2019-09-04 15:39:04.165852 Starting system validation ...

2019-09-04 15:46:27.858875 Error# 123: No response from Boot VM, Validation service timed out. Validation could not be completed within the time specified by the user. Current VM validation timeout is <VM timeout limit> hours/mins. If problem persists, please try increasing the VM validation timeout.

2019-09-04 15:46:27.858929 Validation failed

Possible reasons for this failure include:

  • The protected system image is corrupt or otherwise unable to boot as a VM.
  • Startup of the Operating System took longer than the timeout specified by the user (long-running pending Windows updates can cause timeouts).
  • Resources allocated to the validation VM are not sufficient.

To troubleshoot this failure:

  1. Navigate to the Device Configuration page and start the Test Failover process.
  2. If a pending Windows update installation is causing delays, reboot the protected device to allow the Windows Update installation to complete and then run a new incremental backup of the system.
  3. If the protected device will not boot, exhibits a Blue-Screen crash, or displays other errors, contact Axcient support for assistance in troubleshooting the underlying Boot VM failure.
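
When comparing a failure against the configured VM validation timeout, it can help to measure how long validation actually ran before the error was logged. The following is a minimal Python sketch, assuming the log keeps the timestamp and error-number layout shown in the excerpt above. The function names are illustrative and are not part of any Axcient tool.

```python
import re
from datetime import datetime

# Timestamp layout used in the log excerpts in this article.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Matches "Error# 123", "Error #1", and "Errors #30" (variants seen in this article).
ERROR_PATTERN = re.compile(r"Errors?\s*#\s*(\d+)")


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    parts = line.split(" ", 2)
    if len(parts) < 2:
        return None
    try:
        return datetime.strptime(parts[0] + " " + parts[1], TIMESTAMP_FORMAT)
    except ValueError:
        return None


def summarize_validation(log_text):
    """Return (error_number, seconds_elapsed) for the first error in a log."""
    start = None
    for line in log_text.splitlines():
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # banner lines and "Recovery point used" carry no timestamp
        if start is None and "Starting" in line:
            start = stamp
        match = ERROR_PATTERN.search(line)
        if match and start is not None:
            return int(match.group(1)), (stamp - start).total_seconds()
    return None, None


SAMPLE_LOG = """\
2019-09-04 15:39:04.165852 Starting system validation ...
2019-09-04 15:46:27.858875 Error# 123: No response from Boot VM, Validation service timed out.
2019-09-04 15:46:27.858929 Validation failed
"""

error_number, elapsed = summarize_validation(SAMPLE_LOG)
print(error_number, round(elapsed))
```

For the sample excerpt, the gap between the start line and the error line is a little under seven and a half minutes, which can then be compared with the timeout configured for the validation VM.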

2. Validation VM failed with error 124: Validation aborted since it was running for unusually long time

#-----------------------------------#
#       SYSTEM VALIDATION           #
#-----------------------------------#

Recovery point used: <RP name/time>

2019-09-04 15:39:04.165852 Starting system validation ...

2019-09-04 15:46:27.858875 Error# 123: No response from Boot VM, Validation service timed out. Validation could not be completed within the time specified by the user. Current VM validation timeout is <VM timeout limit> hours/mins. If problem persists, please try increasing the VM validation timeout.

2019-09-04 15:46:27.858929 Validation failed

Possible reasons for this failure include:

  • The protected system image is corrupt or otherwise unable to boot as a VM.
  • Startup of the operating system took longer than the timeout specified by the user (long-running pending Windows updates can cause timeouts).
  • Resources allocated to the validation VM are not sufficient.

To troubleshoot this failure:

  1. Navigate to the Device Configuration page and start the Test Failover process.
  2. If a pending Windows update installation is causing delays, reboot the protected device to allow the Windows Update installation to complete and then run a new incremental backup of the system.
  3. If the protected device will not boot, exhibits a Blue-Screen crash, or displays other errors, contact Axcient support for assistance in troubleshooting the underlying Boot VM failure.

3. Validation VM Failed with error 125: Boot VM got rebooted unexpectedly

#------------------------------------# 
#        SYSTEM VALIDATION           #
#------------------------------------#
Recovery point used: <RP name/time>
2019-09-02 13:50:42.965337 Starting system validation ...
[Errno 5] Input/output error
2019-09-02 13:55:40.930968 Error# 125: Boot VM got rebooted unexpectedly while running validation checks.
2019-09-02 13:55:40.931056 System validation failed

Possible reasons for this failure include:

  • A reboot might have been initiated by a Windows update process.
  • An unexpected crash occurred on the Validation VM Windows operating system.

To troubleshoot this failure, allow the Windows update process to complete on the protected device and then run a new incremental backup of the system.

4. Windows system files integrity checks Failed with Error #1

#------------------------------------#
#        SYSTEM VALIDATION           #
#------------------------------------#
Recovery point used: <RP name/time>
2019-08-28 09:28:57.168444 Starting system validation ...
Server output: 
Windows Resource Protection could not start the repair service.
2019-08-28 10:06:25.602904 Error #1: Error occurred error while running the Windows System File Checker tool
2019-08-28 10:07:18.415000 System validation failed

Possible reasons for this failure include:

  • The backup image data could be corrupt on the BDR.
  • The protected device could have file system errors.

To troubleshoot this failure:

  1. Run sfc /verifyonly on the protected system to determine if the problem lies with the backup or the protected system.
  2. If the problem is with the protected system:
    • Run sfc /scannow to repair the corrupted system files and schedule a reboot, if necessary, to complete the repair.
    • Enable the FULL SCAN backup mode for the next job run.
    • After one successful backup run, revert to the original backup mode (OPTIMIZED SCAN / BLOCK LEVEL SCAN).
  3. If the protected system tests clean, you can attempt to repair the problem by performing a full backup scan of the protected system:
    • In the Job Edit page, select FULL SCAN backup mode to synchronize any differences with the backup image.
  4. If subsequent Windows system file integrity checks fail, open a ticket with Axcient support for assistance.
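
The branch between steps 2 and 3 depends on what `sfc /verifyonly` reports on the protected system. The following Python sketch shows one way to map that output to a branch. The message fragments it matches are assumptions based on English-language Windows output and may differ by Windows version or locale; the function name is illustrative.

```python
def classify_sfc_output(output_text):
    """Map `sfc /verifyonly` output text to a branch of the steps above.

    The matched phrases are assumed English-language messages, not a
    complete or guaranteed list.
    """
    text = output_text.lower()
    if "did not find any integrity violations" in text:
        # Protected system tests clean: the backup image is the likely culprit.
        return "clean: select FULL SCAN backup mode on the Job Edit page"
    if "found integrity violations" in text:
        # Protected system has damaged files: repair first, then resync.
        return "affected: run sfc /scannow, then enable FULL SCAN for the next job run"
    return "inconclusive: review the sfc output manually"


sample = "Windows Resource Protection did not find any integrity violations."
print(classify_sfc_output(sample))
```

Anything that matches neither phrase, including the "could not start the repair service" message shown in the log excerpt, is treated as inconclusive so that it gets a manual review.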

5. SQL integrity checks Failed with Error #10 Unable to launch MSSQL server

#------------------------------------#
#          MSSQL VALIDATION          #
#------------------------------------#
2019-08-28 09:28:57.168444 Starting MSSQL full integrity checks ...
Server output:  
MSSQL_TOOL="C:\Program Files\Microsoft SQL Server\100\Tools\Binn\\sqlcmd.exe"
The SQL Server (MSSQLSERVER) service is not started.
"Error! Cannot restart service" 
2019-08-28 10:07:18.414860 Error #10 Unable to launch MSSQL server, cannot proceed with MSSQL validation.
2019-08-28 10:07:18.415000 MSSQL validation failed

Possible reasons for this failure include:

  • Credentials provided in the UMC Data Integrity Checks page do not have sufficient privileges.
  • The MS SQL Server is unreachable.

To troubleshoot this failure, verify the credentials provided in the UMC Data Integrity Checks page. Windows or SQL administrator credentials are required to perform SQL integrity checks.

6. SQL integrity checks Failed with Error #13: Cannot connect to MSSQL server using the Windows administrator credentials provided

#------------------------------------#
#          MSSQL VALIDATION          #
#------------------------------------# 
2019-09-02 15:35:00.760349 Starting MSSQL validation ...
Server output:
MSSQL_TOOL="C:\Program Files\Microsoft SQL Server\110\Tools\Binn\\sqlcmd.exe"
The SQL Server (MSSQLSERVER) service is stopping...
The SQL Server (MSSQLSERVER) service was stopped successfully.
The SQL Server (MSSQLSERVER) service is starting......
The SQL Server (MSSQLSERVER) service was started successfully.
Sqlcmd: Error: Microsoft SQL Server Native Client 11.0: Login failed for user 'admin'..
"Error! Cannot connect to MSSQL"
2019-09-02 15:35:00.760349 Error# 13: Cannot connect to MSSQL server using the credentials provided by the user. Verify the credentials provided.
2019-09-02 15:35:00.760441 MSSQL Validation failed

Possible reasons for this failure include:

  • Windows administrator credentials provided in the UMC Data Integrity Checks page are no longer valid.

To troubleshoot this failure, enter valid Windows administrator credentials to allow Axcient AutoVerify to perform SQL integrity verification.

7. SQL integrity checks Failed with Error #20: Cannot connect to MSSQL server using the SQL administrator credentials provided

#------------------------------------#
#          MSSQL VALIDATION          #
#------------------------------------# 
2019-09-02 15:35:00.760349 Starting MSSQL validation ...
Server output:
MSSQL_TOOL="C:\Program Files\Microsoft SQL Server\110\Tools\Binn\\sqlcmd.exe"
The SQL Server (MSSQLSERVER) service is stopping...
The SQL Server (MSSQLSERVER) service was stopped successfully.
The SQL Server (MSSQLSERVER) service is starting......
The SQL Server (MSSQLSERVER) service was started successfully.
Sqlcmd: Error: Microsoft SQL Server Native Client 11.0: Login failed for user 'admin'..
"Error! Cannot connect to MSSQL"
2019-09-02 15:35:00.760349 Error# 20: Cannot connect to MSSQL server using the credentials provided by the user. Verify the credentials provided.
2019-09-02 15:35:00.760441 MSSQL Validation failed

Possible reasons for this failure include:

  • SQL administrator credentials provided in the UMC Data Integrity Checks page are no longer valid.

To troubleshoot this failure, enter valid SQL administrator credentials to allow Axcient AutoVerify to perform SQL integrity verification.

8. SQL integrity checks Failed with Error #30: Corrupted pages detected

#------------------------------------#
#          MSSQL VALIDATION          #
#------------------------------------# 
2019-08-30 11:54:06.508736 Starting MSSQL validation .... 
Server output:
MSSQL_TOOL="C:\Program Files\Microsoft SQL Server\110\Tools\Binn\\sqlcmd.exe"
The following services are dependent on the SQL Server (MSSQLSERVER) service.
Stopping the SQL Server (MSSQLSERVER) service will also stop these services.
SQL Server Agent (MSSQLSERVER)
"There are errors at msdb.dbo.suspect_pages:"
Msg 1222, Level 16, State 24, Server WIN-61TOB9QECOP, Line 1
Lock request time out period exceeded.
"Error! There are corrupted pages"
2019-08-30 13:41:48.892012 Errors #30: Corrupted pages detected. Recovery point "<RP name/ time>" contains corrupted data. Running integrity validation on protected device is recommended. 
2019-08-30 13:41:48.892139 MS validation failed

Note: The AutoVerify SQL integrity check detects suspect pages when it inspects the suspect_pages table in msdb. When the SQL Server Database Engine reads a database page with a logical consistency error (Error 824, see MSSQLSERVER_824 for more information), the page is considered suspect and its page ID is recorded in the suspect_pages table in msdb.
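
To see what that table contains on the protected device, you can query it directly. The following Python sketch holds a query against msdb.dbo.suspect_pages and a small helper that filters for pages still flagged as damaged. It does not show how AutoVerify itself performs the check. The event_type meanings follow Microsoft's documentation for the table, and the sample rows are hypothetical.

```python
# Query to run on the protected device (for example, with sqlcmd or SSMS).
SUSPECT_PAGES_QUERY = """
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;
"""

# event_type meanings for msdb.dbo.suspect_pages.
EVENT_TYPES = {
    1: "823 or 824 error (other than bad checksum or torn page)",
    2: "bad checksum",
    3: "torn page",
    4: "restored",
    5: "repaired by DBCC",
    7: "deallocated by DBCC",
}

# Event types that indicate a page is still considered damaged.
UNRESOLVED = {1, 2, 3}


def unresolved_pages(rows):
    """Return only the rows whose event_type marks an unresolved suspect page."""
    return [row for row in rows if row["event_type"] in UNRESOLVED]


# Hypothetical rows, shaped like the query result, for illustration only.
sample_rows = [
    {"database_id": 5, "file_id": 1, "page_id": 153, "event_type": 1},
    {"database_id": 5, "file_id": 1, "page_id": 154, "event_type": 5},
]

for row in unresolved_pages(sample_rows):
    print(row["page_id"], EVENT_TYPES[row["event_type"]])
```

Rows with event types 4, 5, or 7 record pages that were already restored, repaired, or deallocated, so only types 1 through 3 are treated as open problems here.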

Possible reasons for this failure include:

  • The backup image data could be corrupt on the BDR.
  • Data could be corrupt on the protected device due to a problem in the I/O subsystem, such as a failing disk drive, disk firmware problems, or a faulty device driver.

To troubleshoot this failure:

  1. Review the SQL Server error log on the protected device and check for Error 824.
  2. Run hardware diagnostics and correct any hardware problems.
  3. Enable FULL SCAN backup mode for the next job run. After one successful backup run, revert to the original backup mode (OPTIMIZED SCAN / BLOCK LEVEL SCAN).
  4. If the protected system tests clean, you can attempt to repair the problem by performing a full backup scan of the protected system. In the Job Edit page, select FULL SCAN backup mode to synchronize any differences in the backup image.

9. SQL integrity checks Failed with Error #40: Issues detected in MS SQL Database running on the validation VM

#------------------------------------#
#          MSSQL VALIDATION          #
#------------------------------------#  
2019-08-30 11:54:06.508736 Starting MSSQL validation .... 
"Warning! There are errors in validation DB" 
2019-09-04 11:53:44.787535 Error# 40: Issues detected in MSSQL Database on validation VM.
2019-09-04 11:53:44.787671 MSSQL validation failed

Note: This failure occurs when database consistency errors are reported by the SQL command DBCC CHECKDB that was run on the validation VM. DBCC CHECKDB is an SQL command that checks the logical and physical integrity of all the objects in a specified database.

Possible reasons for this failure include:

  • The backup image data could be corrupt on the BDR.
  • Possible issues with the protected device, including:
    • There may be a problem with a drive or the SQL Server engine.
    • The file system of the database may be corrupt.
    • Pages in memory may be damaged.
    • There may be a hardware problem.

To troubleshoot this failure:

  1. Run DBCC CHECKDB on the protected device to check if there are any consistency errors with the SQL database running on the protected device.
  2. Run chkdsk on the protected device to detect and repair the file system corruption.
  3. Check the Windows System event log for I/O errors, or use the SQLIOSim tool to check the I/O integrity of the disk system.
  4. When issues on the protected device are fixed, enable FULL SCAN backup mode for the next job run. After one successful backup run, revert to the original backup mode (OPTIMIZED SCAN / BLOCK LEVEL SCAN).
  5. If the protected system tests clean, you can attempt to repair the problem by performing a full backup scan of the protected system. In the Job Edit page, select FULL SCAN backup mode to synchronize any differences within the backup image.
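
Step 1 above can be run from a command prompt with sqlcmd. The following Python sketch builds such a command line, assuming Windows authentication and a sqlcmd binary on the PATH. The server and database names are hypothetical, and the helper names are illustrative.

```python
def quote_identifier(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def checkdb_command(server, database):
    """Build a sqlcmd argument list that runs DBCC CHECKDB on one database."""
    query = "DBCC CHECKDB ({}) WITH NO_INFOMSGS, ALL_ERRORMSGS;".format(
        quote_identifier(database)
    )
    # -S: server, -E: Windows authentication,
    # -b: return an error level when a SQL error occurs, -Q: run the query and exit
    return ["sqlcmd", "-S", server, "-E", "-b", "-Q", query]


# Hypothetical server and database names, for illustration only.
command = checkdb_command("localhost", "SalesDB")
print(command[-1])
```

NO_INFOMSGS suppresses informational messages and ALL_ERRORMSGS reports every error found per object, which keeps the output focused on consistency errors.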

10. Exchange database integrity checks failed

#------------------------------------#
#          EXCHANGE VALIDATION       #
#------------------------------------# 
2019-09-06 15:58:22.858457 There are some errors(#1) occurred during Exchange full integrity validation
2019-09-06 15:58:22.858576 Validation failed  

An error occurred during VM validation (timeout 900 min).
Errors occurred during task execution.

Note: This error is thrown if the Exchange database could not be mounted on the validation VM.

Possible reasons for this failure include:

  • The backup image data could be corrupt on the BDR.
  • Possible problems with the protected device, including:
    • The file system of the database may be corrupt.
    • Pages in memory may be damaged.
    • There may be a hardware problem.
    • Antivirus software may have deleted or modified transaction logs.

To troubleshoot this failure:

  1. Use the ESEutil utility to detect and repair a corrupted Exchange database:
    • Run ESEutil /K on the protected device to check if EDB data or log files are corrupted.
    • If corruption is detected on the protected device, run ESEutil /P to repair.
    • Enable FULL SCAN backup mode for the next job run. After one successful backup run, revert to the original backup mode (OPTIMIZED SCAN / BLOCK LEVEL SCAN).
  2. If the protected system tests clean, you can attempt to repair the problem by performing a full backup scan of the protected system. In the Job Edit page, select FULL SCAN backup mode to synchronize any differences within the backup image.
  3. If subsequent EDB integrity checks fail, open a ticket with Axcient support for assistance.
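
The ESEutil /K checksum run in step 1 can be scripted when several database files need checking. The following Python sketch builds and optionally runs that command, assuming eseutil is on the PATH of the protected device. The database path is hypothetical, and the helper names are illustrative.

```python
import subprocess


def eseutil_checksum_command(edb_path):
    """Build the argument list for an ESEutil checksum (/K) run on one file."""
    return ["eseutil", "/k", edb_path]


def run_checksum(edb_path):
    """Run the checksum and return (exit_code, output).

    Only meaningful on a Windows system where eseutil is installed.
    """
    result = subprocess.run(
        eseutil_checksum_command(edb_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


# Hypothetical database path, for illustration only.
command = eseutil_checksum_command(r"D:\ExchangeDB\Mailbox01.edb")
print(" ".join(command))
```

The sketch returns the exit code and output without interpreting them, so review the ESEutil output itself to decide whether a repair is needed.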