Testing SQL Server backups and offloading integrity checks

We have a fair number of pretty large databases, and I was having a hard time scheduling full integrity checks to run at least once a week. For a long time, I only ran physical_only with my fingers crossed, but finally, I got a new boss and a new server in order to restore (and therefore test) SQL Server backups. Plus, we were acquired by another company that wanted us to test backups! Yay! The stars aligned! I wanted to do this for a long time but lacked the server with enough resources to do it. I got a dev server to avoid licensing costs, and I just restore down there and do integrity checks on the large databases.

In fact, I restore smaller databases that I do integrity checks on in place in prod, just to make sure they would restore if needed. Plus, our parent company wants everything restored at least quarterly so I figured might as well restore the smaller ones at least weekly, and this way, I already have scripts in case prod blows up and I have to restore them from backup (hopefully that never happens!).

Here’s my process:

  1. Get a dev SQL Server up and running with enough storage to hold multiple terabyte-sized databases. I don’t restore and do integrity checks on all of them at the same time, but in case there is overlap, I don’t want to have to worry about disk space constraints.
    Here are the specs on my dev server for reference:
    • 4 cores
    • 64 GB RAM
    • 12 TB drive for data/log files
    • Separate drive for system database as per our usual setup
    • Separate drive for TempDB as per our usual setup, and here I made it 500 GB
  2. Write production backups to a centralized location (which you should be doing already) that’s also accessible in dev. I use Ola Hallengren’s scripts for this to have the full/diff/log backup folder structure and because Ola scripts are the best anyway:
  3. Restore with a SQL Agent job with Greg White’s DatabaseRestore stored procedure script. If you are using Ola’s backup solution, this database restore script simplifies things greatly and you can loop through a folder structure to find the correct backup files to restore. If you aren’t using Ola’s scripts, you may be able to use this database restore script, but that’s up to you to sort out how. I put the DatabaseRestore stored proc in the master database, but that’s also up to you where you want to store it.
    DatabaseRestore script: https://github.com/GregWhiteDBA/MSSQLAutoRestore
    Nice explanation of DatabaseRestore script with sample calls: https://www.brentozar.com/archive/2017/03/databaserestore-open-source-database-restore-stored-procedure/
  4. Run an integrity check on large databases with SQL Agent job. I use Ola scripts for this.
  5. Drop the database with SQL Agent job (I do this because then the VM backups of the server don’t go crazy backing up things that don’t need to be backed up)
  6. Schedule job to as close to when the backup completes so you don’t have a large gap in time before you can find out if your backup is corrupt.

The restore, integrity check, and database drop in the list above happen in the same agent job.

I only restore the full backups to check for corruption because that’s my biggest concern with high transaction, large databases, but the DatabaseRestore script will nicely let you restore full, diff, and logs if you want that as well. I’ve only seen corruption happen in full backups. At a previous job, we had multiple databases that were 6TB in size and we restored the full backups weekly, and at least every 2-3 months, one of them would be corrupt. This is why I think it’s so important to actually restore your backups. You can verify the backup as part of the backup job, but I find this to be a waste of time because it doesn’t guarantee you can actually restore it, and it really extends the time it takes the backup job to run.

Here’s a sample job I use in Github. DO NOT USE IN PRODUCTION unless you want to wipe out your database and restore it from backup.

One thing that’s really important is that your integrity checks actually dump to a file because you aren’t going to see the details of integrity check errors in the job history. I’ll be writing another blog post on how to do this soon. For now, here’s to hoping you don’t have any corrupt backups or corrupt data! Happy restoring!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.