Why Backup?

To protect data from:

  • Human Error: Accidentally applying DROP or DELETE on a table
  • Hardware or Software malfunction
    • Application bugs
    • Crashes
    • Component failure
  • Malicious attacks
    • SQL Injections
  • Natural disasters
  • Government/Legal changes

Failure

Failures can be divided into the following categories:

  • Statement failure: Syntactically incorrect, e.g. mistyped SQL statement
  • User Process failure: The process doing the work fails (errors, dies)
  • Network failure: Network failure between the user and the database
  • User error: User accidentally drops the rows, table, database
  • Memory failure: Memory fails, becomes corrupt
  • Media Failure: Disk failure, corruption, deletion

Backup

Simply put, a backup is simply a copy of data that can be restored. In most cases, this can be done by simply copying (and overwriting) a backup over the existing data.

Recovery is the process of applying a backup or using log files to try to restore lost or old data.

There are many classifications of backups:

  • Physical vs Logical
  • Online vs Offline
  • Full vs Incremental
  • Onsite vs Offsite

Backup Types

Physical
  • Raw copy of files and pages in database
  • Suitable for large databases, when fast recovery is needed
  • Ideally, should also be an offline (cold) backup.
  • Includes transaction log
  • Only portable to machines of the same configuration (OS, application, etc.)
  • Restore process:
    • Shut down DBMS
    • Copy (& overwrite backup)
    • Restart DBMS
Logical
  • Done via SQL queries
  • Slower than a physical backup
    • Hardware independent, since using SQL not OS copy
  • Output takes up more storage than physical
  • Does not include config or logs
  • Server is available during backup i.e. can be an online backup
  • In MySQL, done via either:
    • MySQLDump
    • SELECT [query] INTO OUTFILE
  • Restore process:
    • mysqlimport
    • or LOAD DATA INFILE
Online
  • Also called hot backups
  • Occurs when database is live (can still be used)
  • Need to ensure proper locking is in place to ensure integrity of data
Offline
  • Also called cold backups
  • Results in downtimes (when no users can use data)
  • Easier to perform, but should not be done in applications that should have very low downtimes (hospitals, for example)
  • Sometimes, backups are taken from a replication server, which is a copy of the original database (and synced)
Full vs Incremental
  • A full backup contains all the data in the database
    • Recovery is simple, just copy the backup
  • Incremental backups only store any changes made since last backup
    • In most cases, these are just the transaction logs
    • Recovery requires applying all incremental backups from failure.
    • Recovery process:
      • Freeze database
      • Copy backup log files
      • Ask database to re-apply log files

A backup policy is the schedule of full and incremental backups to be applied. A good backup policy conducts backups when the database load is low (not many users using it).

An example of a backup policy is doing full backups on the weekends, and incremental backups on the weekdays.

Onsite vs Offsite

I