With the growing role of data in our lives, it has become increasingly important to plan for the possibility that it all might go wrong one day. And if you think about it, data loss is not just something that might happen; it probably will.
March 31st is World Backup Day, raising awareness of the importance of backing up your data. But for the other 364 days of the year, we would like to provide some useful information and offer our tips for keeping your data safe when the inevitable event occurs.
Approaches
Let’s start by making sure we understand what ‘backup’ actually means, as there are various terms and approaches in use these days:
Backup
A backup is technically an archive copy of the data that is in use (in the sense that it is archived until you need to use it).
Replication
Copying data and moving it to another location. With services like Microsoft Office 365 and Google G Suite for Education, replication is taking place more and more.
Synchronisation
Similar to replication, synchronisation is the process of copying files and folders in use to another location.
Archiving
Not a copy of data, archiving is the process of storing data that is no longer in use, but is (or may be) still needed, in another location.
Whilst replication and synchronisation have certain advantages, it is important to see them as a complement to backup, rather than a replacement.
Backups are taken, and fixed, at a point in time. This means that the data included in the backup is held in that state until needed. When further backups are completed, more versions of the ‘fixed state data’ are created, providing a range of choices if restoration is required.
Replication and synchronisation are generally iterative processes, where changes to live data are subsequently incorporated into the copy data. The additional ‘version’ of the data is updated when changes take place (rather than multiple versions being created).
Whilst it is possible to establish multiple replication or synchronisation locations, and in doing so achieve multiple versions of the data, it can require a great deal of storage (and therefore expense).
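The difference between a backup that is fixed at a point in time and a copy that is updated iteratively can be sketched in a few lines of Python. This is only an illustration using in-memory dictionaries; the names `take_backup`, `synchronise`, `backups` and `mirror` are made up for the example and do not refer to any particular product.

```python
import copy

# Hypothetical 'live data': file name -> contents
live = {"report.txt": "draft 1"}

backups = []  # each backup is a separate copy, fixed at the time it was taken
mirror = {}   # a synchronised copy is a single version, updated in place


def take_backup(data):
    """Store a new point-in-time version alongside the earlier ones."""
    backups.append(copy.deepcopy(data))


def synchronise(data):
    """Overwrite the single mirrored copy with the current live data."""
    mirror.clear()
    mirror.update(data)


take_backup(live)
synchronise(live)

# The live file is later damaged (e.g. by malware or a user error).
live["report.txt"] = "corrupted"
take_backup(live)
synchronise(live)

print("Earliest backup:", backups[0]["report.txt"])
print("Synchronised copy:", mirror["report.txt"])
```

The earlier backup still holds the undamaged file, whereas the synchronised copy has faithfully incorporated the damage.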
What Backup is for
Why is it important to have multiple versions of the copied data? Because backups essentially have two functions:
- To enable the recovery of data after it is lost (either through corruption or deletion); and
- To enable the recovery of data from an earlier time.
Whilst other approaches (e.g. replication) can provide a convenient means of recovering a user’s lost file, certain incidents affecting data (e.g. malware infection) could also affect the replicated data.
File and Image Backups
There are two main approaches to backup: file-level, and image-level.
A file backup makes an archive copy of each of the files and folders that are specified. The software used to perform the backup can usually also allow the ‘applications’ and ‘system state’ to be backed up, which makes restoration in the event of a disaster possible.
An image backup is essentially a clone of the entire hard drive, and therefore contains all the data (including the applications and the system state), but captures it as a single image of the drive rather than file by file.
Image-level backups require more storage space, but are faster and more flexible for recovery. Both should allow the restoration of individual files and folders, where ‘minor’ incidents have occurred, though image-level backups will be slightly more complicated. Both should also allow full system recovery, where more serious incidents have occurred, but image-level backups can perform ‘bare metal’ restoration much more quickly, and more easily support restoration to different hardware.
Backup Types
There are a number of different backup types, including:
Backup Type | Advantages | Disadvantages |
---|---|---|
Full Backup: A full data and/or system backup, as the name suggests. With a full backup, all data is copied to another location. | A complete copy of all data is made in a single location, simplifying restoration. | Takes longer to complete than other types of backup. |
Incremental Backup: Only the information that has changed since the last backup is included. | Can be carried out more quickly and frequently than a full backup, as less information is being processed. | Recovery can be complex, as the information that needs to be restored may be spread across multiple backups. |
Differential Backup: Similar to incremental, but includes the information that has changed since the last full backup (whereas incremental covers data changed since the last backup of any type). | Quicker than full backup, and simpler to restore than incremental backup. | Takes longer to complete and requires more storage space than incremental backup. |
Full backups are required from time to time, and incremental or differential backups can be used effectively between full backups.
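As a rough illustration of how the three backup types in the table differ, the sketch below chooses which files each type would include. It assumes each file has a single ‘last modified’ value (here a made-up number of hours since the start of the week); `FileInfo` and `select_files` are hypothetical names, and real backup software tracks changes in its own way.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    modified: int  # hypothetical timestamp: hours since the start of the week


def select_files(files, kind, last_full, last_backup):
    """Return the names of the files a backup of the given kind would include."""
    if kind == "full":
        return [f.name for f in files]
    if kind == "differential":
        cutoff = last_full      # everything changed since the last full backup
    elif kind == "incremental":
        cutoff = last_backup    # everything changed since the last backup of any type
    else:
        raise ValueError(f"unknown backup type: {kind}")
    return [f.name for f in files if f.modified > cutoff]


files = [FileInfo("a.doc", 5), FileInfo("b.xls", 30), FileInfo("c.ppt", 50)]

# Assume the last full backup ran at hour 10 and the most recent backup at hour 40.
for kind in ("full", "differential", "incremental"):
    print(kind, select_files(files, kind, last_full=10, last_backup=40))
```

The incremental backup picks up the fewest files, which is why it is the quickest to run, while the differential backup grows until the next full backup resets it.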
A common schedule would be:
Monday | Incremental or differential backup |
Tuesday | Incremental or differential backup |
Wednesday | Incremental or differential backup |
Thursday | Incremental or differential backup |
Friday | Full backup |
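The choice between incremental and differential backups on the weekdays affects how many backups are needed for a restore. The sketch below works this out for a history of backups listed oldest first; `restore_chain` and the day labels are illustrative assumptions, not part of any backup product.

```python
def restore_chain(history):
    """Return the backups needed to restore the most recent state.

    history is a list of (label, kind) tuples, oldest first, where kind is
    'full', 'incremental' or 'differential'.
    """
    full_positions = [i for i, (_, kind) in enumerate(history) if kind == "full"]
    if not full_positions:
        raise ValueError("no full backup to restore from")

    start = full_positions[-1]
    full_label = history[start][0]
    chain = [full_label]
    for label, kind in history[start + 1:]:
        if kind == "differential":
            # A differential holds every change since the full backup,
            # so it replaces anything taken between the two.
            chain = [full_label, label]
        else:
            # An incremental only holds changes since the previous backup,
            # so every one of them is needed.
            chain.append(label)
    return chain


incremental_week = [("Fri", "full"), ("Mon", "incremental"),
                    ("Tue", "incremental"), ("Wed", "incremental")]
differential_week = [("Fri", "full"), ("Mon", "differential"),
                     ("Tue", "differential"), ("Wed", "differential")]

print(restore_chain(incremental_week))
print(restore_chain(differential_week))
```

With incremental backups, a restore on Wednesday night needs Friday’s full backup plus every backup since; with differential backups it needs only Friday’s full backup and Wednesday’s differential.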
Backup Media
There are many different types of backup media, including:
- Tapes (which require a tape drive);
- Hard drives (which could be internal or external);
- Solid state storage (e.g. USB sticks); and
- Cloud storage.
Each has advantages and disadvantages, which should be considered in the context of the tips below.
Backup Process
Whenever data is created or changed, a backup requirement exists. In order to ensure an appropriate backup process is in place, certain objectives need to be considered:
- Recovery Point Objective (RPO): this is the maximum amount of data, measured in time, that the organisation can afford to lose. It determines the length of time between backups, and dictates how recent or current the restored data will be following a loss. For example, a daily backup, started at 18.00 each evening, could result in the loss of almost a day’s data if an incident occurred at 17.30.
- Recovery Time Objective (RTO): this is the target for how long it should take to recover the data after an incident, and should be considered in the context of the impact on the organisation. For example, if an incident occurs that requires a full restoration, and that restoration using a given backup technology would take two days, does the organisation consider that acceptable?
- Backup security: as the purpose of backup is to protect data from loss, consideration should be given to the security of the data within the backup (i.e. on the backup media). The two main techniques are encryption (i.e. encrypting the data on the backup media) and physical security (i.e. the way in which the backup media is handled and stored physically).
- Retention period: certain regulations will require particular data to be held for a prescribed amount of time, and other regulations will require that the same data (and other data) be held for no longer than a prescribed time. The organisation will also need confidence that restoration of data can cover a suitable time period, and that the cost of storing backups is well managed.
A good backup process should address these objectives.
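Two of these objectives lend themselves to a simple check. The sketch below estimates the worst-case data loss from a list of backup start times, compares it with an RPO, and lists the backups that have passed a retention period. The dates, the 24-hour RPO and the three-day retention period are made-up values, and the worst case is approximated by the gap between backup start times (it ignores how long each backup takes to run).

```python
from datetime import datetime, timedelta


def worst_case_loss(backup_times):
    """Largest gap between consecutive backups, i.e. the most data that could be lost."""
    ordered = sorted(backup_times)
    if len(ordered) < 2:
        raise ValueError("need at least two backups to measure a gap")
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def meets_rpo(backup_times, rpo):
    """True if no gap between backups is longer than the RPO."""
    return worst_case_loss(backup_times) <= rpo


def expired(backup_times, now, retention):
    """Backups older than the retention period, which are candidates for deletion."""
    return [t for t in backup_times if now - t > retention]


# Hypothetical schedule: a backup started at 18.00 on five consecutive days.
backup_times = [datetime(2024, 3, 25, 18, 0) + timedelta(days=d) for d in range(5)]
now = datetime(2024, 3, 29, 19, 0)

print("Worst-case loss:", worst_case_loss(backup_times))
print("Meets a 24-hour RPO:", meets_rpo(backup_times, timedelta(hours=24)))
print("Meets a 4-hour RPO:", meets_rpo(backup_times, timedelta(hours=4)))
print("Past a 3-day retention period:", len(expired(backup_times, now, timedelta(days=3))))
```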
Tip 1 - Prepare for the Worst
Having a regular and automated backup process in place is the key to being prepared for a data disaster. These days, backing up important data less often than daily is likely to be an unnecessary risk. And scheduling backups to take place automatically, and start at a prescribed time, is straightforward.
Create a step-by-step plan that sets out how the organisation would recover from a critical incident (e.g. a fatal hardware failure in a key server). Or if you already have one, review it. Who would need to be involved? Are all the necessary tools, services and information in place? How long do you think it will take?
Tip 2 - Implement a 3-2-1 Backup
The accepted best practice rule for backup is called the ‘3-2-1 backup’ rule. This means that, when backing data up, you should:
3. Have at least three copies of the data (so that in addition to the live data, you have at least two other copies). This is to mitigate a failure on both the main device (storing the live data) and the backup device (storing the first copy of the live data). A common approach to this is ‘disk-to-disk-to-tape’:
Disk | to Disk | to Tape |
---|---|---|
The hard disks on the server(s) storing the live data | The hard disk(s) on a backup server, where the live data is backed up to initially | Tape, through a tape drive attached to the backup server, where the initial backup is then copied |
In this example, the ‘tape’ could be replaced by a different backup medium.
If the risk of failure of each device is, say, 1%, and the devices fail independently of one another, then the likelihood of every copy failing at the same time is:
- Where there are two copies of the data, 1 in 10,000
- Where there are three copies of the data, 1 in 1,000,000
2. Store backups on at least two different media. This, again, is to reduce the risk that a failure of a device impacts the recoverability of data. A failure of disks on one server could indicate an increased likelihood of disk failure in another server (e.g. they may be of similar specification and age, and subject to similar environmental influences and use patterns). By also using another medium (e.g. tape, or cloud storage), the risk is reduced.
1. Store one of those backups offsite. Storing one of the backups in a physically separate location is important, as it’s the only reliable way of protecting against issues like fire, flood and other disasters. This can be achieved by physically taking tapes to another (secure) location after the backup, or using a cloud storage service.
Adopting the 3-2-1 backup approach should ensure the survival of at least one copy of your data in the event of a serious incident.
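The ‘1 in 10,000’ and ‘1 in 1,000,000’ figures given under the first part of the rule follow from multiplying the individual risks together. The short sketch below reproduces them, assuming (as a simplification) that each device has the same 1% risk and that failures are independent; `chance_all_copies_fail` is just an illustrative name.

```python
from fractions import Fraction


def chance_all_copies_fail(per_device_risk, copies):
    """Probability that every copy is lost, assuming independent failures."""
    return per_device_risk ** copies


risk = Fraction(1, 100)  # the 1% example figure, kept as an exact fraction

for copies in (1, 2, 3):
    odds = 1 / chance_all_copies_fail(risk, copies)
    print(f"copies of the data: {copies}, chance of losing them all: 1 in {int(odds):,}")
```

In practice failures are not fully independent (for example, two disks of the same age in the same room), which is why the rule also asks for different media and an offsite copy.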
Tip 3 - Tailor your Backup to your Data
To achieve a good RPO without wasting computing power and storage on constant backups, bear in mind that not all data needs to be treated equally. Some data changes regularly and is in frequent use, whereas other data (sometimes called ‘cold data’) is updated less often.
You would need a lot of computing power and storage to be able to store a full backup of everything every day, and whilst it may be possible (depending on how much data you have), if your data grows it may become less practical, and you’d need to rethink.
Prioritise your data to make daily backups of the changing data, and less frequent (e.g. weekly or monthly) backups of your cold data. Use of incremental or differential backups can help, whilst still enabling multiple recovery points.
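One simple way to prioritise is by how recently each file was changed. The sketch below sorts files into a daily or a weekly backup tier; the seven-day threshold, the file names and the function names are all assumptions made for the example, and a real policy would reflect how your own data is used.

```python
from datetime import datetime, timedelta


def backup_frequency(last_modified, now, hot_window=timedelta(days=7)):
    """Recently changed data is backed up daily; cold data weekly."""
    return "daily" if now - last_modified <= hot_window else "weekly"


def backup_plan(files, now):
    """Map each file name to the backup tier it falls into."""
    return {name: backup_frequency(modified, now) for name, modified in files.items()}


# Hypothetical files and their last-modified times
files = {
    "timetable.xlsx": datetime(2024, 3, 28, 9, 30),
    "2019-photos.zip": datetime(2023, 1, 10, 14, 0),
}
now = datetime(2024, 3, 29, 18, 0)

for name, tier in backup_plan(files, now).items():
    print(f"{name}: {tier} backup")
```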
You also need to check that you’re backing up certain things effectively. If you’re using Microsoft Office 365 or Google G Suite for Education, you probably won’t have a Microsoft Exchange Server on site, but if you do it needs special backup treatment. So too does Microsoft SQL Server or MySQL, which could be quietly running inside your Management Information System (MIS).
Tip 4 - Practice Recovery Scenarios
You may already have a good backup regime, but when was the last time you tested it to check it’s working as it should?
You can do some simple tests, such as creating a test file inside a folder that will be backed up, running the backup, then deleting the file from the ‘live data’ and attempting restore from backup.
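The simple test described above can be scripted so that it is easy to repeat. In the sketch below, plain file copies stand in for the backup and restore steps, which you would replace with whatever your own backup software provides; the function names and the test file name are illustrative. A checksum confirms that the restored file matches the original.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path):
    """Checksum used to confirm the restored file is identical to the original."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def restore_test(live_dir, backup_dir):
    """Create a test file, back it up, delete it, restore it, and compare."""
    test_file = live_dir / "restore-test.txt"
    test_file.write_text("backup restore test")
    original_hash = sha256(test_file)

    # Stand-in for running the real backup job.
    backup_copy = backup_dir / test_file.name
    shutil.copy2(test_file, backup_copy)

    # Delete the file from the 'live data'.
    test_file.unlink()

    # Stand-in for restoring from the real backup.
    shutil.copy2(backup_copy, test_file)

    return test_file.exists() and sha256(test_file) == original_hash


# Run against temporary folders so that no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "live"
    backup = Path(tmp) / "backup"
    live.mkdir()
    backup.mkdir()
    result = restore_test(live, backup)

print("Restore test passed" if result else "Restore test FAILED")
```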
You should also consider some more in-depth tests, involving larger and more complex data sets. However, it’s not usually advisable to put healthy live systems at risk in order to test your backups, so consider this as part of any future ICT works or use old hardware for testing purposes.
Tip 5 - Consider a Centralised Approach
Being able to back up data that is in one location is likely to be simpler, and therefore more successful, than trying to back up data that exists in many different places.
If you’re not yet using cloud storage for live data, it may be worth thinking about doing so. Not only can it ease the backup and restore process, but it can also simplify other operations (such as providing remote access to files).
You may also be able to benefit from ‘replication’ as part of a backup strategy, and then add additional backup technologies and processes to achieve the purposes and objectives set out here.
If you’d like some more information or advice, visit our Security area.