Resilience and Business Continuity Planning

Deciding what level of resilience to implement depends on a number of factors, including budget, Service Level Agreements, the type of media being handled, acceptable downtime limits and support agreements. It is therefore important to assess your processes (for example with a Business Continuity Plan; see the Infinity Business Continuity Plan Template) so you can determine the level of impact on your business and the associated risks.

Server virtualisation is becoming increasingly common and lets you take advantage of the instant replication and disaster recovery models these solutions offer.

Hardware

The initial steps in improving hardware resilience include duplicating key server components to reduce single points of failure, for example dual power supplies, dual network interfaces, RAID hard drive configurations and other redundant components available on standard server hardware.

The next step would be to duplicate the entire server with warm/cold failover, depending on the environment and configuration. This includes looking at the other servers/services and the LAN/WAN infrastructure supporting the solution.

Server virtualisation offers a number of benefits when adding server redundancy; however, you also need to ensure you have adequate hardware redundancy within your VM environment.

Software

The system can be configured in a number of ways to suit your requirements, allowing multiple servers to be configured and providing failover functionality for key services should issues occur. Some of this will coincide with your hardware resilience plans.

For example:

  • User configurations allow automatic failover to another web server should the primary fail
  • Key services (IIS, licensing, gateway, email) can easily be rerouted to a shared or standalone server environment
  • SQL activity can be spread between multiple servers (active/active or active/backup) for system databases and for individual project and reporting databases
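
As an illustration of the failover idea in the first point above, the following is a minimal sketch of choosing the first available server from an ordered list. It is not the product's own mechanism: the function name pick_server, the server names and the is_healthy check are all hypothetical, and a real health check would depend on your environment.

```python
from typing import Callable, Iterable, Optional


def pick_server(servers: Iterable[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server that passes the health check.

    `servers` is ordered by preference (primary first). `is_healthy` is a
    hypothetical callable supplied by the caller, for example a wrapper
    around an HTTP or TCP probe. Returns None if no server is available.
    """
    for server in servers:
        if is_healthy(server):
            return server
    return None
```

For example, with servers listed as primary then secondary, the secondary is returned only when the check on the primary fails; if every check fails, the caller receives None and can raise an alert.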

Database

The main system, project and reporting database locations are fully configurable, allowing you to run on a single SQL server or on multiple SQL servers. This also means that in the event of a failure you can easily reconfigure to another server, provided you have the relevant backup and restore or replication processes in place.

To minimise data loss should an issue occur, it is advised that databases are run in full recovery mode, with regular transaction log backups throughout the day (for example half-hourly) and a full backup each evening.
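
The schedule described above can be sketched as a simple timetable. This is an illustration only: the function names are hypothetical, the 22:00 time for the evening full backup is an assumed value, and the worst-case figure assumes the most recent transaction log backup is intact and that changes made since then cannot be recovered.

```python
from datetime import date, datetime, timedelta


def daily_backup_schedule(day, log_interval_minutes=30, full_backup_hour=22):
    """Return a sorted list of (time, kind) tuples for one day.

    kind is "log" for a transaction log backup or "full" for the evening
    full backup. The default interval and hour are illustrative assumptions.
    """
    midnight = datetime(day.year, day.month, day.day)
    step = timedelta(minutes=log_interval_minutes)
    schedule = []
    t = midnight
    while t < midnight + timedelta(days=1):
        schedule.append((t, "log"))
        t += step
    schedule.append((midnight + timedelta(hours=full_backup_hour), "full"))
    # Sort by time only; the sort is stable, so the full backup is listed
    # after a log backup scheduled for the same time.
    schedule.sort(key=lambda entry: entry[0])
    return schedule


def worst_case_data_loss(log_interval_minutes):
    """Approximate maximum window of lost changes: one log backup interval."""
    return timedelta(minutes=log_interval_minutes)
```

With half-hourly log backups this gives 48 log backups and one full backup per day, and a worst-case loss window of roughly 30 minutes. Shortening the interval reduces that window at the cost of more backup files to manage and restore.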

Storage and Backups

When calculating backup and storage requirements you need to consider:

  • The location of database files, transaction logs and backups, which should be on separate physical disks/arrays to limit loss in the event of hardware failure
  • The frequency of backups, to limit data loss should an issue occur; for example, transaction log backups are typically taken every 30 minutes
  • The growth speed of the databases and logs
  • The restoration time required with the chosen backup model
  • The data access requirements weighed against longer-term archiving (who needs to access the data and when)
  • The length of time you need/want to store the data depending on client retention and regulatory compliance rules
  • Off-site storage to protect against fire or inaccessibility
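
To show how growth speed and retention period combine, here is a rough sizing sketch. It rests on simplifying assumptions that will not hold exactly in practice: one full backup per day roughly equal in size to the database, linear daily growth, a fixed daily volume of transaction log backups, and no compression. The function and parameter names are hypothetical.

```python
def estimate_backup_storage_gb(db_size_gb, daily_growth_gb,
                               daily_log_gb, retention_days):
    """Estimate total backup storage (GB) held at the end of the retention period.

    Assumes one uncompressed full backup per day, about the size of the
    database on that day, plus that day's transaction log backups.
    """
    total = 0.0
    for day in range(retention_days):
        full_backup_gb = db_size_gb + daily_growth_gb * day
        total += full_backup_gb + daily_log_gb
    return total
```

For instance, a 100 GB database growing by 1 GB a day, with 2 GB of log backups a day and 7 days of retention, comes to 735 GB under these assumptions. Treat the result as a starting point and adjust it for compression, differential backups and any off-site copies.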