Disasters can range from a critical database crashing to a ransomware attack to the complete stoppage of your business operations. Whatever happens, two concerns top the list: how long will it take you to recover, and how much data will you lose?
To make sure you don’t experience more downtime and/or data loss than your business can withstand, you need to define two key metrics that will guide disaster recovery planning for each of your critical workloads:
- Recovery Time Objective (RTO), the maximum acceptable time between the onset of an outage and a return to business as usual
- Recovery Point Objective (RPO), the maximum amount of data your business can lose before experiencing significant impacts, measured by volume or, more commonly, by time (for example, the last hour of transactions)
Ideally, all your critical applications would fail over automatically and have continuous real-time backups, so your RTO and RPO would both be zero. But while technically possible, these measures are too costly for many businesses to contemplate.
This is why you need realistic RTO vs RPO calculations to help design your recovery process and determine backup methods and frequencies.
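As a rough illustration of how these two numbers translate into backup frequency and recovery design, here is a minimal Python sketch. It assumes that the worst-case data loss equals the interval between backups and that recovery steps run one after another; the function names and the sample durations are hypothetical, not taken from any particular tool.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Assumption: a failure just before the next backup loses everything
    # written since the previous one, so the worst case equals the interval.
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    # A backup schedule satisfies the RPO if its worst-case loss fits within it.
    return worst_case_data_loss(backup_interval) <= rpo


def meets_rto(recovery_steps: dict[str, timedelta], rto: timedelta) -> bool:
    # Assumption: steps are sequential, so estimated recovery time is their sum.
    estimated = sum(recovery_steps.values(), timedelta())
    return estimated <= rto


# Illustrative numbers only.
nightly = timedelta(hours=24)
hourly = timedelta(hours=1)
steps = {
    "detect and declare": timedelta(minutes=15),
    "restore from backup": timedelta(hours=2),
    "verify and reopen": timedelta(minutes=30),
}

print(meets_rpo(nightly, rpo=timedelta(hours=4)))
print(meets_rpo(hourly, rpo=timedelta(hours=4)))
print(meets_rto(steps, rto=timedelta(hours=4)))
```

With a four-hour RPO, the nightly schedule fails the check and the hourly one passes, which is the kind of result that would push you toward more frequent backups or replication.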
RTO vs RPO both start with application priority
When thinking about RTO vs RPO, it’s not a question of which metric is more important because they are interrelated. How long it takes to recover impacts how much data you’ll lose, and how much data you can afford to lose drives how fast you need to recover.
A best practice is to factor application importance and priority into your RTO vs RPO calculations. For example, the most mission-critical customer-facing services, those with the highest cost per minute of downtime, may require a near-zero RTO, necessitating failover capabilities. Other applications, such as an HR database, might be OK with an RTO of 24 hours.
With RPO, the more difficult it is to recover or recreate the data, the shorter the associated RPO should be. Mission-critical databases might require continuous, real-time replication for a near-zero RPO, while an RPO of 24 hours might suffice for data that you can recreate from your last backup using “pen and paper” sources.
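One way to make this prioritization concrete is to keep a small table of tiers and assign each workload to one. The sketch below is only an example: the tier names, the middle tier, and the workload names are hypothetical, and the targets echo the near-zero and 24-hour figures mentioned above rather than recommending specific values.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Targets:
    rto: timedelta  # maximum acceptable time to recover
    rpo: timedelta  # maximum acceptable window of lost data


# Hypothetical tiers; real values come out of discussions with stakeholders.
TIER_TARGETS = {
    "mission_critical": Targets(rto=timedelta(minutes=5), rpo=timedelta(0)),
    "important": Targets(rto=timedelta(hours=4), rpo=timedelta(hours=1)),
    "deferrable": Targets(rto=timedelta(hours=24), rpo=timedelta(hours=24)),
}

# Hypothetical workload-to-tier assignments.
WORKLOAD_TIERS = {
    "customer_portal": "mission_critical",
    "hr_database": "deferrable",
}


def targets_for(workload: str) -> Targets:
    # Look up the workload's tier, then the targets agreed for that tier.
    return TIER_TARGETS[WORKLOAD_TIERS[workload]]


for name in WORKLOAD_TIERS:
    t = targets_for(name)
    print(f"{name}: RTO {t.rto}, RPO {t.rpo}")
```

Keeping targets per tier rather than per application means a new workload only needs a tier assignment, not a fresh negotiation of its numbers.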
Another best practice is to make your RTO vs RPO estimates as accurate as possible. If you’re too conservative, you’ll end up spending more than you need to on backup services, redundant systems, etc. If you’re too lax, your business could face unacceptable risks that threaten its continuity and even survival.
In the end, it’s about balancing your IT budget against the business value/criticality of your IT services and associated data. Costs for backups, storage, and other technology add up quickly. But so do the costs of downtime and data loss—from lost sales to lost customers to a damaged reputation that can haunt you for years. A further RTO vs RPO consideration in regulated industries is whether the loss of certain irreplaceable transaction data (e.g., medical records or other personal data) could constitute a compliance violation and result in sanctions.
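The budget trade-off can be sketched as a simple comparison of expected yearly cost across recovery options. This is a simplified model under stated assumptions: it counts only downtime cost (not data loss, reputational damage, or compliance penalties), treats every outage as lasting the full RTO, and uses invented option names and prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    annual_cost: float   # yearly spend on backups, standby systems, etc.
    rto_minutes: float   # recovery time this option is expected to deliver


def expected_total_cost(option: DrOption, cost_per_minute: float,
                        outages_per_year: float) -> float:
    # Assumption: each outage lasts the full RTO of the chosen option.
    downtime_cost = option.rto_minutes * cost_per_minute * outages_per_year
    return option.annual_cost + downtime_cost


def cheapest_option(options: list[DrOption], cost_per_minute: float,
                    outages_per_year: float) -> DrOption:
    return min(
        options,
        key=lambda o: expected_total_cost(o, cost_per_minute, outages_per_year),
    )


# Illustrative figures only.
OPTIONS = [
    DrOption("nightly backup, manual restore", annual_cost=5_000, rto_minutes=1_440),
    DrOption("warm standby", annual_cost=40_000, rto_minutes=60),
    DrOption("automatic failover", annual_cost=150_000, rto_minutes=1),
]

# A low-impact workload versus a high-impact one, one outage every two years.
print(cheapest_option(OPTIONS, cost_per_minute=1, outages_per_year=0.5).name)
print(cheapest_option(OPTIONS, cost_per_minute=5_000, outages_per_year=0.5).name)
```

Under these made-up numbers, the low-impact workload is cheapest to protect with nightly backups, while the high-impact one justifies automatic failover.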
What are the “right” RTO vs RPO numbers for your workloads? It’s often not easy to arrive at answers that satisfy all stakeholders. You’ll need to start by meeting with senior leaders to identify which systems and databases are central to operations and/or produce the most revenue.
RTO vs RPO values for individual systems also need to be set in the context of a risk-based, prioritized disaster recovery or business continuity plan that covers all your IT systems and databases. Balancing the factors mentioned above, from the nature of the event to the importance of the application/data to regulatory risks, takes objectivity and experience. Then there’s the question of how best to leverage technology to meet your agreed RTOs and RPOs.
For expert help with defining, selecting, and implementing the perfect disaster recovery solution for your databases, your business, and your budget, contact Buda Consulting.