Since IT became an integral part of our personal and professional lives, people have grown accustomed to regularly backing up their irreplaceable data. Despite advances in technology, including the rise of cloud backup services, accidents and unexpected situations will always occur, so there will always be a need to safeguard our digital data. In the words of Werner Vogels, the CTO of Amazon: “Everything fails all the time.”
Organizations and cautious individuals have established backup processes to gain peace of mind and to be able to face adversity. But are those backup systems reliable? As we’ll see below, roughly one-third of backup jobs and data restores fail.
In this article, we will explore how to assess if a backup system is trustworthy, take a look into the most common backup pitfalls, and share insights into how organizations can ensure they are protected.
This is part of our series of articles about Cloud Backup.
Similar to an insurance policy that doesn’t actually cover you when an accident happens, research shows that backup jobs often don’t work when data restoration is needed.
In fact, the situation is so bad that a recent survey found that 37% of all backup jobs and 34% of data restores fail. In practice, this means that roughly one-third of backups worldwide cannot be counted on when they are needed. This can easily become any organization’s worst nightmare. Incidents are bound to happen in IT operations and, when they do, you are going to have to recover data.
Why is this happening? While the root cause can vary, one of the biggest challenges of today’s backup requirements is the staggering size of data. Data is scaling up into the petabytes at many enterprise organizations, and the old backup methods can’t work at that scale. Therefore, organizations must urgently reevaluate their backup strategy, processes, and technology capabilities.
While the rapidly growing volume of data plays a key role in the unreliability of backup jobs, many pitfalls are commonly associated with failed backups regardless of their size, and every storage admin should be aware of them. Let’s take a look at each of them below.
Every admin who has planned and implemented a backup process has had to answer the question: When should the backup job execute? Finding the optimal backup window (the ideal time slot to back up your system and its data) is a challenge. Choosing the wrong backup window (e.g., when the system is being heavily utilized by customers) is a common cause of failed backup jobs. Moreover, executing a backup job at the wrong time often strains your system by degrading its performance and depriving it of the bandwidth needed to serve customer requests.
Finding the time window that will have the least impact on system performance, such as non-peak hours, and shortening the time it takes to complete the backup are key to success. It’s also worth noting that a similar challenge can occur during the restore process. Even when backups are properly executed, if retrieving and restoring the system takes too long or is too complicated, the usefulness of the backup solution is highly questionable.
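One way to approach the backup window question is to look at historical load figures and pick the quietest contiguous slot. The sketch below is a minimal illustration under stated assumptions: the function name `find_backup_window` and the sample load values are hypothetical, the load is assumed to be sampled once per hour over a repeating 24-hour cycle, and the backup duration is assumed to be known in advance.

```python
def find_backup_window(hourly_load, duration_hours):
    """Return (start_hour, total_load) of the quietest contiguous window.

    hourly_load: one load sample per hour (hypothetical metric, e.g. requests/s).
    duration_hours: how long the backup job is expected to run.
    """
    hours = len(hourly_load)
    if not 0 < duration_hours <= hours:
        raise ValueError("duration must be between 1 and the number of samples")

    best_start, best_load = 0, float("inf")
    for start in range(hours):
        # Wrap around midnight so a window such as 23:00 to 02:00 is considered.
        load = sum(hourly_load[(start + i) % hours] for i in range(duration_hours))
        if load < best_load:
            best_start, best_load = start, load
    return best_start, best_load


# Illustrative data only: average load for each hour of the day, 00:00 first.
sample_load = [5, 3, 2, 2, 4, 10, 30, 60, 80, 90, 95, 90,
               85, 88, 92, 90, 85, 80, 70, 60, 40, 25, 15, 8]
start_hour, total = find_backup_window(sample_load, duration_hours=3)
print(f"Quietest 3-hour window starts at {start_hour:02d}:00 (load sum {total})")
```

The same calculation applies to restores: if the expected restore duration does not fit any acceptable window, that is a signal to revisit the recovery plan rather than the schedule alone.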
A traditional technology that comes to mind when thinking about enterprise backup solutions is tape cartridges and other magnetic storage devices. Seasoned IT specialists can fondly remember how easily information stored on magnetic devices could be corrupted and how delicate that hardware was to handle.
Thankfully, technology has evolved, but even some modern solutions have their pitfalls. One simple, yet very problematic, issue that applies to any technology is storage capacity. Ensuring that the backup solution has adequate storage capacity to handle the volume of data and the frequency of backup jobs is incredibly important but often overlooked.
While cloud-based backup solutions, with nearly infinite storage capacity, are able to cope with demanding growth needs, it’s important to consider their costs. Defining a proper backup retention policy and making sure that newer backups can be performed is always crucial, regardless of the technology chosen.
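To make the idea of a retention policy concrete, here is a minimal sketch of one possible policy: keep every backup from the last few days, keep only the newest backup of each week for a few more weeks, and prune the rest. The function name `backups_to_prune` and the default values are assumptions chosen for illustration, not a recommendation or any product's behavior.

```python
from datetime import date, timedelta


def backups_to_prune(backup_dates, today, keep_daily=7, keep_weekly=4):
    """Return the backup dates that fall outside the retention policy.

    Policy (hypothetical): keep every backup from the last `keep_daily` days,
    plus the newest backup of each ISO week within the last `keep_weekly` weeks.
    """
    daily_cutoff = today - timedelta(days=keep_daily)
    weekly_cutoff = today - timedelta(weeks=keep_weekly)

    keep = set()
    newest_per_week = {}
    for d in sorted(backup_dates):
        if d > daily_cutoff:
            keep.add(d)
        elif d > weekly_cutoff:
            # Dates are visited in ascending order, so the last one seen
            # for a given (year, week) pair is the newest of that week.
            newest_per_week[tuple(d.isocalendar()[:2])] = d
    keep.update(newest_per_week.values())

    return sorted(set(backup_dates) - keep)


# Illustrative data only: one backup per day throughout January 2024.
january = [date(2024, 1, day) for day in range(1, 32)]
prunable = backups_to_prune(january, today=date(2024, 1, 31))
print(f"{len(prunable)} of {len(january)} backups can be pruned")
```

Pruning against an explicit policy like this keeps storage use and cost predictable, and it frees the capacity that newer backups need.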
Then there is the issue of unpredictable costs for a system backup and restore. This is a common challenge that drives many organizations to over-optimize their backup solution to make it affordable and lower running costs. This mindset is a trap that often gives the illusion of protection, while in reality you might end up spending several times the operational cost of the system being protected. Or, because of poorly optimized configurations, this approach may not let you recover data when it is actually needed.
When designing a backup strategy, people often only consider risks and incidents that might occur in the digital realm. Our minds are immediately drawn to scenarios where the company’s hardware servers (or virtual instances) might fail, or malicious attackers compromise the systems we are trying to protect. While these are valid risk scenarios, we often overlook physical-world threats and their implications for our organization’s operations.
A common pitfall behind failed backups is the location where we choose to store data. Not having a backup copy in an off-site location (e.g., another cloud region or corporate office) is especially problematic when a natural disaster, or even just a prolonged power failure, affects the data center or geographical area where the system and its backups are operating. It is worth noting that these incidents are not as unusual and far-fetched as one might think. Even the top public cloud providers experience these types of outages. The best backup solution and process won’t be able to help you if it’s inaccessible along with the system it’s protecting.
One important factor to consider, with both on-site and off-site backups, is the security measures applied to that data and process. When malicious attackers target the data you are protecting, one possible attack vector is to access and compromise your backups. A common mistake is to invest significant effort in tightly protecting a system while overlooking its backups. If the system is encrypted but the backup data is not, an attacker can, and likely will, attempt to compromise, exfiltrate, and perhaps leak the information they seek via the organization’s backup solution.
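The location and security pitfalls described above lend themselves to a simple automated check. The following sketch assumes a hypothetical inventory of backup copies, each labeled with a location and an encryption flag; the `BackupCopy` type, the `audit_backup_copies` function, and the region names are all invented for illustration and do not correspond to any product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    location: str    # hypothetical label, e.g. a cloud region or office site
    encrypted: bool


def audit_backup_copies(system_location, copies):
    """Return a list of findings; an empty list means no issue was found."""
    findings = []
    # Off-site check: at least one copy must live away from the system itself.
    if not any(copy.location != system_location for copy in copies):
        findings.append("no off-site copy: every backup shares the system's location")
    # Security check: backups deserve the same protection as the live system.
    for copy in copies:
        if not copy.encrypted:
            findings.append(f"unencrypted backup: {copy.name}")
    return findings


# Illustrative inventory only.
inventory = [
    BackupCopy("nightly-a", "region-1", encrypted=True),
    BackupCopy("nightly-b", "region-1", encrypted=False),
]
for finding in audit_backup_copies("region-1", inventory):
    print(finding)
```

A check like this only covers configuration. It does not prove that the copies are restorable, which is a separate question.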
There are many aspects to consider when planning a backup and, as we explored above, different pitfalls to avoid. The question remains: How can we trust that the backups we are performing actually protect our organization as intended?
Ensuring your organization is secure depends on having the right backup technology solution and making sure your IT team is ready to leverage it when needed.
Traditionally, a lot of emphasis has been put on executing backups but not so much on using them. With the emergence of chaos engineering and site reliability engineering (SRE), part of the focus on ensuring a system is reliable has shifted to improving the mindset and practices of the IT engineering team. One of those practices, which rapidly became popular, is the game day.
On a game day, the engineering team is faced with a simulated failure scenario in order to learn how to react and what actions to take to restore a system to its healthy state. This usually takes place in an isolated environment where the team can attempt to restore their backup data and take different steps to recover a broken system.
In a more realistic and advanced game day scenario, usually conducted by more experienced engineering teams, you might inject faults in a live production system to test its resilience and ability to automatically recover.
Regardless of your team’s level of maturity, organizing a game day and testing your disaster recovery readiness is highly recommended. Not only will this enable people to gain trust in the integrity of their backup data, but it will also allow them to practice how to react when an unexpected event happens.
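One concrete exercise for a game day is to restore a backup into an isolated scratch location and confirm that every file matches what was originally backed up. The sketch below shows one way to do that with SHA-256 checksums. It assumes a manifest of checksums was recorded at backup time and that the restore has already been performed by whatever tool the team uses; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_restore(manifest, restored_root):
    """Return the relative paths that are missing or differ after a restore."""
    restored = build_manifest(restored_root)
    return sorted(
        name for name, checksum in manifest.items()
        if restored.get(name) != checksum
    )
```

In a drill, the team would call `build_manifest` on the source data when the backup is taken, restore into a scratch directory, and then call `verify_restore`. An empty result means every file recorded in the manifest came back intact; any listed path is a concrete item to investigate before a real incident happens.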
Without any doubt, the technology you adopt plays an enormous role in ensuring your organization has a reliable and trustworthy backup and disaster recovery strategy.
With the rapid changes in technology today, not only has the way business systems operate changed drastically, but the backup strategies and solutions that protect them have also been forced to evolve and adapt. When choosing a cloud backup solution, it is important to understand today’s best practices for implementing a backup strategy.
The explosive growth of data has played a decisive role in making traditional backup solutions and methods nearly obsolete. Today, tape storage, due to its cost and inability to cope with data growth, is being replaced with modern cloud technologies that are able to scale, provide cost efficiencies, and, overall, make data restoration simpler and faster. Moreover, cloud-based solutions make it easier to lower the total cost of ownership by offering different backup archive tiers that can be adjusted to the protection level and data retrieval times required by your business needs.
The shift to SaaS backup solutions, combined with inherent public cloud capabilities, has also opened the door to new backup strategies (e.g., 3-1-2 or 3-2-2) that can be defined and adapted without major IT engineering efforts.
Furthermore, these modern backup solutions are now able to answer an organization's complex cyber resilience needs. Protecting data today is increasingly challenging because, in addition to any unexpected incidents that might occur as part of normal operations, organizations need to take into account malicious attackers.
Cyber threats such as ransomware, unauthorized access attempts, and data leaks are a burden that a modern backup solution can help ease with immutability features such as Write-Once-Read-Many (WORM) capabilities.
There are many aspects to consider when planning backups and disaster recovery. Avoiding common mistakes is crucial but, as we also saw, a great backup strategy is a combination of fostering a good company culture and having the right processes and technology solutions in place.
When it comes to a cloud backup solution, NetApp Cloud Backup is a great product to consider. Adopted by numerous enterprise businesses, the NetApp solution makes it easy for organizations to avoid the typical pitfalls of backup strategies. Moreover, with a wide range of advanced features, such as WORM file locking and the dark site deployment option, it makes it easier to protect your data against accidents and cybersecurity threats.