File archiving and backup are similar processes that are both essential to cloud file sharing at the enterprise level. Data loss caused by corruption, security breaches, or human error can jeopardize a company’s reputation, damage its financial standing, and expose it to legal consequences. Backup and archive solutions can prevent that from happening, but costs will also be a factor.
What kinds of solutions do the major cloud providers offer for backup, restoration, and file archiving? Are they sufficient to prevent data loss for enterprise file shares, which can be so heavily used that backup data must be kept in sync at all times? And how can you do that without consuming entire IT budgets? In this post we’ll look at backup and archiving in the cloud and explore the cost-effective and efficient ways Cloud Volumes ONTAP can protect and archive file share data.
The backup process creates replicas of the data in use at the primary location to prevent data loss and corruption. These backup copies are stored in a secondary location. It’s important that the backup process keeps the backups in sync with the most up-to-date data in the primary location in order to prevent any loss of data.
Archiving is a process for storing data for long-term retention. The underlying process is similar to backup, except archive data does not need to be kept in sync with the latest changes in the primary storage location—it is stored for historical, compliance, or legal purposes.
Having a cost-effective and application-aware backup and archiving solution for a cloud-based file share can be a big challenge. Without careful consideration, the costs can go sky high. File shares in the cloud need to leverage more cost-effective storage formats to affordably handle file archiving and backup.
Snapshot frequency is another limitation: does the service provide snapshots, and if so, does it take them frequently enough to ensure no data is lost? Also, with most cloud service providers, users can’t influence transfer rates. This can both limit the efficacy of backups and drive up costs. Some providers don’t offer a solution for application-aware backup, which is another avenue for data loss or corruption.
All the hyper-scale cloud providers offer managed shared file storage services today, and all of them come with some method for backup and archiving file share data, as we’ll see below.
Amazon Elastic File System (Amazon EFS) relies on an AWS backup mechanism or a number of makeshift solutions to solve the backup challenge. AWS Data Pipeline can be used to back up data to a secondary file system, but it must be set up manually, and this method can be extremely costly. Backup and archive data can be sent to Amazon S3; however, there isn’t a built-in way to do that, and the available methods require a certain level of configuration.
This isn’t the case for all of the AWS cloud file services. Though Amazon EFS does not support snapshots, Amazon FSx does. These snapshots can be created on a daily schedule or manually by the administrator. Because the snapshots are incremental, only the changed data is synced over, so storage consumption is not as high as it is with Amazon EFS backups. However, daily snapshots may not be frequent enough to protect an enterprise file share, and the manual effort involved in creating additional backups is a challenge. As with Amazon EFS, there isn’t a built-in way to use Amazon S3 as a capacity tier.
Azure Files offers users SMB 3.0 file shares in the Microsoft Azure cloud. A big advantage of Azure Files for backup and archiving is that users can create read-only, incremental snapshots to easily protect their data. However, users are limited to a total of 200 snapshots per share, after which the oldest ones need to be deleted. There is also no capability to tier data to less-expensive storage on Azure Blob.
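A hard snapshot cap like this usually means the backup schedule has to include a rotation step. The sketch below models that logic in plain Python (it is not the Azure SDK; the snapshot store is just an in-memory queue): once the share reaches its limit, the oldest snapshot is deleted before a new one is taken.

```python
from collections import deque

SNAPSHOT_LIMIT = 200  # Azure Files allows at most 200 snapshots per share

def take_snapshot(snapshots: deque, name: str, limit: int = SNAPSHOT_LIMIT) -> None:
    """Record a new snapshot, deleting the oldest one if the share is at its cap."""
    if len(snapshots) >= limit:
        snapshots.popleft()  # rotate out the oldest snapshot to make room
    snapshots.append(name)

# Simulate 250 daily snapshots: the first 50 get rotated out.
snaps = deque()
for day in range(250):
    take_snapshot(snaps, f"snap-{day:03d}")
print(len(snaps), snaps[0])  # 200 snap-050
```

The practical consequence is that retention beyond 200 intervals requires copying older snapshots elsewhere before rotation removes them.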
Cloud Filestore is the fully managed, high-performance NFS file service provided by Google Cloud. Cloud Filestore has built-in zonal storage redundancy to protect data from failures or downtime during maintenance, but Google recommends performing backups with a certified third-party solution.
Cloud Volumes ONTAP is a sophisticated solution for file-based cloud storage management in AWS and Azure that is based on NetApp’s on-premises SAN and NAS storage operating system. Cloud Volumes ONTAP provides file share services over all versions of NFS and CIFS/SMB, as well as iSCSI, with multiprotocol access and the same storage efficiencies that reduce footprint and costs in NetApp’s well-known on-premises systems. Its replication technology operates at the block level and is very efficient.
NetApp Snapshots™ are the primary building block for data protection in Cloud Volumes ONTAP. Snapshots are instantly created, extremely space-efficient copies of data, no matter how large the data set. NetApp’s WAFL technology makes it possible to update each snapshot so that only the delta data (the blocks that have changed) is stored.
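To see why such snapshots are instant and cheap, it helps to model the idea in a few lines of Python. This is a conceptual sketch, not WAFL itself: a snapshot here is just a frozen map of block numbers to block contents, so creating one copies references rather than data, and the delta is simply the set of blocks that differ from that frozen map.

```python
class Volume:
    """Toy block store illustrating reference-based, space-efficient snapshots."""

    def __init__(self):
        self.blocks = {}     # block number -> data in the active file system
        self.snapshots = {}  # snapshot name -> frozen block map

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def snapshot(self, name):
        # Instant: copies block references, not the block contents.
        self.snapshots[name] = dict(self.blocks)

    def changed_since(self, name):
        """Return only the delta: blocks that differ from the named snapshot."""
        old = self.snapshots[name]
        return {n: d for n, d in self.blocks.items() if old.get(n) != d}

vol = Volume()
vol.write(0, "alpha")
vol.write(1, "beta")
vol.snapshot("hourly.0")
vol.write(1, "beta-v2")   # modify one block
vol.write(2, "gamma")     # add a new block
print(sorted(vol.changed_since("hourly.0")))  # [1, 2]
```

Block 0 is untouched, so it costs nothing extra; only blocks 1 and 2 count toward the delta, which is why snapshot size tracks change rate rather than data set size.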
NetApp’s SnapMirror® data replication technology can easily create secondary systems for backup, archive, or DR, and keeps the copies in sync by efficiently transferring only the data that has changed. This works for both on-prem and cloud ONTAP storage endpoints. SnapMirror leverages snapshots as the basis of a block-level transport mechanism for copying data incrementally and efficiently between storage environments, and automatically manages failover and failback in DR scenarios. Archival data from multiple source storage systems can be sent to a single destination system, with up to 1,023 snapshots stored per volume.
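The core of snapshot-based incremental replication can be sketched as follows. This is an illustrative model of the technique, not the SnapMirror implementation: after an initial baseline transfer, each update compares the last common snapshot with a new one and sends only the blocks that differ.

```python
def incremental_update(snap_old, snap_new, destination):
    """Apply to `destination` only the blocks that changed between two snapshots.

    Returns the number of blocks actually transferred.
    """
    transferred = 0
    for block_no, data in snap_new.items():
        if snap_old.get(block_no) != data:  # new or modified block
            destination[block_no] = data
            transferred += 1
    # Blocks deleted at the source are removed at the destination too.
    for block_no in set(snap_old) - set(snap_new):
        destination.pop(block_no, None)
    return transferred

baseline = {0: "a", 1: "b", 2: "c"}
dest = dict(baseline)              # initial full (baseline) transfer
new_snap = {0: "a", 1: "b2", 3: "d"}  # block 1 changed, 2 deleted, 3 added
sent = incremental_update(baseline, new_snap, dest)
print(sent, dest == new_snap)      # 2 True
```

Only two of the four blocks cross the wire, while the destination still ends up as an exact replica of the new snapshot; this is what keeps replication traffic proportional to the change rate.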
Conventional NDMP (Network Data Management Protocol) backup and restoration can be slow, even when only a single file needs to be recovered from a repository. Cloud Volumes ONTAP can restore an entire volume or a single file, as needed.
Volumes can also be restored instantly using read/write copies. The restoration takes place immediately: a snapshot is created and then cloned, providing you with a read/write copy that is ready for use.
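The reason such a restore is instant follows from the same reference-sharing idea. In this minimal sketch (again a conceptual model, not the ONTAP clone mechanism), a clone starts life as a copy of the snapshot’s block references, so it is available immediately, and subsequent writes affect only the clone.

```python
def clone_from_snapshot(snapshot_blocks):
    """Return a writable copy that initially shares the snapshot's block map."""
    return dict(snapshot_blocks)  # copies references, not block contents

snapshot = {0: "alpha", 1: "beta"}   # read-only point-in-time state
restore = clone_from_snapshot(snapshot)
restore[1] = "beta-patched"          # writes land in the clone only
print(restore[0], snapshot[1])       # alpha beta
```

The snapshot stays intact as a read-only recovery point while the clone serves reads and writes right away, without waiting for a full data copy.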
In CIFS deployments, Cloud Volumes ONTAP’s snapshots enable users to see their backups using the native Windows option to restore previous versions, reducing the backup administrator’s workload.
If your database is running on a file share, using SnapCenter® with Cloud Volumes ONTAP provides a way to create consistent backups of the database with quiescence and application awareness. SnapCenter can also automatically transfer backup data to a different location, such as a secondary Cloud Volumes ONTAP instance.
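The pattern behind an application-consistent backup is worth spelling out: quiesce the database (flush and pause writes), snapshot it while frozen, then resume. The sketch below illustrates that sequence with a toy in-memory "database"; the function names are illustrative, not the SnapCenter API.

```python
import contextlib

@contextlib.contextmanager
def quiesced(db):
    """Freeze the database around a backup, resuming I/O even on failure."""
    db["state"] = "quiesced"      # flush pending writes and pause new ones
    try:
        yield db
    finally:
        db["state"] = "running"   # always resume normal operation

def consistent_backup(db, snapshots):
    with quiesced(db):
        # The snapshot captures a frozen, crash-consistent state.
        snapshots.append(dict(db["data"]))

db = {"state": "running", "data": {"orders": 42}}
snaps = []
consistent_backup(db, snaps)
print(db["state"], snaps[0])  # running {'orders': 42}
```

Without the quiesce step, a snapshot could capture a half-applied transaction; with it, every backup restores to a state the application itself considers valid.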
When it comes to backup and archiving data, the biggest concern is the cost of storing it. Since the data may not be accessed frequently, it doesn’t make economic sense to keep it stored on performant file systems. But none of the major cloud providers have an automatic and seamless way to tier data from their file service to their object storage offering. That’s where Cloud Volumes ONTAP can make a huge difference in storing file share backup and archive data.
Data tiering with Cloud Volumes ONTAP allows users to offset the costs of keeping backup and archive copies of file data for long-term retention by automatically tiering cold data to object storage on Amazon S3 or Azure Blob. This offers a significant cost advantage over storing the data in the file system. If and when the data is required for use, Cloud Volumes ONTAP automatically retrieves it to the SSD-based performance tier.
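Conceptually, cold-data tiering is a two-tier cache with demotion and transparent promotion. The following sketch models that behavior (tier names, the cooling threshold, and the data structures are all illustrative, not the Cloud Volumes ONTAP internals): blocks that haven’t been read within a cooling period move to the cheap capacity tier, and any later read pulls them back automatically.

```python
class TieredStore:
    """Toy two-tier store: hot performance tier plus cold capacity (object) tier."""

    def __init__(self, cooling_period):
        self.performance = {}   # key -> (data, last_access_time)
        self.capacity = {}      # key -> data (stands in for object storage)
        self.cooling_period = cooling_period

    def write(self, key, data, now):
        self.performance[key] = (data, now)

    def tier_cold(self, now):
        """Demote blocks whose last access is older than the cooling period."""
        for key, (data, last) in list(self.performance.items()):
            if now - last > self.cooling_period:
                self.capacity[key] = data
                del self.performance[key]

    def read(self, key, now):
        if key in self.capacity:  # transparent promotion back to hot tier
            self.performance[key] = (self.capacity.pop(key), now)
        data, _ = self.performance[key]
        self.performance[key] = (data, now)  # refresh access time
        return data

store = TieredStore(cooling_period=30)
store.write("hot-block", "h", now=0)
store.write("cold-block", "c", now=0)
store.read("hot-block", now=40)     # recent access keeps it hot
store.tier_cold(now=41)
print(sorted(store.capacity))       # ['cold-block']
print(store.read("cold-block", now=50))  # c
```

The read of the cold block succeeds without the caller doing anything special, which is the point: storage cost drops for idle backup and archive data while access stays seamless.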
Cloud Volumes ONTAP offers a fully capable solution for backup and archiving, proven through years of use. Data can be protected by taking efficient snapshots and transferring those copies to secondary locations with SnapMirror for backup and archiving, with storage efficiencies and data tiering that help limit the costs of retaining cold data.