Many companies that would like to migrate applications from on-premises to the cloud are hindered by the differences in capabilities offered by on-premises storage systems and cloud storage services. That gap has made migrating enterprise applications such as Microsoft SQL Server much more problematic. In particular, gaps in the services needed to run an enterprise application such as robust snapshots, storage efficiency capabilities, high availability, reliability, and consistent performance have forced customers to make design trade-offs or forgo application migration. With FSx for NetApp ONTAP, customers no longer need to compromise. FSx for NetApp® ONTAP® is a native, first-party AWS service sold, supported, billed, and fully managed by AWS. It uses the power of NetApp ONTAP to provide the same enterprise-grade storage and data management capabilities NetApp has provided on-premises for three decades in AWS as a managed service.
FSx for ONTAP has two types of architecture: multi-availability zone (AZ) or single-AZ deployment. As the name suggests, multi-AZ deployment offers 0 RPO across availability zones in a region, and a single-AZ deployment is a highly available offering within a single AZ in a region. FSx for ONTAP multi-AZ file systems provide a simple, highly available storage solution for operating databases that requires automatic and seamless failover between AZs. For example, a SQL server failover cluster (FCI) using shared storage is a good candidate for multi-AZ deployment so that there is no need to use third-party disk replication mechanisms.
In this blog, we provide an overview of the benefits of using FSx for ONTAP along with SQL Server on EC2 and how to use NetApp Snapshot™ technology to create database backups and perform granular restores in minutes. We focus on using NetApp SnapCenter® software (application-consistent database backup management) with SQL Server on EC2 with Amazon FSx for ONTAP covering a wide range of use cases, including data protection, high availability, and test and development.
Amazon FSx for NetApp ONTAP is the ideal file storage for SQL server deployments in AWS. Benefits include the following:
- Consistent high performance and throughput with low latency
- Intelligent caching with an NVMe cache to provide additional performance
- Flexible sizing so that you can increase or shrink capacity, throughput, and IOPs on the fly
- Efficient on-premises-to-AWS block replication
- The use of iSCSI, a well-known protocol for the database environment
- Storage efficiency features like thin provisioning and zero-footprint clones
- Backup time reduction from hours to mins, thereby reducing the RTO
- Granular backup and recovery of SQL databases with the intuitive NetApp SnapCenter UI
- The ability to perform multiple test migrations before actual migration
- Shorter downtime during migration and overcoming migration challenges with file-level or I/O-level copy
- Lowering cost by downsizing the EC2 instance and overcoming instance storage limits
- Reducing mean time to repair (MTTR) by finding the root cause after a major release or patch update
Deploying SQL databases on FSx ONTAP with the iSCSI protocol, as is commonly used on-premises, provides an ideal database storage environment with superior performance, storage efficiency, and data-management capabilities. Using multiple iSCSI sessions (assuming a 5% working set size) and fitting a FlashCache delivers over 100K IOPs with the FSx ONTAP service. This configuration provides complete control over performance for the most demanding SQL environments. SQL Server running on smaller EC2 instances connected to FSx for ONTAP can perform the same as SQL Server running on a much larger EC2 instance, because only network bandwidth limits are applied against FSx for ONTAP. Reducing the size of instances also reduces the compute cost, which provides a TCO-optimised deployment. The combination of SQL using iSCSI and FSx for ONTAP provides great advantages for SQL workloads.
Now let’s look at the use cases that improve and extensively simplify every database administrator’s operational efficiency and day-to-day management tasks.
Use case 1: Back up the database using SnapCenter.
Executing backups is a critical part of any data-protection strategy. NetApp SnapCenter provides complete GUI-based data protection of the entire database. SnapCenter can automate all backup and restore operations, and it can also automate the cloning process and manage replication for disaster recovery to another region. After placing the database and log files on the FSx for ONTAP LUNs, SnapCenter can be used to back up the databases.
The time needed to create a Snapshot copy is independent of the size of the database because a Snapshot copy does not move any physical data blocks on the file system. In addition, the use of Snapshot technology has no performance effect on the live databases. Therefore, the creation of Snapshot copies can be scheduled without considering peak activity periods. Customers typically schedule multiple online Snapshot backups during the day; for example, every six hours is common. With this approach, a 1.25TB drive can be backed up in less than a minute. This can be followed by a 15min log backup to reduce the overall RTO for recovery. The largest part of the overall backup/restore workflow runtime is the time needed to execute the quiescing or disk attach operation respectively, and this step is dependent on the load on the SQL server. The volume Snapshot backup itself always finishes in a couple of seconds.
Complete the following high-level steps to create application-consistent backups from databases running on FSx for ONTAP. This process protects your databases for quick recovery while seamlessly replicating database volumes onto another FSx for ONTAP filesystem running in another region for disaster recovery purposes.
Note: Follow the instructions found at Install the SnapCenter Server and Installing SnapCenter Plug-in for Microsoft SQL Server to install and set up SnapCenter and the associated SQL server plug-in. After you are done, complete the following steps to create application-consistent backups by creating an appropriate policy based on your SLAs.
- Log into the SnapCenter GUI.
- In the left navigation pane, click Settings.
- In the Settings page, click Policies.
- Click New.
- On the Name page, enter the policy name and description.
- On the Backup Type page, choose the backup type (full and log backup, full backup, or log backup), select the schedule type for policy (on-demand, hourly, daily, weekly, or monthly), choose the retention settings, update the replication settings, and update the script if a pre-script or post-script needs to be run.
- Click finish.
After you are finished, trigger database and log or appropriate resource-group backups with NetApp SnapCenter using the previously created backup policies. This triggers a SnapMirror update too. The volumes need to be configured for SnapMirror updates for SnapCenter to push application-consistent Snapshot copies.
In this blog post, the Backup Now option is used for demonstration.
This screenshot covers the job details for the addition of a host (SQL plug-in), the automated discovery of SQL databases, the creation of policy, and finally the full backup of demodb. As mentioned earlier, there is no data copy, no effect on the resources of the SQL Server instance, and the backup is completed in minutes. This allows you to perform multiple backups in a single day.
- Verify that the job is successful and completed.
If necessary, adjust the backup policies and replication schedule to reduce the time between jobs.
Note: NetApp recommends that you perform thorough testing prior to production deployment.
Use case 2: Restore the database using SnapCenter.
The SnapCenter restore process is quick and, depending on the database layout, automatically uses an efficient mechanism to recover from a backed-up SQL Server database by copying all the data and log files. NetApp SnapRestore® technology enables the restoration of a database to any point in time, based on currently available Snapshot copies. The restore process is finished in a few seconds, independent of the size of the database. Because several online Snapshot backups can be created during the day, the time needed for the recovery process is significantly reduced relative to a traditional once-per-day backup approach. Because you can perform a restore with a Snapshot copy that is at most only a few hours old (rather than up to 24 hours), fewer transaction logs must be applied during forward recovery. Therefore, the RTO is reduced to several minutes rather than the several hours required for conventional streaming backups.
- In the left navigation pane, click Resources.
- Select the database or resource group from the list. The Topology page is displayed.
- From the Manage Copies view, select Backups from FSx for ONTAP filesystem.
- Select the backup from the table, and then click the restore icon.
In the Restore Scope page, select:
Restore scope: Restore the database to the same host where the backup was created.
Recovery type: All log backups.
Pre-restore option: Overwrite the database with same name during restore.
Post-restore option: Operational, but unavailable for restoring additional transaction logs.
This demonstration uses the settings indicated above. Based on the SLA and RPO requirements, choose the appropriate options for your system.
The restore process happens quickly and in a granular fashion.
Note: The process is blazingly fast irrespective of the size of the database, thanks to the FSx for ONTAP SnapRestore capabilities.
Note: Make sure that there is sufficient space available in the volume before performing a restore operation.
Use case 3: Test recovery and create multiple instant copies for DevOps.
With the NetApp SnapCenter clone workflow, you can bring up SQL databases nearly instantly in AWS and verify the cloned database for consistency and data integrity. This is a great way for DevOps team to quickly verify how things will work with a patch rollout without consuming additional storage space due to the efficiency of FlexClone capability in FSx for ONTAP. Cloning and shutting down the database is a quick and easy process that you can perform as often as needed. This is completely isolated from the production environment, which means production access continues as normal while validation efforts progress on the other side. Clones can also be created using crash-consistent storage Snapshot copies.
- Use the clone workflow in NetApp SnapCenter with the latest log backup for multiple copies.
- Select the database.
- From the Manage Copies page, select the backup from either the primary or secondary (mirrored) file system.
- Select the backup, and then click Clone.
- On the Clone Options page, update the clone settings by providing the Clone server, Clone instance, Clone name, and the appropriate mount path.
- On the Logs page, select the appropriate option.
After you are done, monitor the operation’s progress by clicking Monitor > Jobs.
The previous screenshot covers cloning to the same SQL server EC2 instance.
The previous screenshot shows cloning to alternate SQL server EC2 instance.
The process is as simple as that. This is the capability that AWS and NetApp bring to the table: a reduction in development and test cycles, an improvement in the time to market, less use of storage capacity, and a minimization of administrative and management tasks.
Use case - 4: Perform disaster recovery and migration with SnapMirror
Whether you are running NetApp ONTAP on-premises or you are already using FSx for ONTAP in AWS, SnapMirror enables block-level data replication that keeps the destination file system up to date through incremental updates. SnapMirror uses NetApp Snapshot technology as the basis for its replication process, which first creates a full copy of the data and subsequently replicates only the delta, saving on transfer and storage costs. This is widely used by customers for disaster recovery purposes. Now, with FSx for ONTAP, this capability simplifies disaster recovery, migration, development, and testing from on-premises to FSx for ONTAP or between two file systems between two VPCs.
Setting up replication is the first step in executing the disaster recovery plan. Following the high-level steps described below allows operations to be available from on-premises while seamlessly replicating data into AWS FSx ONTAP.
- Make sure that the ONTAP system running on-premises can communicate with Amazon FSx for ONTAP.
Use NetApp BlueXP to create a SnapMirror relationship for the database and log volumes with the appropriate policy.
Note: The simulation covered in this blog post was performed between on-premises and the AWS virtual private cloud.
- Trigger the database and log or appropriate resource group backups with SnapCenter using the appropriate backup policies. This step also triggers a SnapMirror update.
In the event of disaster, break the mirror and attach the LUN from the target FSx for ONTAP file system to the EC2 instance running SQL Server.
Note: This can also be achieved using SnapCenter to perform disaster recovery of the SnapCenter plug-in for SQL Server.
The same SnapMirror replication can also be used for migration and validation by using the cloning capability in Use Case 3. This enables you to perform a thorough validation of a database before rolling out a patch without affecting the production database. You can also perform a thorough validation before performing a migration cutover from on-premises.
To summarize, Amazon FSx for ONTAP along with a SQL Server database enables you to run production database in AWS with high performance, data protection, and block-based disaster recovery capabilities, allowing you to avoid what used to be a long and cumbersome process.
In this blog, we provided an overview of FSx for ONTAP as a storage option for SQL Server running on an EC2 instance. We also demonstrated that FSx for ONTAP and SnapCenter simplify data protection. With an understanding of these simple workflows, you can better plan a simple, easy, and successful deployment of SQL Server on Amazon FSx for NetApp ONTAP.