Amazon Simple Storage Service (S3) is one of the most popular data storage services offered by AWS. Organizations of all sizes can benefit from making the move to S3, from small startups that need to move just a few GBs of data to large enterprises or video storage systems looking to move massive petabytes (PBs) of data.
If your organization wants to move data to S3, you should get to know the various options AWS has to offer, since each one is suited to a different scale of data.
This article will walk you through the options AWS offers for moving data to S3 so you can find the one that best suits the amount of data that you have to move.
Consider the use case of an organization that has an abundance of documents and images on their hands. In addition, the organization has to maintain all its financial documents as well as employee data for audit and statutory purposes.
This data is measured in GBs, and storing it in-house takes a huge amount of resources to ensure the right kind of storage, retrieval, durability, and availability. Moving the same data to AWS S3 solves all of those issues, at a much lower cost.
When you want to migrate a few GBs of data like this, the ideal tools for you to get to know are S3 CLI, AWS Import/Export service, Storage Gateway, and Transfer Acceleration.
The S3 CLI is a simple but effective migration tool. The process begins with creating an AWS Identity and Access Management (IAM) user identity, recommended for proper access to resources, followed by installing and configuring the AWS CLI.
From there, you write custom scripts that call the correct APIs through the SDKs, specifying the destination bucket. The Multipart Upload APIs will be your best friend in this scenario.
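As a rough sketch of what those APIs look like from the command line (the bucket and key names here are hypothetical), a low-level multipart upload proceeds in three steps; note that the higher-level `aws s3 cp` command performs these steps for you automatically on large files:

```shell
# 1. Start the upload and note the UploadId in the response
aws s3api create-multipart-upload \
    --bucket my-audit-archive --key backups/2019-q1.tar

# 2. Split the file into chunks and upload each one,
#    recording the ETag returned for every part
split -b 100M 2019-q1.tar part_
aws s3api upload-part \
    --bucket my-audit-archive --key backups/2019-q1.tar \
    --part-number 1 --body part_aa --upload-id "$UPLOAD_ID"

# 3. Finish by sending the list of part numbers and ETags
aws s3api complete-multipart-upload \
    --bucket my-audit-archive --key backups/2019-q1.tar \
    --upload-id "$UPLOAD_ID" --multipart-upload file://parts.json
```

Here `$UPLOAD_ID` and `parts.json` stand in for values returned by the earlier calls; in a real script you would capture them from the command output.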
One of the ideal cases for using the S3 CLI is when you want to continuously migrate files such as logs and backup data from an application server. For this, you can write automated scripts, or even use the AWS SDK, to upload the data to S3 at a predefined frequency.
In general, S3 also provides you with an option to sync between source and destination.
For more information on how to sync using the CLI, see the AWS documentation for the aws s3 sync command.
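For the log-shipping scenario above, a minimal sketch might combine `aws s3 sync` with a cron entry; the paths, bucket name, and schedule below are assumptions for illustration:

```shell
# Copy only new or changed files from the server's log directory to S3,
# skipping temp files and using a cheaper storage class for archives
aws s3 sync /var/log/myapp s3://my-log-archive/myapp \
    --exclude "*.tmp" \
    --storage-class STANDARD_IA

# Example crontab entry to run the same sync every night at 02:00:
# 0 2 * * * /usr/local/bin/aws s3 sync /var/log/myapp s3://my-log-archive/myapp
```

Because sync only transfers new or changed files, repeated runs stay cheap even on large log directories.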
The AWS Import/Export service enables you to transfer your data to S3 by shipping a physical device to AWS. This service is used to transfer data sets below 16 TB and is ideally suited for the following scenarios: direct data interchange, off-site backups, and disaster recovery.
For an organization that needs to perform a one-time migration, like in the use case mentioned above, Import/Export would be the right solution.
To get started, see the AWS Import/Export documentation, which walks beginners through both importing data to S3 and exporting data from S3.
The AWS Storage Gateway service is used to tap into the AWS cloud from your on-premises servers, providing you with hybrid cloud storage. It uses a multi-protocol storage appliance with highly efficient network connectivity.
There are a few scenarios in which Storage Gateway is used: File Gateway presents S3 storage over NFS or SMB; Volume Gateway provides block storage, either as Cached Volumes (primary data in S3, with frequently accessed data cached on-premises) or Stored Volumes (primary data on-premises, backed up asynchronously to S3); and Tape Gateway presents a virtual tape library backed by S3.
In the use case mentioned above, Cached Volumes would be a good solution for the organization, because every new record is stored durably on AWS while frequently accessed data stays cached locally.
AWS Storage Gateway is an ideal solution for enterprises that want to make use of the hybrid cloud. It is also helpful when enterprises want to completely replace their traditional tape library with the cloud.
Amazon S3 Transfer Acceleration enables fast, secure transfers of files over long distances between your client and your Amazon S3 bucket.
It leverages Amazon CloudFront’s edge locations: as data arrives at an AWS edge location, the data is routed to your Amazon S3 bucket over an optimized network path.
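Turning acceleration on is a per-bucket setting; a minimal sketch with the AWS CLI (the bucket and file names are hypothetical) looks like this:

```shell
# Enable Transfer Acceleration on the bucket
aws s3api put-bucket-accelerate-configuration \
    --bucket my-video-bucket \
    --accelerate-configuration Status=Enabled

# Tell the CLI to route transfers through the accelerate endpoint
aws configure set default.s3.use_accelerate_endpoint true

# Subsequent uploads now travel via the nearest edge location
aws s3 cp lecture-01.mp4 s3://my-video-bucket/videos/
```

Acceleration carries a per-GB surcharge, so it pays off mainly when clients are geographically far from the bucket's region.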
Let’s take the use case of a company that has a huge library of videos available online, including online classes that participants can access at their convenience. The challenge for this company is migrating petabytes, or even exabytes, of data.
When you want to migrate hundreds of petabytes or exabytes of data to S3, the ideal tools are Snowball and Snowmobile.
Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud.
AWS introduced Snowball to overcome the difficulties of large-scale data transfer, including high network costs, long transfer times, and security concerns.
It can cost as little as one-fifth as much as sending data over the Internet, and it is generally used for cloud migration, disaster recovery, and data center decommissioning.
Snowball is also useful when a lot of information is sent between you and your customers, clients, or business partners.
If you also want to store and process data on the device itself, Snowball Edge will interest you.
The device has built-in computing and allows multiple Snowball Edge devices to be clustered together, for example for durability or for sending data in batches.
If you are planning to move your data centers to a different location, then this service is for you. Snowmobile is used to transfer Exabytes of data to AWS.
The Snowmobile itself is a massive semi-trailer truck equipped with a storage capacity of up to 100 PB. The Snowmobile comes directly to your site, allowing your organization’s network to connect to and copy data onto the Snowmobile’s equipment.
In a matter of weeks Snowmobile can move an amount of data to a new location that would take years – even decades – to transmit over a hardline.
Of course, all the data is encrypted with AWS Key Management Service (KMS). The service also comes with strong security features such as dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and an optional escort security vehicle while in transit.
You might want to use S3 to perform data analytics on data that is stored on-premises.
That data first has to be synced to AWS S3 so its services can operate on it; afterwards, the results are sent back to your on-premises data center.
If you are an open source enthusiast who likes to build and modify solutions on your own, you can use scripts in combination with tools such as Rclone or rsync.
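With Rclone, for example, a one-time remote definition followed by a sync is enough; the remote name, credentials, and bucket below are placeholders:

```shell
# Define an S3 remote non-interactively (substitute real credentials)
rclone config create s3remote s3 \
    provider AWS region us-east-1 \
    access_key_id YOUR_KEY secret_access_key YOUR_SECRET

# Mirror a local directory into the bucket, showing progress
rclone sync /data/documents s3remote:my-archive-bucket/documents --progress
```

rsync, by contrast, cannot talk to S3 directly, so it is usually paired with a staging host or an S3-backed filesystem mount.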
There is also NetApp’s Cloud Sync, which provides precisely the hybrid cloud storage solutions you require. Cloud Sync synchronizes your data from on-premises or the cloud to S3, using NFS or CIFS shares.
By moving your data quickly and securely to S3 you can now conveniently utilize AWS services like AWS Elastic Map Reduce (EMR), Redshift, and RDS. Cloud Sync will sync your data back to its origin once your results are ready.
Cloud Sync offers a 14-day free trial.
These days, migrating data to the cloud, and especially to S3, is an increasingly popular trend thanks to the various tools AWS offers. Which tool to use depends on the needs of the organization and the size of the migration.
Sometimes, an organization just has to use its available AWS resource pool to create an optimal solution for data migration, but other times it is necessary to find a technology partner who can make the move for them, such as NetApp’s Cloud Sync.