Amazon S3 is possibly the most popular cloud-based service available on the market. This is largely because it is an easily accessible, inexpensive service for data storage. The S3 storage technology is also a platform capable of serving important use cases, providing infrastructure solutions for many companies' IT needs. But this widespread usage has led to some problems, chief among them negligently unprotected Amazon S3 storage buckets.
Without protection, information stored in open Amazon S3 buckets can be browsed by scripts and other tools. Since the information in the bucket may be sensitive, this poses a critical security risk.
The goal of this article is to raise awareness about the risks of publicly open Amazon S3 buckets, explain which file types are most critically affected, and show you how to find those buckets so you can prevent this situation with your own S3 buckets, which may be part of a Cloud Volumes ONTAP deployment on AWS.
Amazon S3 has a broad range of use cases, from backup and website data to Big Data analytics and archive storage; however, there are security risks to having such a large and diverse volume of data all in one place.
What are the security concerns, and what security settings can an admin apply to a bucket and its files?
Amazon S3 is considered a "publicly accessible platform." This means that, with the right URL and permissions, any bucket can be accessed from anywhere through HTTP requests, just as a normal browser accesses a website.
Whether or not a bucket is accessible through its URL depends on the security measures that may or may not have been enabled. What we mainly want to stress here is that S3 is not a "hidden" resource that can only be reached after passing through multiple tiers. On the contrary, it is a resource that is reachable through the AWS endpoints from anywhere on the web, and that is its main security risk: any S3 bucket, and all the data it contains, is potentially accessible.
Overall, the security checks S3 runs on a request for an S3 resource (bucket or object) to determine whether access is authorized are very solid. S3 verifies permissions at the user level and through bucket policies, bucket ACLs, and object ACLs.
Whenever a person or an application wants to write to or read from an object or a bucket, S3 first checks that the IAM user is authorized by its parent account. Once that is confirmed, it then checks bucket policies, bucket ACLs, and object ACLs before granting access.
If the user accessing the resource is not an IAM entity, S3 skips the user-level check and proceeds to the bucket policy and the bucket and object ACLs. Through the checkpoints mentioned above, security for buckets, access points, and objects in S3 can be very restrictive. However, not every bucket has these checks enabled.
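To see where your own buckets stand with respect to these checkpoints, you can query them directly. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the bucket name is hypothetical:

```python
# Inspect the same checkpoints S3 evaluates: the bucket policy and the bucket ACL.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket = "example-bucket"  # hypothetical bucket name

# Does S3 consider the bucket policy public?
try:
    status = s3.get_bucket_policy_status(Bucket=bucket)
    print("Policy is public:", status["PolicyStatus"]["IsPublic"])
except ClientError as err:
    if err.response["Error"]["Code"] == "NoSuchBucketPolicy":
        print("No bucket policy attached.")
    else:
        raise

# Does the bucket ACL grant anything to everyone (the AllUsers group)?
acl = s3.get_bucket_acl(Bucket=bucket)
for grant in acl["Grants"]:
    grantee = grant["Grantee"]
    if grantee.get("URI") == "http://acs.amazonaws.com/groups/global/AllUsers":
        print("ACL grants", grant["Permission"], "to all users")
```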
Since every bucket and object is potentially reachable from anywhere, settings that allow public access can leave a bucket or object open to the entire world. This is where the security risk lies.
Bucket policies and bucket or object ACLs can be configured to grant access to anyone. Many admins do exactly that, leaving their S3 resources open without knowing it. Of course, AWS shows prompts and warnings that emphasize this point and try to prevent this type of lapse in security, but that hasn't prevented many occurrences of sensitive data being leaked through this simple error.
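As an illustration of how little it takes, the sketch below shows the kind of bucket policy that opens every object in a bucket to the world: a single statement with a wildcard principal. The bucket name is hypothetical, and the example is meant to show what to look for, not something to apply:

```python
import json

# A bucket policy that makes every object world-readable: Principal "*" with
# s3:GetObject on all keys. Anyone who knows (or guesses) the bucket URL can
# then fetch its contents anonymously.
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",  # hypothetical bucket name
        }
    ],
}

print(json.dumps(public_read_policy, indent=2))

# Applying it would be a single call, which is exactly why this mistake is so easy to make:
# boto3.client("s3").put_bucket_policy(Bucket="example-bucket", Policy=json.dumps(public_read_policy))
```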
Over the years, quite a number of serious data leaks at major companies have taken place due to open Amazon S3 buckets. You can see a list of major S3 data leaks here. Through security lapses such as these, all kinds of sensitive information have been made publicly available, including Social Security numbers, personal photos, sales records, usernames and passwords, medical records, and credit reports.

How can open Amazon S3 buckets be found?
An S3 bucket can be accessed through its URL. The URL of a bucket follows one of two formats:

https://<bucket-name>.s3.<region>.amazonaws.com (virtual-hosted style)
https://s3.<region>.amazonaws.com/<bucket-name> (path style)
So, if someone wants to test the openness of a bucket, all they have to do is hit the bucket's URL from a web browser. A private bucket will return an "Access Denied" message, and no bucket contents will be shown. With a public bucket, however, the URL will list the first 1,000 files contained in that bucket. You need to know the bucket name in advance if you are going to test it that way.
But if you want to surf for publicly available buckets, you need a tool that runs tests against possible bucket names to check whether they even exist (a bucket name that doesn't exist will return a "NoSuchBucket" error code). There are a number of different tools for this.
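A minimal sketch of that kind of probe, using only Python's standard library, might look like the following; the candidate bucket names are hypothetical and purely illustrative:

```python
# Probe candidate bucket names anonymously and interpret the HTTP response:
# 200 means the bucket is publicly listable, 403 means it exists but is private,
# 404 means no such bucket.
import urllib.request
import urllib.error

candidates = ["payments-archive", "acme-backups", "example-bucket"]  # hypothetical names

for name in candidates:
    url = f"https://{name}.s3.amazonaws.com/"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(f"{name}: OPEN (listing returned, HTTP {resp.status})")
    except urllib.error.HTTPError as err:
        if err.code == 403:
            print(f"{name}: exists, but access is denied")
        elif err.code == 404:
            print(f"{name}: NoSuchBucket")
        else:
            print(f"{name}: HTTP {err.code}")
```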
Early S3 bucket search tools include Bucket Finder, S3 Scanner, and S3 Inspector. These tools were useful in that they could check for words contained in publicly available bucket names, but they had other issues: results included many irrelevant buckets, and the bucket contents they displayed were limited to just the first thousand files.
These issues have largely been solved by a newer tool, Grayhat Warfare. Grayhat Warfare searches AWS every two weeks and compiles a database of all the open buckets on AWS. Currently, there are 238,252 open buckets listed. Using this database, Grayhat search results can be returned very quickly, and the results are relevant, showing entire bucket contents.
Grayhat Warfare is basically an online index for open buckets and the files inside of them.
The website offers three different user levels: Free, Registered, and Premium. Depending on which level you are on, more or fewer features are available to play with:
How does Grayhat Warfare work? For our example we are going to log in as a Registered user and do a search for open buckets with the keyword “payments” anywhere in the bucket name:
The search results returned a total of eight open buckets with the word “payments” in their name:
If we do a search for the same keyword, but this time for available files inside buckets that use that word in their names, we get quite a few results:
As a final tip, one best practice would be to avoid using sensitive terms when you name your S3 buckets. For example, avoid terms such as “Customer Data,” “Credit Card Numbers,” or the like.
Another best practice recommendation is to use one common term in all your bucket names, so that you can always run that name in the Grayhat Warfare search to make sure you haven’t left any buckets unprotected. Again, it would be a good idea in this practice to avoid using revealing information, such as your company name.
When it comes to searching for public Amazon S3 buckets and files, Grayhat Warfare is probably the best tool to use, due to its design and features. If you can identify your open buckets with it, anyone else can too. Now is the time to double-check your buckets' security settings and start using the AWS S3 Block Public Access configuration to protect your data.
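As a sketch of what that remediation can look like, the following assumes boto3 and appropriate credentials, and turns on all four Block Public Access settings for every bucket in the account. Review any buckets that are intentionally public (for example, static website hosting) before running something like this account-wide:

```python
# Enable all four S3 Block Public Access settings on every bucket in the account.
import boto3

s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    s3.put_public_access_block(
        Bucket=name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,        # reject new public ACLs
            "IgnorePublicAcls": True,       # ignore any existing public ACLs
            "BlockPublicPolicy": True,      # reject new public bucket policies
            "RestrictPublicBuckets": True,  # restrict access under existing public policies
        },
    )
    print(f"Block Public Access enabled on {name}")
```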
For Cloud Volumes ONTAP on AWS users it’s important to be aware of these bucket security settings. When using Amazon S3 as an archive or when leveraging it as a cold data tier for AWS EBS, your data could become exposed if you haven’t taken the right precautions. Carefully consider the warnings outlined above and take the time now to adjust both your AWS and Cloud Volumes ONTAP security settings and review your AWS backup plans. Find out more about protecting bucket data in this article on S3 encryption.