Azure Table storage is a cloud-based NoSQL datastore you can use to store large amounts of structured, non-relational data. Azure Table offers a schemaless design, which enables you to store a collection of entities in one table. An entity contains a set of properties, and each property defines a name-value pair. Azure Table is a lightweight, easy-to-manage service, making it ideal for users just getting started with NoSQL or cloud data services.
In this post, we’ll explain the main Azure Table concepts and compare the regular Azure Table service with the newer Azure Cosmos DB Table API, as part of the Azure file storage ecosystem. We’ll also show how Azure NetApp Files can help you migrate more applications to Azure, including your business-critical workloads, with extreme file throughput and sub-millisecond response times.
This is part of a series of articles about Azure storage.
Azure Table storage is a database you can use to store NoSQL data in Azure. It stores structured, schemaless, non-relational data using a key/attribute design, which makes it a good fit for metadata and flexible datasets. A table can hold any number of entities, and each storage account can contain as many tables as fit within its storage capacity.
You can also work with table data through the Azure Cosmos DB Table API. This is a premium service that is compatible with Azure Table storage and adds automatic secondary indexes, global distribution, and throughput-optimized tables.
Azure Table storage is built on the following components:
Component | Description |
URL format | When using Azure tables, you can access data directly through the following address. This access is based on the OData protocol. Azure Table storage: http://<storage account>.table.core.windows.net/<table> |
Accounts | When accessing Azure Table storage directly, everything is managed through your storage account. When using Cosmos DB, access is managed through your Table API account. |
Table | Tables operate the same for both Azure Table storage and the Table API. These tables are collections of entities without schemas. This enables you to store multiple entities within a table with different property sets. |
Entity | Entities are sets of properties and can be thought of as rows in a database. In Azure Table storage an entity can be up to 1 MB in size; in Cosmos DB it can be up to 2 MB. |
Properties | Properties are name-value pairs within entities. You can store up to 252 properties per entity. Each entity also has three system properties: a Timestamp, a RowKey, and a PartitionKey. A RowKey uniquely identifies an entity within its partition, while a PartitionKey can be shared by many entities; together the two keys uniquely identify an entity in the table. Entities that share a PartitionKey can be queried more quickly and can be inserted or updated together in atomic operations. |
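To make these components concrete, here is a minimal Python sketch that uses only the standard library. The helper names (`validate_entity`, `entity_url`, `group_by_partition`) and the `myaccount` and `customers` values are hypothetical, chosen for illustration. The sketch assumes the 252-property limit from the table above and the OData-style key lookup form `<table>(PartitionKey='...',RowKey='...')`. It is not a substitute for an official client library.

```python
from __future__ import annotations

from collections import defaultdict
from urllib.parse import quote

SYSTEM_PROPERTIES = {"PartitionKey", "RowKey", "Timestamp"}
MAX_CUSTOM_PROPERTIES = 252  # limit quoted in the components table


def validate_entity(entity: dict) -> None:
    """Raise ValueError if keys are missing or there are too many properties."""
    for key in ("PartitionKey", "RowKey"):
        if key not in entity:
            raise ValueError(f"entity is missing {key}")
    custom = [name for name in entity if name not in SYSTEM_PROPERTIES]
    if len(custom) > MAX_CUSTOM_PROPERTIES:
        raise ValueError(f"{len(custom)} properties exceeds {MAX_CUSTOM_PROPERTIES}")


def entity_url(account: str, table: str, partition_key: str, row_key: str) -> str:
    """Build the address of a single entity (hypothetical helper)."""
    def odata_literal(value: str) -> str:
        # OData string literals escape a single quote by doubling it;
        # the result is then percent-encoded for use in a URL.
        return quote(value.replace("'", "''"), safe="")

    return (
        f"http://{account}.table.core.windows.net/{table}"
        f"(PartitionKey='{odata_literal(partition_key)}',"
        f"RowKey='{odata_literal(row_key)}')"
    )


def group_by_partition(entities: list[dict]) -> dict[str, list[dict]]:
    """Group entities by PartitionKey, the scope of atomic operations."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for entity in entities:
        validate_entity(entity)
        groups[entity["PartitionKey"]].append(entity)
    return dict(groups)


if __name__ == "__main__":
    # Two entities in the same table with different property sets.
    people = [
        {"PartitionKey": "Smith", "RowKey": "001", "Email": "a@example.com"},
        {"PartitionKey": "Smith", "RowKey": "002", "Phone": "555-0100", "Age": 41},
    ]
    print(group_by_partition(people).keys())
    print(entity_url("myaccount", "customers", "Smith", "001"))
```

Grouping by PartitionKey before writing mirrors the point made in the Properties row: only entities that share a partition can be inserted or updated together atomically.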
While Cosmos DB Table API and Azure Table storage can both provide similar functionality, the two services are not identical. Below you can learn how these services differ and the capacities of each.
Performance
With Azure Table storage, there is no upper bound on the latency of your operations. In contrast, Cosmos DB is designed to keep read/write latency under 10 milliseconds.
With Azure Table, throughput is limited to 20,000 operations per second, while Cosmos DB supports up to 10 million operations per second. Additionally, Cosmos DB automatically indexes properties, which queries can use to improve performance.
Global distribution
You can use Azure Table in a single region with a secondary, read-only region for increased availability. In contrast, with Cosmos DB you can distribute your data across up to 30 regions. Automatic, global failover is included and you can choose between five consistency levels for your desired combination of throughput, latency, and availability.
Consistent API
You can use the same API with both Azure Table and Cosmos DB. Software development kits (SDKs) are available, as well as a generic REST API. Cosmos DB additionally offers a superset of functionality through extra methods. Because the API is shared, you can easily transfer data between Azure Table and Cosmos DB.
Billing
Billing in Table storage is determined by the volume of data you store. Pricing is per GB and depends on your selected redundancy level; the per-GB rate decreases as your stored volume grows. You are also charged for the operations you perform, priced per 10,000 operations.
Billing in Cosmos DB is determined by throughput, measured in request units (RUs). Your database is provisioned in increments of 100 RU per second and you are billed hourly for those units. You are also billed for storage per GB, at a higher rate than Table storage.
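The two billing models can be compared with a rough back-of-the-envelope calculation. The Python sketch below is illustrative only: the function names are hypothetical, every price passed in is a made-up placeholder rather than a real Azure rate, and it ignores volume tiers, redundancy options, and any minimum throughput a service may impose.

```python
import math


def table_storage_cost(stored_gb, operations, price_per_gb, price_per_10k_ops):
    """Storage charge plus a charge per block of 10,000 operations."""
    return stored_gb * price_per_gb + (operations / 10_000) * price_per_10k_ops


def provisioned_rus(required_rus_per_second):
    """Round a throughput requirement up to the next 100 RU/s increment."""
    return math.ceil(required_rus_per_second / 100) * 100


def cosmos_cost(required_rus_per_second, stored_gb, hours,
                price_per_100_rus_hour, price_per_gb):
    """Hourly charge for provisioned throughput plus a storage charge."""
    units = provisioned_rus(required_rus_per_second) / 100
    return units * price_per_100_rus_hour * hours + stored_gb * price_per_gb


if __name__ == "__main__":
    # Placeholder prices: substitute the current rates for your region.
    print(table_storage_cost(stored_gb=100, operations=5_000_000,
                             price_per_gb=0.05, price_per_10k_ops=0.0004))
    print(cosmos_cost(required_rus_per_second=950, stored_gb=100, hours=730,
                      price_per_100_rus_hour=0.008, price_per_gb=0.25))
```

The useful part is the shape of the two formulas: Table storage cost scales with data volume and operation count, while Cosmos DB cost scales with the throughput you provision, whether or not you consume it.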
When using Table storage, there are a few tips you can apply to optimize your performance. These tips can help offset some of the performance limitations in comparison to Cosmos DB, enabling you to select the cheaper of the two options.
Targets for data operations
If you are expecting increased traffic to your Azure Table database, try to ramp up slowly whenever possible. Although the service automatically load balances, sudden bursts of traffic may cause lag. Scaling with Table storage is not immediate and your workloads may experience timeouts or throttling while load balancing is adjusted.
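One way to picture a gradual ramp-up is as a schedule of increasing target request rates that a load generator or job scheduler steps through. The Python sketch below is an assumption about how such a schedule could be produced; the `ramp_schedule` name, the growth factor, and the example rates are hypothetical and not a recommendation from Azure.

```python
def ramp_schedule(start_rate, target_rate, growth=1.5):
    """Return increasing request rates (ops/s) from start_rate to target_rate.

    Each step multiplies the previous rate by `growth`; the final step is
    always exactly target_rate.
    """
    if start_rate <= 0 or target_rate < start_rate or growth <= 1:
        raise ValueError("need 0 < start_rate <= target_rate and growth > 1")
    rates = []
    rate = start_rate
    while rate < target_rate:
        rates.append(rate)
        rate = rate * growth
    rates.append(target_rate)
    return rates


if __name__ == "__main__":
    # Hold each rate for a while (for example, a few minutes) before moving on.
    for step, rate in enumerate(ramp_schedule(1_000, 8_000, growth=2), start=1):
        print(f"step {step}: target {rate} ops/s")
```

How long to hold each step depends on your workload; the point is only that traffic grows in stages rather than jumping straight to the peak.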
Network throughput
When you access Table storage from on-premises applications, or from applications that require high throughput, the bottleneck is often the client. To avoid this, you can select larger Azure instances or use clustered machines, which provide greater network capacity.
Location
To minimize latency, you should try to place your client and database in the same region in Azure. This has the added benefit of eliminating bandwidth costs since data transfers within a region are free.
If you are using applications that are hosted outside of Azure, try to store your database in the closest region to where the applications are hosted. For distributed apps, you may want to consider using multiple storage accounts, one per region of distribution. This method works best if your data is regionally unique, for example, only required by users within a specific region.
Unbounded parallelism
Parallelism can help you improve Azure Table storage performance, but you should take care to limit the number of threads you allow. This applies to parallel requests that upload or download data, that access multiple items in a single partition, and that access multiple partitions in the same account.
Limiting parallelism helps prevent your client from exceeding its capabilities and your storage account from reaching its scalability limits. It also reduces your chances of experiencing throttling or increased latency.
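A bounded worker pool is one simple way to cap parallelism on the client. The Python sketch below uses the standard library's `ThreadPoolExecutor`; the `run_bounded` helper and the `fake_read` stand-in for a real Table storage request are hypothetical, and the right value for `max_workers` depends on your client and workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(operation, items, max_workers=8):
    """Apply `operation` to every item using at most `max_workers` threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order; the pool never runs more
        # than max_workers calls at the same time.
        return list(pool.map(operation, items))


def fake_read(row_key):
    """Stand-in for a real request to the table service (assumption)."""
    time.sleep(0.01)  # simulate network latency
    return {"RowKey": row_key}


if __name__ == "__main__":
    rows = run_bounded(fake_read, [f"{i:04d}" for i in range(100)], max_workers=4)
    print(len(rows), "entities read with at most 4 concurrent requests")
```

Capping the pool size keeps the number of in-flight requests predictable regardless of how many items are queued.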
Client libraries and tools
To ensure performance, try to use the latest tools and client libraries provided by Azure, including the tooling for Azure CLI and PowerShell. These libraries are tuned for performance and are kept up to date with the latest versions of the service.
Azure NetApp Files is a Microsoft Azure file storage service built on NetApp technology, giving you the file capabilities in Azure that even your core business applications require.
It brings enterprise-grade data management and storage to Azure so you can manage your workloads and applications with ease, and move all of your file-based applications to the cloud.
Azure NetApp Files solves availability and performance challenges for enterprises that want to move mission-critical applications to the cloud, including HPC, SAP, Linux, Oracle, and SQL Server workloads, Windows Virtual Desktop, and more.
In particular, Azure NetApp Files allows you to migrate more applications to Azure, including your business-critical workloads, with extreme file throughput and sub-millisecond response times.
Please schedule time to speak with one of our specialists if you have specific questions.