As the version control repository of choice for many thousands of software development teams, GitLab provides a unified platform for managing source code, executing Continuous Integration/Continuous Delivery (CI/CD) pipelines, issue tracking, sharing wiki-based project documentation, and much more. The majority of these development teams cannot function without ready access to GitLab services.
Because GitLab is so vital to the software development lifecycle, it must always be available. And because the data stored by GitLab is the intellectual property of an organization, protecting this data against localized or site-wide storage failures is of paramount concern. Deploying GitLab storage on NetApp’s Cloud Volumes ONTAP addresses both requirements.
In this article we will look at the pros and cons of the various storage options for deploying GitLab HA in AWS, and show how the enterprise data management features of Cloud Volumes ONTAP make it the ideal solution.
GitLab is an all-in-one solution for developing software, including Git source control, merge requests, CI/CD pipelines, issue tracking, wikis, and much more. Due to its centrality in the software development process, access to GitLab is vital, with performance being a prime concern. For large teams, this usually means scaling the deployment across many servers.
GitLab supports both single and multi-node deployments that can process hundreds of concurrent requests per second and handle thousands of active users. In a single-node environment, all of the constituent services used by GitLab, such as PostgreSQL, Redis, and Consul, run on the same machine. To scale to a larger user base of twenty thousand or fifty thousand users, multiple nodes can be used for each of these services.
For GitLab storage there are a number of different options available.
Data access in a GitLab cluster is I/O intensive for both reads and writes, so a high-performance storage solution is required to deliver the best user experience. In addition to Git repositories, GitLab needs storage for media files, uploads, and build artifacts, which are supplemental to the source code. GitLab’s recommendation is to keep this supplemental data in some form of object storage.
In this section, we will explore the available options for managing your GitLab data, and will review performance characteristics, features, and support for GitLab high availability.
GitLab administrators can deploy data storage for source code repositories and other data by means of block storage, such as Amazon EBS, which would be either directly mounted to the GitLab server, or accessed via the Gitaly service on a separate machine. This provides flexibility in terms of the type and performance of the disks used, such as General Purpose SSD (gp2) or Provisioned IOPS SSD (io1). Also, Amazon EBS storage is redundant within an Availability Zone, and so there is built-in redundancy within the provisioned storage.
Gitaly does not currently support high availability, which can be a limiting factor when deploying GitLab HA. It should also be noted that while Amazon EBS storage is redundant within an Availability Zone, extending this redundancy across Availability Zones or across regions would require some form of custom replication.
Amazon S3 storage provides cost-effective and highly durable object storage within the AWS cloud, making it an ideal solution for many different types of data storage, such as backups, media files, data archives, etc. Though GitLab can also make use of Amazon S3 for these types of files, it cannot be used for GitLab’s main data stores—i.e., the source code repositories—which means that another solution must be used alongside Amazon S3 for this purpose.
The use of NFS storage is both widespread and widely understood, making it a very good option for deploying shared data storage. As the same file system can be mounted by multiple servers concurrently, NFS makes it easy to scale out and support high availability. GitLab can make direct use of NFS storage for both repository data and object storage.
Building your own NFS server cluster that both meets the strict performance requirements of GitLab and provides high availability can be prohibitively complex. The managed alternative, Amazon EFS, is strongly discouraged by GitLab due to performance issues. To make use of NFS storage, you therefore need an enterprise-grade, highly available, and high-performance NFS solution in the cloud, which is precisely what NetApp’s Cloud Volumes ONTAP delivers.
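The trade-offs discussed above can be summarized in a small Python sketch. The class, field names, and true/false values are illustrative assumptions that paraphrase this article's discussion; they are not an official GitLab or AWS support matrix. In particular, treating Amazon EBS as "not shared across nodes" is a simplification of the direct-mount case described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    """Hypothetical summary of one storage option, as characterized in this article."""
    name: str
    holds_repositories: bool   # can store Git repository data
    holds_objects: bool        # can store uploads, media files, build artifacts
    shared_across_nodes: bool  # can be accessed by several GitLab servers at once


# Assumed values, paraphrasing the text above (not an authoritative matrix).
OPTIONS = [
    StorageOption("Block storage (Amazon EBS)", True, True, False),
    StorageOption("Object storage (Amazon S3)", False, True, True),
    StorageOption("NFS", True, True, True),
]


def candidates_for_ha_repositories(options):
    """Return the names of options that can hold repositories AND be shared by multiple nodes."""
    return [
        option.name
        for option in options
        if option.holds_repositories and option.shared_across_nodes
    ]


if __name__ == "__main__":
    print(candidates_for_ha_repositories(OPTIONS))
```

Under these assumptions, only NFS satisfies both conditions, which is why the remaining question is how to obtain an NFS service that is fast and highly available enough for GitLab.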
Using NetApp Cloud Volumes ONTAP, you can take advantage of NetApp storage solutions within AWS, Azure, or Google Cloud, building on the native compute and storage resources provided by each cloud environment. This substantially improves the flexibility, performance, high availability, and cost effectiveness of deploying cloud storage over using native cloud storage resources directly.
NetApp is a recognized industry leader for NFS solutions, and Cloud Volumes ONTAP allows you to deploy GitLab on this same technology. In AWS, you can choose any of the available Amazon EBS disk types for your new NFS shares, and even combine them within a RAID group for extra performance. Cloud Volumes ONTAP can also be deployed on a variety of Amazon EC2 instance types, including General Purpose, Compute Optimized, and Memory Optimized.
Cloud Volumes ONTAP includes many features that support performance, data protection, high availability, and DevOps workflows for a GitLab deployment.
Building out a highly scalable GitLab HA deployment with multiple GitLab application servers requires an enterprise-grade shared storage solution. Whereas other NFS solutions, such as Amazon EFS, can struggle to meet the strict performance requirements of GitLab, NetApp’s Cloud Volumes ONTAP is a good fit for this use case, and provides additional benefits in terms of data protection and high availability.
To find out more about Cloud Volumes ONTAP hands on, start a 30-day free trial today.