Amazon Web Services (AWS) monitoring is a set of practices you can use to verify the security and performance of your AWS resources and data. These practices rely on various tools and services to collect, analyze, and present data insights. You can then use these insights to identify vulnerabilities and issues, predict performance, and optimize configurations.
This is part of an extensive series of guides about performance testing.
There are multiple services and utilities available from AWS that you can use to monitor your systems and access. Some of these tools are included in existing services, while others are available for additional costs.
CloudTrail is a service that you can use to track events across your account. The service automatically records event logs and activity logs for your services and stores the data in S3. Collected data includes user identities, traffic origin IPs, and timestamps. You can view all management events for free for the most recent 90 days. Data events and insights based on your data are also available for an additional fee.
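To make the collected fields concrete, here is a minimal Python sketch that summarizes a CloudTrail log file. CloudTrail log files are JSON documents with a top-level `Records` list. The sample record below is invented for illustration, and `summarize_events` is a hypothetical helper, not part of any AWS SDK.

```python
import json
from typing import Dict, List

# Invented sample in the shape of a CloudTrail log file (top-level "Records" list).
SAMPLE_LOG = """
{
  "Records": [
    {
      "eventTime": "2024-01-15T12:34:56Z",
      "eventSource": "s3.amazonaws.com",
      "eventName": "DeleteBucket",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "203.0.113.10",
      "userIdentity": {"type": "IAMUser", "userName": "alice"}
    }
  ]
}
"""


def summarize_events(raw_log: str) -> List[Dict[str, str]]:
    """Extract who did what, from where, and when for each record."""
    records = json.loads(raw_log).get("Records", [])
    return [
        {
            "when": r.get("eventTime"),
            # Not every identity type carries a userName, so fall back to "unknown".
            "who": r.get("userIdentity", {}).get("userName", "unknown"),
            "from_ip": r.get("sourceIPAddress"),
            "action": f'{r.get("eventSource")}:{r.get("eventName")}',
        }
        for r in records
    ]


if __name__ == "__main__":
    for event in summarize_events(SAMPLE_LOG):
        print(event)
```

In practice, you would read the log files that CloudTrail delivers to your S3 bucket rather than an inline string.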
CloudWatch is a service you can use to aggregate, visualize, and respond to service metrics. CloudWatch has two main components: alarms, which create alerts according to thresholds for single metrics, and events, which can automate responses to metric values or system changes.
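The idea of a threshold alarm on a single metric can be sketched in a few lines of Python. This is a simplified model, not CloudWatch's actual evaluation logic: it assumes the alarm fires only when every one of the most recent `evaluation_periods` datapoints exceeds the threshold. The function name `alarm_state` is hypothetical; the three state names match the alarm states CloudWatch reports.

```python
from typing import List


def alarm_state(datapoints: List[float], threshold: float, evaluation_periods: int) -> str:
    """Return a state for a single metric under a simplified threshold model."""
    if len(datapoints) < evaluation_periods:
        # Not enough history to decide either way.
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Fire only if every recent datapoint breaches the threshold.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


if __name__ == "__main__":
    cpu_percent = [50.0, 85.0, 90.0, 95.0]
    print(alarm_state(cpu_percent, threshold=80.0, evaluation_periods=3))
```

Requiring several consecutive breaches is a common way to avoid alerting on short spikes.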
Certificate Manager is a tool you can use to provision, manage, and apply transport layer security (TLS) and secure sockets layer (SSL) certificates. These certificates are used to prove your services or devices' authenticity and enable you to secure network connections.
EC2 Dashboard is a monitoring tool for the Amazon EC2 virtual machine service. You can use this dashboard to monitor and maintain your EC2 instances and infrastructure. The dashboard lets you view instance states and service health, manage alarms and status reports, view scheduled events, and assess volume and instance metrics.
In addition to native tools, many AWS users also adopt third-party tools. These tools are useful for separating monitoring operations from your primary resources and can often provide support for hybrid or on-premises resources as well.
NetApp Cloud Insights is a monitoring tool that you can use to visualize your infrastructure. It enables you to monitor, optimize, and troubleshoot resources in public and private clouds and on-premises. Cloud Insights includes features for conditional alerting, optimization recommendations, predictive analytics, machine-learning-based anomaly detection, and compliance auditing.
AppOptics is a tool that you can use to supplement metrics collected by CloudWatch. It enables you to track performance statistics, log trends, and capacity limits. You can integrate AppOptics with other AWS services and generate automatic analyses of your operations. AppOptics also includes features that enable you to monitor multiple AWS accounts from a single interface.
ZenPack is an open source tool you can use to aggregate CloudWatch metrics and external resource metrics data. It includes an easy-to-use graphical user interface (GUI) and is compatible with a variety of AWS services. These services include S3, Amazon Virtual Private Cloud (VPC), and Amazon Suite.
Zabbix is an open source tool for collecting metrics from AWS and a variety of other applications, services, and databases. It includes features for dashboards and alert escalation, and it is backed by a robust online support community. The downside of Zabbix is that it cannot import data or generate performance reports.
Weave Scope is an open source tool you can use to monitor and visualize your microservices. It includes features for service discovery and is compatible with Elastic Container Services (ECS). Weave Scope is based on three components (an interface, an app, and a probe) and enables you to troubleshoot service performance in real time.
Before introducing monitoring into your pipeline or making changes to your existing workflow, you should carefully assess your existing infrastructure, tooling, resources, and skillset. Taking the time to assess your situation can help you develop a strategy that suits your needs.
Step 1: assessment questions
Here are key questions to ask when assessing your AWS monitoring needs:
Step 2: develop a strategy to tag AWS resources
Once you gain insight into your current monitoring needs and prioritize metrics, you can start developing a strategy for tagging AWS resources. Tags help you keep track of your resources, and monitor usage and behavior.
If you don’t have a tagging system in place, it can take some time to figure out how to organize resources. While every project and organization is unique, it is important to create a tagging system that can be used by a wide variety of professionals and collaborators. This way, all relevant parties can gain access to monitoring insights when needed.
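One way to keep a tagging system consistent is to check resources against a list of required tag keys. The sketch below is illustrative only: the required keys (`Owner`, `Environment`, `CostCenter`), the resource IDs, and the helper names are all assumptions, and the input is a plain dictionary rather than data fetched from AWS.

```python
from typing import Dict, List

# Hypothetical tagging policy: every resource must carry these keys.
REQUIRED_TAG_KEYS = {"Owner", "Environment", "CostCenter"}


def missing_tags(resource_tags: Dict[str, str]) -> List[str]:
    """Return the required tag keys that a resource lacks, sorted by name."""
    return sorted(REQUIRED_TAG_KEYS - set(resource_tags))


def audit(resources: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each non-compliant resource ID to its missing tag keys."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


if __name__ == "__main__":
    inventory = {
        "i-0aaa": {"Owner": "data-team", "Environment": "prod", "CostCenter": "42"},
        "i-0bbb": {"Owner": "web-team"},
    }
    print(audit(inventory))
```

Running a check like this on a schedule can surface untagged resources before they become invisible to your monitoring.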
After assessing your needs and setting up a tagging system for AWS resources, you can look for a solution that suits your needs. Often, it is effective to start with a simple solution and then expand as needed. However, if you know in advance that you need a robust set of features, it is best to choose a solution that already meets your criteria or that can scale easily to meet them.
Step 3: start simple with Amazon CloudWatch
CloudWatch metrics can help you monitor practically any AWS resource. CloudWatch provides a wide range of pre-built metrics, such as CPUUtilization for EC2 and DiskQueueDepth for RDS. Some AWS services, such as RDS and EC2, can provide additional metrics when integrated with CloudWatch.
CloudWatch counters enable you to create dashboards, which you can leverage when you need visualized data. In addition to counters and dashboards, CloudWatch offers an alerting system, which lets you know when incidents occur. If you are not using a dedicated monitoring system, and you need simple features, you can use CloudWatch.
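As a starting point, the following Python sketch builds the parameters for a simple CPU alarm on one EC2 instance. The parameter names follow the CloudWatch `PutMetricAlarm` API, but the alarm name, the 80% threshold, the five-minute period, and the helper `build_cpu_alarm` are assumptions chosen for illustration.

```python
from typing import Any, Dict


def build_cpu_alarm(instance_id: str, threshold_percent: float = 80.0) -> Dict[str, Any]:
    """Build PutMetricAlarm parameters for average CPU on one instance."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per datapoint (assumed value)
        "EvaluationPeriods": 2,  # consecutive periods to evaluate (assumed value)
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


if __name__ == "__main__":
    print(build_cpu_alarm("i-0123456789abcdef0"))
```

With boto3 installed and credentials configured, these parameters could be passed to `boto3.client("cloudwatch").put_metric_alarm(**build_cpu_alarm("i-0123456789abcdef0"))`. Keeping the parameters in a plain function makes the alarm definition easy to review and reuse across instances.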
Step 4: leverage best-of-breed solutions
When it comes to visibility, the more resource types you monitor, the more you can ensure the performance and safety of your assets. However, not all monitoring systems can provide visibility for all resources. Some monitoring solutions are designed for infrastructure while others are built for network traffic.
To avoid losing visibility over parts of your environment, you can either use a stack of tools or extend the capabilities of existing systems. If you opt to use a stack of monitoring tools, you should first check that the tools provide the features you require and are compatible with each other and with your existing stack.
Additionally, you should consider adding a tool to centralize the stack, so that working across multiple tools does not reduce your team's productivity. If you choose to extend existing systems by installing plugins or integrating with APIs, you should enable AWS integration and ensure that each extension complies with the regulatory requirements that apply to you.
Once you set up your monitoring solution or stack, you should decide which logs you want to capture and how you want to set this up. Logs are highly effective for keeping track of compliance requirements and troubleshooting issues.
Here is a list of logs you might want to capture:
The majority of monitoring systems are suited for either metrics or logs, rather than handling both equally well. To ensure full coverage, you should either use a stack or find a solution that enables you to capture both metrics and logs from AWS.
When monitoring your AWS resources, the following best practices can help you ensure that no resources are overlooked and that you can troubleshoot efficiently.
Production deployments in AWS are typically too large and dynamic to monitor manually. The volume of metrics and log data that is generated is too large for humans to efficiently analyze. To ensure that critical data is not missed and responses are timely, you should use automation to handle most of your monitoring tasks.
Prioritizing monitoring tasks helps ensure that critical services remain operational and that data remains protected. Additionally, prioritizing alerts or alert categories helps ensure that IT teams effectively distribute their time and efforts.
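A minimal sketch of alert prioritization, assuming a hypothetical three-level severity scheme and alerts represented as plain dictionaries: alerts are ordered by severity first and then by the time they were raised, so the oldest critical alert is handled first.

```python
from typing import Dict, List

# Hypothetical severity levels; a lower rank is handled first.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def prioritize(alerts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort alerts by severity, then by ISO 8601 timestamp (oldest first)."""
    return sorted(
        alerts,
        key=lambda a: (
            # Unknown severities sort after all known ones.
            SEVERITY_RANK.get(a["severity"], len(SEVERITY_RANK)),
            a["raised_at"],
        ),
    )


if __name__ == "__main__":
    queue = [
        {"name": "disk-80-percent", "severity": "warning", "raised_at": "2024-01-15T10:00:00Z"},
        {"name": "db-unreachable", "severity": "critical", "raised_at": "2024-01-15T10:05:00Z"},
        {"name": "deploy-finished", "severity": "info", "raised_at": "2024-01-15T09:00:00Z"},
    ]
    for alert in prioritize(queue):
        print(alert["severity"], alert["name"])
```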
Monitoring data should be used to respond to issues like potential service interruptions proactively. It is much easier to scale resources or throttle traffic in advance than manage a service outage. Additionally, addressing potential issues early on can help you avoid wasted resources and costs.
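One simple way to act before a limit is reached is to extrapolate a metric's recent trend. The sketch below fits a least-squares line to equally spaced samples and estimates how many sampling periods remain until a threshold is crossed. It assumes the trend is roughly linear, which will not hold for every workload, and the function name `periods_until_threshold` is hypothetical.

```python
from typing import List, Optional


def periods_until_threshold(samples: List[float], threshold: float) -> Optional[float]:
    """Estimate sampling periods until the fitted trend line reaches the threshold.

    Returns None when there are too few samples or the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_index = (threshold - intercept) / slope
    # Measure from the latest sample; 0.0 means the trend line is already at the limit.
    return max(0.0, crossing_index - (n - 1))


if __name__ == "__main__":
    disk_used_percent = [60.0, 62.0, 64.0, 66.0]
    print(periods_until_threshold(disk_used_percent, threshold=80.0))
```

If the estimate falls below your lead time for scaling, that is a signal to add capacity or throttle traffic now rather than during an outage.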
Cloud environments are flexible and can enable you to experiment with configuration changes without affecting services. When optimizing based on metrics, take time to test your configurations. This way, you can verify that changes are actually more efficient before implementing them in production.
NetApp Cloud Insights is an infrastructure monitoring tool that gives you visibility into your complete infrastructure. With Cloud Insights, you can monitor, troubleshoot and optimize all your resources including your public clouds and your private data centers.
Cloud Insights helps you find problems fast, before they impact your business. It helps you optimize usage so you can defer spend and do more with limited budgets, detect ransomware attacks before it is too late, and easily report on data access for security compliance auditing.
In particular, NetApp Cloud Insights lets you automatically build topologies, correlate metrics, detect greedy or degraded resources, and alert on anomalous user behavior.
Start a 30-day free trial of NetApp Cloud Insights. No credit card required.
This article explains what AWS monitoring dashboards are and what components they include, provides two tutorials for creating dashboards, and highlights some best practices.
This article explains what CloudWatch monitoring is, how CloudWatch works, and which key concepts to know in CloudWatch, and highlights a few metrics to watch for EBS and EC2.
This article explains what CloudWatch Logs Insights is, how to get log data to the service, what the syntax for queries is, and how to perform a sample query.
Read More: CloudWatch Log Insights: Ultimate Quick Start Guide
In this article you’ll learn how to find underperforming resources in EBS, how to evaluate your resource use, and how to apply metrics to improve your resource efficiency.
Read More: Monitoring the Costs of Underutilized EBS Volumes
Monitoring cloud environments can be quite different from monitoring on-premises ones. Cloud environments are dynamic, highly distributed, and inherently more vulnerable to cyber threats. To ensure that you are applying the proper strategies when monitoring your cloud resources, it is important to follow best practices.
This article explains what AWS monitoring best practices are and how monitoring in AWS works, and highlights 6 best practices for ensuring effective monitoring in AWS.
Read More: 5 AWS Monitoring Best Practices You Must Know