Introduction to AWS CloudWatch

    The public cloud offers many business benefits, such as the ability to deploy critical applications that are highly resilient and scalable across the world, easily and efficiently.  This is difficult to achieve in an on-premises datacentre.  Whether it’s a ‘lift and shift’ migration of existing workloads to the cloud or a new application deployment, monitoring is a key part of operations.

    Amazon Web Services (AWS) offers a service just for this called Amazon CloudWatch.  A move to the cloud can be a big shift in mindset for organisations and administrators, but CloudWatch brings many of the monitoring features an administrator already expects.

    Amazon CloudWatch can be configured for infrastructure services such as Elastic Compute Cloud (EC2) and Elastic Block Store (EBS), as well as for other AWS services such as Lambda.  It can all be automated and configured via the API, so monitoring is added and configured along with the application, which is a key part of operating in the cloud.

    Amazon CloudWatch offers real-time monitoring for resources and running applications.  Alarms can be configured to alert the administrator on defined metrics, and changes can be made automatically based on those alarms.  For instance, EC2 instances can be monitored and, if a metric such as CPU utilisation reaches a set threshold, additional EC2 instances can be provisioned to handle the load.

    Monitoring

    Out of the box, Amazon CloudWatch provides some powerful metrics without the need for much configuration.  Further metrics can be collected if required by installing the CloudWatch agent on an instance.  For example, out-of-the-box EC2 instance monitoring includes CPU utilisation, but more detailed information about CPU usage requires the agent.  The CloudWatch agent can even be used on on-premises Linux and Windows servers in hybrid environments.

    Monitoring can be configured as basic or detailed for instances, which decides how often the metrics are collected: with basic monitoring it’s every 5 minutes and with detailed monitoring it’s every 1 minute.  Basic monitoring, however, is free of charge, giving the customer flexibility.
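
    To make the difference between the two modes concrete, here is a minimal sketch of the arithmetic.  The function names are made up for illustration and nothing here calls AWS; it only uses the 5-minute and 1-minute intervals quoted above.

```python
# Collection intervals quoted in the text, in seconds.
MONITORING_PERIOD_SECONDS = {"basic": 300, "detailed": 60}


def datapoints_per_day(mode: str) -> int:
    """Number of datapoints a single metric produces per day in this mode."""
    return 24 * 60 * 60 // MONITORING_PERIOD_SECONDS[mode]


def minimum_alarm_window_seconds(mode: str, evaluation_periods: int) -> int:
    """Shortest window an alarm can evaluate when its period equals the
    collection interval and it needs `evaluation_periods` datapoints."""
    return MONITORING_PERIOD_SECONDS[mode] * evaluation_periods
```

    With these numbers, a two-period alarm looks at a 10-minute window under basic monitoring and a 2-minute window under detailed monitoring.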

    Each AWS service has its own considerations when it comes to CloudWatch.  For instance, how often EBS data is reported depends on the volume type selected: Provisioned IOPS SSD (io1) volumes send data automatically every 1 minute, whereas General Purpose SSD (gp2) volumes send data every 5 minutes.

    AWS Lambda, the logic layer of the “serverless” platform, allows the customer to run applications and services without having to provision or monitor the underlying servers.  CloudWatch can still be used to monitor Lambda and collect metrics on the number of times functions are invoked, the number of failures and the length of time the function code takes to execute, which helps in predicting costs.
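
    As a rough sketch of how those Lambda metrics might be queried and turned into a cost input, the snippet below builds the arguments for a CloudWatch `get_metric_statistics` request and estimates compute usage in GB-seconds.  The helper names and the function name `my-function` are hypothetical, and the estimate ignores per-request charges and billing rounding, so treat it as an approximation only.

```python
from datetime import datetime, timedelta, timezone


def lambda_metric_query(function_name: str, metric_name: str,
                        hours: int = 24, period: int = 3600,
                        statistic: str = "Sum") -> dict:
    """Build keyword arguments for a get_metric_statistics call
    against the AWS/Lambda namespace (no request is sent here)."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,  # e.g. "Invocations", "Errors", "Duration"
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": start,
        "EndTime": end,
        "Period": period,
        "Statistics": [statistic],
    }


def estimated_gb_seconds(invocations: int, average_duration_ms: float,
                         memory_mb: int) -> float:
    """Approximate compute usage: invocations x seconds x configured GB."""
    return invocations * (average_duration_ms / 1000.0) * (memory_mb / 1024.0)


# Assumed usage with boto3 installed and credentials configured:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   stats = cloudwatch.get_metric_statistics(
#       **lambda_metric_query("my-function", "Invocations"))
```

    Multiplying the GB-seconds figure by the current per-GB-second price for the region gives a first-order view of the compute cost.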

    Other key services such as Amazon RDS, Elastic Load Balancing (ELB) and Route 53, to name a few, can also send metrics to CloudWatch.  As well as services, billing can be monitored to help control unwanted costs.

    The full list of services and individual metrics can be found in the Amazon CloudWatch documentation.

    CloudWatch Alarms

    As well as monitoring, alarms can be configured both to alert the operations team and to automate responses.  Thresholds can be configured for monitored metrics; if a monitored metric falls outside its threshold, an alarm is triggered and one or more actions can be performed.  Metrics can have multiple alarms configured with multiple actions.

    Actions can include initiating EC2 actions, Auto Scaling actions or sending notifications to Amazon SNS topics.  Standard AWS metrics have standard resolution by default, where an alarm period is set in multiples of 60 seconds.  Custom metrics can be created with high resolution, meaning alarms can be configured with periods as short as 10 or 30 seconds, giving a more immediate insight into the application.  This does, however, come at a higher cost.
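
    For illustration, a high-resolution custom metric is published by setting `StorageResolution` to 1 on the datapoint.  The sketch below only builds the datapoint dictionary; the helper name, the `MyApp` namespace and the `QueueDepth` metric are made-up examples, not anything defined by AWS.

```python
def custom_metric_datum(name: str, value: float, unit: str = "Count",
                        high_resolution: bool = False,
                        dimensions: dict = None) -> dict:
    """Build one MetricData entry for a put_metric_data request."""
    return {
        "MetricName": name,
        "Dimensions": [{"Name": k, "Value": v}
                       for k, v in (dimensions or {}).items()],
        "Value": float(value),
        "Unit": unit,
        # 1 = high resolution (per-second), 60 = standard resolution.
        "StorageResolution": 1 if high_resolution else 60,
    }


# Assumed usage with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="MyApp",
#       MetricData=[custom_metric_datum("QueueDepth", 12,
#                                       high_resolution=True,
#                                       dimensions={"Service": "checkout"})])
```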

    A simple example is an alarm that sends an SNS notification when CPU utilisation is above 90% for 2 consecutive intervals of the basic monitoring time of 5 minutes.
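
    That alarm (CPU above 90% for two 5-minute periods, notifying an SNS topic) could be expressed in code roughly as follows.  This is a sketch rather than a definitive setup: the helper names, the alarm name and the instance ID and topic ARN you pass in are assumptions, and `would_alarm` is only a simplified local model of the evaluation that ignores missing-data handling.

```python
def cpu_alarm_params(instance_id: str, sns_topic_arn: str,
                     threshold: float = 90.0, period: int = 300,
                     evaluation_periods: int = 2) -> dict:
    """Build keyword arguments for a put_metric_alarm request."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period,                        # 5 minutes (basic monitoring)
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def would_alarm(datapoints: list, threshold: float = 90.0,
                evaluation_periods: int = 2) -> bool:
    """Simplified model: True when the most recent `evaluation_periods`
    datapoints are all above the threshold."""
    recent = datapoints[-evaluation_periods:]
    return (len(recent) == evaluation_periods
            and all(value > threshold for value in recent))


# Assumed usage with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **cpu_alarm_params(instance_id, sns_topic_arn))
```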

    In addition, up to 5000 alarms can be created per region per account, and alarm history is preserved for two weeks.

    CloudWatch Dashboards

    CloudWatch provides visibility into resource utilisation, operational health and application performance.  Customised dashboards can be created to display multiple metrics and can be annotated with text and images.  Multiple dashboards can be created, including global dashboards that pull data from multiple regions.

    Dashboards can be created using the AWS console or the AWS CLI and can be integrated into Infrastructure as Code tools such as AWS CloudFormation.

    Text and graph widgets are added to the dashboard to display the metrics depending on the customer’s preference.  As an example, a graph widget can be created that can show CPU utilisation for selected EC2 instances.
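
    As a sketch of what such a dashboard looks like when defined in code, the snippet below builds a dashboard body with one text widget and one graph widget showing CPU utilisation for a list of instances.  The function name, the title and the instance IDs and region you pass in are illustrative assumptions; the layout values are just one reasonable choice.

```python
import json


def cpu_dashboard_body(instance_ids: list, region: str,
                       title: str = "EC2 CPU utilisation") -> str:
    """Return a JSON dashboard body with a heading and a CPU graph."""
    text_widget = {
        "type": "text",
        "x": 0, "y": 0, "width": 24, "height": 2,
        "properties": {"markdown": f"# {title}"},
    }
    graph_widget = {
        "type": "metric",
        "x": 0, "y": 2, "width": 24, "height": 6,
        "properties": {
            # One line on the graph per selected instance.
            "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", i]
                        for i in instance_ids],
            "period": 300,
            "stat": "Average",
            "region": region,
            "title": title,
            "view": "timeSeries",
        },
    }
    return json.dumps({"widgets": [text_widget, graph_widget]})


# Assumed usage with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").put_dashboard(
#       DashboardName="ec2-cpu",
#       DashboardBody=cpu_dashboard_body(instance_ids, "eu-west-1"))
```

    Because the body is plain JSON, the same structure can be kept in source control alongside the rest of the infrastructure definition.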

    New and existing customers can create up to 3 dashboards using up to 50 metrics at no additional cost; beyond that, each additional dashboard comes at a monthly cost.

    CloudWatch Logs

    CloudWatch Logs can be used to collect application log files and monitor them in near real-time.  Specific phrases, values or patterns can be matched in the logs; for instance, an alarm can be configured on the number of errors in the system log.
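
    One way to count errors like this is a metric filter, which turns matching log events into a metric that an alarm can then watch.  The sketch below builds the arguments for a `put_metric_filter` request; the filter name, metric name and `MyApp` namespace are hypothetical, and `count_matches` is only a rough local approximation of the matching, not CloudWatch's actual pattern engine.

```python
def error_metric_filter_params(log_group: str,
                               namespace: str = "MyApp") -> dict:
    """Build keyword arguments for a CloudWatch Logs put_metric_filter call."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",
        "filterPattern": "ERROR",  # match events containing the term ERROR
        "metricTransformations": [{
            "metricName": "ErrorCount",
            "metricNamespace": namespace,
            "metricValue": "1",    # add 1 to the metric per matching event
        }],
    }


def count_matches(lines: list, term: str = "ERROR") -> int:
    """Local approximation: count log lines that contain the term."""
    return sum(1 for line in lines if term in line)


# Assumed usage with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("logs").put_metric_filter(
#       **error_metric_filter_params("/var/log/messages"))
```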

    Logs can be viewed to troubleshoot any issues and can be stored for as long as required on low-cost storage.

    Further AWS services can be used in conjunction with CloudWatch, such as CloudTrail, AWS Identity and Access Management (IAM) and AWS Lambda.

    To collect logs from EC2 instances or on-premises servers, a CloudWatch Logs agent is required; however, the same agent used to collect CloudWatch metrics can also be used to collect logs.
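
    To show how one agent can cover both jobs, here is a minimal sketch that generates a CloudWatch agent configuration with a metrics section and a logs section.  The function name is made up, the log file path and log group name you pass in are your own choices, and the memory and disk measurement names shown assume a Linux host.

```python
import json


def agent_config(log_file: str, log_group: str) -> dict:
    """Build a small CloudWatch agent configuration as a dictionary."""
    return {
        "metrics": {
            "metrics_collected": {
                # Metrics that are not available without the agent.
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            }
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [{
                        "file_path": log_file,
                        "log_group_name": log_group,
                        # Placeholder the agent replaces with the instance ID.
                        "log_stream_name": "{instance_id}",
                    }]
                }
            }
        },
    }


def agent_config_json(log_file: str, log_group: str) -> str:
    """Serialise the configuration so it can be written to the agent's file."""
    return json.dumps(agent_config(log_file, log_group), indent=2)
```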

    Conclusion

    As with most AWS services, Amazon CloudWatch has evolved and grown into a large service over the last few years.  CloudWatch can be used to monitor metrics on a wide range of AWS services, with the ability to create custom metrics where required.  Combined with notification alarms and automated responses, CloudWatch can become a very powerful tool for the administrator.

    As with all AWS services, CloudWatch can be consumed by code and can be integrated into existing Infrastructure as Code tools and application deployment methods, which is critical to the operations team.

    Additionally, CloudWatch can be integrated into existing tools such as Splunk, giving the operations team further insight into the cloud.

    If you are interested in finding out more about the topics covered within this post or would like to discuss them with an independent Insight solutions specialist, please contact your Insight account manager or get in touch via our contact form.
     
