Building Modern Python Applications on AWS - week five

These are the week five course notes for Building Modern Python Applications on AWS on Coursera.

AWS X-Ray Terminology

AWS X-Ray is a service that collects data about requests that your application serves, and provides tools you can use to view, filter, and gain insights into that data to identify issues and opportunities for optimization. For any traced request to your application, you can see detailed information not only about the request and response, but also about calls that your application makes to downstream AWS resources, microservices, databases and HTTP web APIs.

For general information on AWS X-Ray, see: https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html

Segments

The compute resources running your application logic send data about their work as segments. A segment provides the resource’s name, details about the request, and details about the work done.

Subsegments

A segment can break down the data about the work done into subsegments. Subsegments provide more granular timing information and details about downstream calls that your application made to fulfill the original request. A subsegment can contain additional details about a call to an AWS service, an external HTTP API, or an SQL database. You can even define arbitrary subsegments to instrument specific functions or lines of code in your application.
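To make the idea of an arbitrary subsegment concrete, here is a minimal standard-library sketch of timing a block of code and attaching the result to a segment. It only imitates the shape of the data. In a real application the X-Ray SDK for Python creates subsegments for you, so the subsegment helper, the "orders-api" segment, and the "load_order" name are all hypothetical.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def subsegment(segment, name):
    """Time a block of code and attach the result to `segment` as a subsegment."""
    record = {
        "id": os.urandom(8).hex(),  # 16 hex digits, like an X-Ray segment ID
        "name": name,
        "start_time": time.time(),
    }
    try:
        yield record
    except Exception:
        record["fault"] = True  # flag the failed work, then re-raise
        raise
    finally:
        record["end_time"] = time.time()
        segment.setdefault("subsegments", []).append(record)


# Hypothetical, deliberately incomplete segment used only for illustration.
segment = {"name": "orders-api"}

with subsegment(segment, "load_order"):
    total = sum(range(1000))  # stand-in for the work being measured
```

Each subsegment carries its own start and end time, which is what gives the more granular timing described above.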

Service graph

X-Ray uses the data that your application sends to generate a service graph. Each AWS resource that sends data to X-Ray appears as a service in the graph. Edges connect the services that work together to serve requests. Edges connect clients to your application, and your application to the downstream services and resources that it uses.

Traces

A trace ID tracks the path of a request through your application. A trace collects all the segments generated by a single request. That request is typically an HTTP GET or POST request that travels through a load balancer, hits your application code, and generates downstream calls to other AWS services or external web APIs. The first supported service that the HTTP request interacts with adds a trace ID header to the request, and propagates it downstream to track the latency, disposition, and other request data.
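As an illustration of what a trace ID and the tracing header look like, the sketch below builds a trace ID in the format described in the X-Ray documentation (a version number, the request time as 8 hexadecimal digits, and a 24-digit random identifier) and parses an X-Amzn-Trace-Id header value. The function names are made up for this example. In practice the first supported service or the X-Ray SDK generates these values for you.

```python
import os
import time


def new_trace_id(now=None):
    """Build a trace ID of the form 1-<8 hex digits of epoch seconds>-<24 hex digits>."""
    epoch_seconds = int(time.time() if now is None else now)
    return f"1-{epoch_seconds:08x}-{os.urandom(12).hex()}"


def parse_trace_header(value):
    """Split an X-Amzn-Trace-Id header value into a dict of its fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            fields[key] = val
    return fields


header = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
fields = parse_trace_header(header)
print(fields["Root"])     # the trace ID shared by every segment in the trace
print(fields["Sampled"])  # "1" when the request was selected for tracing
print(new_trace_id())
```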

Annotations and metadata

When you instrument your application, the X-Ray SDK records information about incoming and outgoing requests, the AWS resources used, and the application itself. You can add other information to the segment document as annotations and metadata. Annotations and metadata are aggregated at the trace level, and can be added to any segment or subsegment.

Annotations are simple key-value pairs that are indexed for use with filter expressions. Use annotations to record data that you want to use to group traces in the console, or when calling the GetTraceSummaries API.

X-Ray indexes up to 50 annotations per trace.

Metadata are key-value pairs with values of any type, including objects and lists, but that are not indexed. Use metadata to record data you want to store in the trace but don’t need to use for searching traces.
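The difference between the two is easier to see in a segment document. The following is a minimal sketch, using only the standard library, of a segment that carries both annotations and metadata. The build_segment helper, the trace ID, and all keys and values are assumptions chosen for illustration. With the X-Ray SDK for Python you would normally record them through the recorder rather than build the document by hand.

```python
import json
import os
import time

# Annotation values are limited to simple types so that X-Ray can index them.
ANNOTATION_TYPES = (str, int, float, bool)


def build_segment(name, trace_id, annotations=None, metadata=None):
    """Return a segment document (a dict) with optional annotations and metadata."""
    for key, value in (annotations or {}).items():
        if not isinstance(value, ANNOTATION_TYPES):
            raise TypeError(f"annotation {key!r} must be a string, number, or boolean")
    start = time.time()
    segment = {
        "name": name,
        "id": os.urandom(8).hex(),
        "trace_id": trace_id,
        "start_time": start,
        "end_time": time.time(),
    }
    if annotations:
        segment["annotations"] = dict(annotations)  # indexed, usable in filter expressions
    if metadata:
        segment["metadata"] = dict(metadata)  # stored with the trace, not indexed
    return segment


doc = build_segment(
    "orders-api",  # hypothetical service name
    "1-5759e988-bd862e3fe1be46a994272793",  # example trace ID
    annotations={"customer_tier": "gold", "cart_items": 3},
    metadata={"default": {"cart": [{"sku": "A-1", "qty": 2}]}},
)
print(json.dumps(doc, indent=2))
```

Putting customer_tier in annotations makes it searchable, while the full cart goes in metadata because it is useful for debugging but not for finding traces.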

The X-Ray Daemon

The AWS X-Ray daemon is a software application that listens for traffic on UDP port 2000, gathers raw segment data, and relays it to the AWS X-Ray API. The daemon works in conjunction with the AWS X-Ray SDKs and must be running so that data sent by the SDKs can reach the X-Ray service.

When using AWS X-Ray, your code doesn’t upload traces directly to the X-Ray service. Instead, it sends segment data to the X-Ray daemon, which then uploads it in batches to the X-Ray service.
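To show what "sending to the daemon" means at the wire level, here is a small sketch based on the daemon message format described in the X-Ray developer guide: a one-line JSON header, a newline, then the segment document, sent as a single UDP datagram. The function names and the sample segment are assumptions. The X-Ray SDK normally does this for you, so treat this as an illustration rather than something to ship.

```python
import json
import socket

# Header line that precedes every segment document sent to the daemon.
DAEMON_HEADER = json.dumps({"format": "json", "version": 1})


def encode_for_daemon(segment):
    """Serialize a segment document into the datagram payload the daemon expects."""
    return (DAEMON_HEADER + "\n" + json.dumps(segment)).encode("utf-8")


def send_to_daemon(segment, host="127.0.0.1", port=2000):
    """Send one segment to a daemon listening on UDP port 2000 (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_for_daemon(segment), (host, port))


# Hypothetical segment used only to show the payload layout.
sample = {
    "name": "orders-api",
    "id": "70de5b6f19ff9a0a",
    "trace_id": "1-5759e988-bd862e3fe1be46a994272793",
    "start_time": 1465510280.0,
    "end_time": 1465510280.5,
}
payload = encode_for_daemon(sample)
print(payload.decode("utf-8"))
```

Because UDP is connectionless, send_to_daemon does not wait for a reply, and if no daemon is running the data is simply lost. That is why the daemon must be running for trace data to reach X-Ray.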

If you are running code on AWS Lambda, the X-Ray daemon is installed and managed for you, so you can start using the X-Ray SDK right away. If you are running X-Ray outside of Lambda, you may need to install and manage the daemon yourself. For more information on installing and setting up the X-Ray daemon, see: https://docs.aws.amazon.com/xray/latest/devguide/xray-daemon.html

How to use X-Ray in Python Applications

The X-Ray SDK for Python is a library for Python web applications that provides classes and methods for generating and sending trace data to the X-Ray daemon. Trace data includes information about incoming HTTP requests served by the application, and calls that the application makes to downstream services using the AWS SDK, HTTP clients, or an SQL database connector. You can also create segments manually and add debug information in annotations and metadata.

For more information on the AWS X-Ray SDK for Python, see: https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python.html

CloudWatch Logs Terminology

It’s helpful to remember CloudWatch Logs terminology when working with your logs in AWS.

The terminology and concepts that are central to your understanding and use of CloudWatch Logs are described below.

Log events

A log event is a record of some activity recorded by the application or resource being monitored. The log event record that CloudWatch Logs understands contains two properties: the timestamp of when the event occurred, and the raw event message. Event messages must be UTF-8 encoded.
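As a rough sketch of those two properties, the code below builds log events as plain dictionaries in the shape that the CloudWatch Logs PutLogEvents API accepts, with the timestamp in milliseconds since the Unix epoch. The helper names are made up for this example, and nothing here calls AWS.

```python
import time


def make_log_event(message, timestamp_ms=None):
    """Return a log event: a millisecond timestamp plus the raw message."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    message.encode("utf-8")  # raises UnicodeEncodeError if the text cannot be UTF-8 encoded
    return {"timestamp": timestamp_ms, "message": message}


def make_batch(events):
    """Order events by timestamp, oldest first, before uploading them together."""
    return sorted(events, key=lambda event: event["timestamp"])


batch = make_batch([
    make_log_event("order 42 shipped", timestamp_ms=1_700_000_002_000),
    make_log_event("order 42 received", timestamp_ms=1_700_000_001_000),
])
for event in batch:
    print(event["timestamp"], event["message"])
```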

Log streams

A log stream is a sequence of log events that share the same source. More specifically, a log stream is generally intended to represent the sequence of events coming from the application instance or resource being monitored.

Log groups

Log groups define groups of log streams that share the same retention, monitoring, and access control settings. Each log stream has to belong to one log group.

Metric filters

You can use metric filters to extract metric observations from ingested events and transform them into data points in a CloudWatch metric. Metric filters are assigned to log groups, and all of the filters assigned to a log group are applied to its log streams.
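For example, a metric filter that counts error messages could be described with the parameters below. This is a sketch under assumptions: the log group name, filter name, metric name, and namespace are all hypothetical, and the dictionary is only built here, not sent to AWS. With boto3 installed and credentials configured, it could be passed to the CloudWatch Logs client as put_metric_filter(**params).

```python
def error_count_filter(log_group_name, namespace="MyApp"):
    """Build parameters for a metric filter that counts events containing ERROR."""
    return {
        "logGroupName": log_group_name,
        "filterName": "error-count",  # hypothetical filter name
        "filterPattern": "ERROR",     # matches events that contain the term ERROR
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": namespace,
                "metricValue": "1",    # each matching event adds 1 to the metric
                "defaultValue": 0.0,   # reported when no events match
            }
        ],
    }


params = error_count_filter("/aws/lambda/my-function")  # hypothetical log group
print(params)
```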

Retention settings

Retention settings specify how long log events are kept in CloudWatch Logs. Expired log events are deleted automatically. Just like metric filters, retention settings are assigned to log groups, and the retention assigned to a log group is applied to its log streams.

AWS Lambda and CloudWatch Logs

Configuring Lambda to send logs to CloudWatch Logs isn’t a difficult task. In fact, the only step is to make sure that Lambda is allowed to create a log group, create a log stream, and put log events. This can easily be done by using the IAM managed policy “AWSLambdaBasicExecutionRole”. Then, it’s only a matter of using the print function or any logging library that writes to stdout or stderr in your code. You can find more information on how to use it here:

https://docs.aws.amazon.com/lambda/latest/dg/python-logging.html
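A minimal sketch of a handler that logs both ways is shown below. The handler name, the event contents, and the response body are assumptions for illustration. The code uses only the standard library, so it also runs locally, where the output goes to your terminal instead of CloudWatch Logs.

```python
import json
import logging

# Use the root logger and make sure INFO messages are not filtered out.
logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Log the incoming event in two ways and return a simple response."""
    print("Received an event")  # plain stdout output
    logger.info("Event keys: %s", sorted(event))  # leveled output via the logging module
    return {"statusCode": 200, "body": json.dumps({"ok": True})}


# Local usage example with a made-up event. Lambda supplies a real context object.
response = lambda_handler({"path": "/orders", "httpMethod": "GET"}, None)
print(response["statusCode"])
```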

Amazon API Gateway and CloudWatch Logs

Configuring API Gateway to send logs to CloudWatch isn’t difficult either. You configure this at the stage level.

There are two types of API logging in CloudWatch: execution logging and access logging. In execution logging, API Gateway manages the CloudWatch Logs resources for you: it creates the log groups and log streams, and writes each caller’s requests and responses to those log streams.

The data logged by execution logging includes errors or execution traces (such as request or response parameter values or payloads), data used by Lambda authorizers (formerly known as custom authorizers), whether API keys are required, whether usage plans are enabled, and so on.

In access logging, you, as the API developer, record who accessed your API and how the caller accessed it, in a log group that you choose.
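As a sketch of what the stage-level configuration might contain, the code below builds a JSON access log format from API Gateway $context variables and a list of patch operations for a stage. The patch paths follow the pattern used in the API Gateway documentation for update-stage and should be checked against it before use. The function name and the log group ARN are placeholders, and nothing here calls AWS.

```python
import json

# JSON access log format: API Gateway replaces each $context variable per request.
ACCESS_LOG_FORMAT = json.dumps({
    "requestId": "$context.requestId",
    "ip": "$context.identity.sourceIp",
    "requestTime": "$context.requestTime",
    "httpMethod": "$context.httpMethod",
    "resourcePath": "$context.resourcePath",
    "status": "$context.status",
    "protocol": "$context.protocol",
    "responseLength": "$context.responseLength",
})


def logging_patch_operations(log_group_arn, level="INFO"):
    """Build stage patch operations that turn on execution and access logging."""
    return [
        # Execution logging level for all resources and methods in the stage.
        {"op": "replace", "path": "/*/*/logging/loglevel", "value": level},
        # Access logging: destination log group and log line format.
        {"op": "replace", "path": "/accessLogSettings/destinationArn", "value": log_group_arn},
        {"op": "replace", "path": "/accessLogSettings/format", "value": ACCESS_LOG_FORMAT},
    ]


# Placeholder ARN, not a real resource.
operations = logging_patch_operations(
    "arn:aws:logs:us-east-1:123456789012:log-group:my-api-access-logs"
)
for operation in operations:
    print(operation["path"])
```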

Read more about logging and API Gateway here: https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-logging.html