Observability: Optimize workloads with Universal Profiling
Overview
Introduction to Elastic Observability
Get more familiar with Elastic Observability and get an overview of how to ingest, view, and analyze customer logs from your applications using Elastic Cloud. Learn how you can modernize applications and adopt the cloud with confidence.
Let's get started
Create an Elastic Cloud account
Once you go to cloud.elastic.co and create an account, follow this video to learn how to launch your first Elastic stack in any one of our 50+ supported regions globally.
Once your deployment is complete, under the Observability tab, select Optimize my workloads with Universal Profiling.
Now you’ll be prompted to add your data to get started. Select Set up Universal Profiling.
If this is your first time using the Universal Profiling Agent, you'll be prompted to set it up. Simply follow the instructions below.
Below is an example of running the setup commands in a Microsoft Azure AKS cluster.
Once data begins to show, navigate to Stacktraces under Universal Profiling in the left menu. Viewing the stack traces is about seeing what's consuming the most time. Hover your mouse cursor over the chart to see the wave pattern for the individual threads.
The Stacktraces view shows stacktrace graphs grouped by threads, hosts, Kubernetes deployments, and containers. Use it to detect unexpected CPU spikes across threads, then drill down into a smaller time range and investigate further with a flamegraph.
You'll start seeing data in about 3 minutes or less. Check out this blog for more information on how to read stack traces.
Working with Elastic Observability
Analyze Flamegraphs
Next, navigate to Flamegraphs under Universal Profiling in the left menu. For many users, profiling is synonymous with flamegraphs. Reading from left to right, a flamegraph shows which code or functions are the most expensive.
The flamegraph page is where you will most likely spend the most time, especially when debugging and optimizing. We recommend that you use this blog to identify performance bottlenecks and optimization opportunities with flamegraphs. The three key elements to look for are width, hierarchy, and height.
- Scan horizontally from left to right, focusing on width for CPU-intensive functions.
- Scan vertically to examine the stack and spot bottlenecks.
- Look for towering stacks to identify potential complexities in the code.
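To make width and height concrete, here is a minimal Python sketch of how sampled stack traces can be aggregated into flamegraph frames. It is an illustration only, not how Universal Profiling is implemented; the sample data and function names (`main`, `handle_request`, and so on) are hypothetical.

```python
from collections import defaultdict

# Hypothetical sampled stack traces, root frame first, with the number of
# times each trace was observed. The function names are made up.
samples = [
    (("main", "handle_request", "parse_json"), 40),
    (("main", "handle_request", "render"), 25),
    (("main", "gc"), 10),
]


def build_frames(samples):
    """Return {call_path: sample_count} for every prefix of every stack."""
    frames = defaultdict(int)
    for stack, count in samples:
        for depth in range(1, len(stack) + 1):
            frames[stack[:depth]] += count
    return dict(frames)


frames = build_frames(samples)
total = sum(count for _, count in samples)

# Width: the share of all samples in which a frame is on the stack.
# Sorting the paths lists each parent before its children (hierarchy).
for path in sorted(frames):
    indent = "  " * (len(path) - 1)
    print(f"{indent}{path[-1]}: {frames[path] / total:.0%}")

# Height: the deepest call path seen in the window.
max_depth = max(len(path) for path in frames)
print("max depth:", max_depth)
```

A wide frame such as `handle_request` is on the stack in most samples, so it is the first place to look. The indentation mirrors the vertical hierarchy you follow in the UI.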
To start exploring, we recommend limiting the view to a specific thread, host, deployment, or container. Simply enter it in the search bar.
NOTE: Elastic Universal Profiling is the only continuous profiling solution in the industry that provides mixed-language visibility from the kernel to native code to the high-level programming languages without requiring debug symbols on the host.
As you analyze the graph, note that the longer a line is, the more CPU time it consumes. If you select one of the lines, a flyout opens with more details. Function is the line of code that was executing at the time; you'll also see other key details such as Total CPU, Annualized CO2, and Annualized dollar cost.
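The annualized figures in the flyout are extrapolations from the selected time range. The Python sketch below shows the general idea of such an extrapolation. It is not Elastic's actual formula, and the sampling rate, price, and emission factor are placeholder assumptions chosen only for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annualized_core_hours(sample_count, samples_per_second, window_seconds):
    """Extrapolate CPU time observed in a time window to a full year."""
    core_seconds = sample_count / samples_per_second
    scale = SECONDS_PER_YEAR / window_seconds
    return core_seconds * scale / 3600


# All values below are placeholders, not Elastic defaults.
core_hours = annualized_core_hours(
    sample_count=1200,      # samples attributed to one function
    samples_per_second=20,  # assumed sampling rate per core
    window_seconds=3600,    # a one-hour time range
)
dollars_per_core_hour = 0.05  # hypothetical price
kg_co2_per_core_hour = 0.002  # hypothetical emission factor

print(f"{core_hours:.1f} core-hours/year")
print(f"${core_hours * dollars_per_core_hour:.2f}/year")
print(f"{core_hours * kg_co2_per_core_hour:.3f} kg CO2/year")
```

Because the result scales linearly with the sample count, a function that looks cheap in a short window can still add up to a noticeable yearly cost.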
Compare code before and after changes
Differential flamegraphs allow you to compare code before and after changes, before pushing to production. Teal represents improvement and red represents regression.
In the image below, the color shows that the optimized container performs better.
If you select the dropdown arrow by Gained overall performance you can see the overall improvement values.
Next, select the Swap sides icon (the icon between the containers being compared, with arrows pointing in opposite directions). You'll see that reverting to the container's code from before the optimization results in a regression.
If you select the dropdown arrow by Lost overall performance you can see the overall regression values.
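The comparison logic behind a differential view can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: both sides cover time ranges of equal length under similar load, and the function names and sample counts are hypothetical. A real tool would also normalize for different window lengths.

```python
def diff_frames(baseline, comparison):
    """Return {function: (delta, label)} for two sample-count mappings."""
    result = {}
    for name in sorted(set(baseline) | set(comparison)):
        delta = comparison.get(name, 0) - baseline.get(name, 0)
        if delta < 0:
            label = "improvement"  # fewer samples than the baseline
        elif delta > 0:
            label = "regression"   # more samples than the baseline
        else:
            label = "unchanged"
        result[name] = (delta, label)
    return result


# Hypothetical per-function sample counts before and after a code change.
before = {"parse_json": 60, "render": 30, "gc": 10}
after = {"parse_json": 30, "render": 30, "gc": 10, "cache_lookup": 5}

forward = diff_frames(before, after)
swapped = diff_frames(after, before)  # same effect as swapping sides

for name, (delta, label) in forward.items():
    print(f"{name}: {delta:+d} samples ({label})")
```

Swapping the two arguments flips every sign, which is why the same pair of containers reads as an improvement in one direction and a regression in the other.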
Next, if you select Go to monitor, you'll immediately get some high-level insight. These charts will start to render as more tests come through, but you can quickly see the availability, the duration to execute tests, and the timeline, and you can also drill into the waterfall chart. To drill in, click the icon under View test run.
Next steps
Thanks for taking the time to collect and analyze logs with Elastic Cloud. If you're new to Elastic, be sure to spin up a free 14-day trial.
Also, as you begin your journey with Elastic, understand some operational, security, and data components you should manage as a user when you deploy across your environment.
Observability resources
- Explore observability demo gallery
- Get started with collecting and analyzing your logs
- Get started with monitoring your application performance (APM/tracing)
- Get started with monitoring your hosts
- Get started with monitoring Kubernetes clusters
- Get started with creating a synthetic monitor