The Top 5 Metrics You Need to Understand the Health of Your Applications

  • August 11, 2021

You’ve made an application, and you’re ready to launch. But do you know how healthy it is? You may think that you have followed all the processes and procedures required to keep your new application safe and secure. You may be monitoring your system daily and think that is more than enough to keep your infrastructure and applications healthy. However, if you have not built Observability into your system from the beginning, you risk missing serious weaknesses in your application.


Observability is what allows you to truly understand what is going on inside your system. It is a critical property of a system: it externalizes the system’s internal state, so you can tell what is happening, and why, across your cloud infrastructure and applications from end to end. It helps you detect problems early, before your users notice them. Previously, we discussed the benefits of Observability for your infrastructure, such as detecting internal latency, too many locks on the system, service degradation, bugs, authorized and unauthorized activity, and more. These benefits extend to your applications when you implement Observability for SaaS.


The Top 5 Metrics

Here are the top five metrics you should be using to assess the health of your applications. 


1. Response Time: 

HTTP is the most widely used protocol on the web, and it’s based on a request-response pattern. Your application will almost certainly serve or make HTTP requests. Response time is critical: your users will notice when it lags. Measuring the response time of your application, especially at the edge, where it is closest to what users actually experience, will help you understand how your users perceive it.
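
To make this concrete, here is a minimal Python sketch of how response times are often summarized. It assumes you already collect per-request durations in milliseconds; the sample values and the `percentile` helper are made up for illustration, and the nearest-rank method is just one common way to compute a percentile.

```python
import math

def percentile(samples, pct):
    """Return the nearest-rank percentile (0 < pct <= 100) of the samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times in milliseconds, including one slow request.
response_times_ms = [120, 95, 110, 2300, 105, 98, 130, 101, 99, 115]

p50 = percentile(response_times_ms, 50)  # 105: the typical request looks fine
p95 = percentile(response_times_ms, 95)  # 2300: the slow tail users complain about
print(f"p50={p50} ms, p95={p95} ms")
```

Looking at a high percentile next to the median matters because an average or median can look healthy while a meaningful share of users still waits seconds for a response.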



2. HTTP 5xx Errors: 

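A 5xx status code means the server failed to fulfill a request. This usually indicates a problem on the server side, such as an application crash, an unhandled exception, or an overloaded or unavailable backend. A low number of errors may be acceptable, but it’s a metric to pay attention to and to diagnose when it goes up.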
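
As a rough illustration, the sketch below computes the share of requests that ended in a 5xx status. The list of status codes and the 5% alert threshold are hypothetical; in practice the codes would come from your access logs or load balancer metrics, and the right threshold depends on your application.

```python
def server_error_rate(status_codes):
    """Return the fraction of responses whose status code is in the 5xx range."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)

# Hypothetical status codes observed during one monitoring interval.
statuses = [200, 200, 503, 404, 200, 500, 200, 201, 200, 200]

rate = server_error_rate(statuses)
ALERT_THRESHOLD = 0.05  # assumed value, tune it for your own traffic

if rate > ALERT_THRESHOLD:
    print(f"5xx rate {rate:.0%} is above the threshold, time to investigate")
```

Note that the 404 is not counted: 4xx codes point to a problem with the request, while 5xx codes point to the server.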



3. Error Strings Count in Error Logs: 

Applications will log different strings. You may get error strings for a variety of reasons. It’s important to identify those strings, detect patterns, and count when they appear in the application logs. Too many errors mean you need to start debugging what’s going on.
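
One way to sketch this in Python is shown below. The log format (a timestamp, a level, then a message) and the sample lines are assumptions made for illustration, not the output of any particular logging library. Numbers inside each message are replaced with a placeholder so that errors differing only in an ID or a duration are counted as the same pattern.

```python
import re
from collections import Counter

# Assumed log format: "<date> <time> <LEVEL> <message>".
ERROR_LINE = re.compile(r"\bERROR\b\s+(.*)")
NUMBER = re.compile(r"\d+")

def count_error_patterns(lines):
    """Count ERROR messages, grouping those that differ only in numbers."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            pattern = NUMBER.sub("<n>", match.group(1).strip())
            counts[pattern] += 1
    return counts

# Made-up log lines for illustration.
sample_log = [
    "2021-08-11 10:00:01 INFO request served in 120 ms",
    "2021-08-11 10:00:02 ERROR timeout calling payments after 3000 ms",
    "2021-08-11 10:00:05 ERROR timeout calling payments after 3012 ms",
    "2021-08-11 10:00:09 ERROR user 42 not found",
]

error_counts = count_error_patterns(sample_log)
for pattern, count in error_counts.most_common():
    print(count, pattern)
```

Sorting by count puts the most frequent pattern first, which is usually the best place to start debugging.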


4. Restarts: 

These days, many applications run in containers on top of a container orchestrator like Kubernetes. The orchestrator takes care of restarting applications when they crash, but you should still count these events. If restarts happen very often or are increasing in frequency, that may signal an issue you need to investigate and fix.
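
The idea of "increasing in frequency" can be sketched as a comparison between two consecutive time windows. The example assumes you have already collected restart timestamps (in seconds) from your orchestrator's events; the function names, the one-hour window, and the sample timestamps are all hypothetical.

```python
def restarts_in_window(restart_times, start, end):
    """Count restarts with a timestamp in the half-open interval (start, end]."""
    return sum(1 for t in restart_times if start < t <= end)

def restart_rate_increasing(restart_times, now, window_seconds=3600):
    """Return True if the latest window saw more restarts than the one before it."""
    recent = restarts_in_window(restart_times, now - window_seconds, now)
    previous = restarts_in_window(
        restart_times, now - 2 * window_seconds, now - window_seconds
    )
    return recent > previous

# Made-up restart timestamps, in seconds since monitoring began.
restart_times = [100, 4000, 5000, 6500, 7000]
now = 7200

recent_count = restarts_in_window(restart_times, now - 3600, now)
increasing = restart_rate_increasing(restart_times, now)
print(f"{recent_count} restarts in the last hour, increasing: {increasing}")
```

A simple window comparison like this is noisy for small counts, so it works best as a prompt to look closer rather than as an automatic alarm.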


5. CPU and Memory: 

It is essential to monitor CPU and memory consumption continuously. They are two of the most basic indicators of potential issues: sustained high CPU usage or steadily growing memory often shows up before users notice anything.
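
In production these numbers usually come from the orchestrator or a monitoring agent, but the idea can be sketched with the Python standard library alone. The example below measures the CPU time and peak memory of a single piece of work inside the current process. The `measure_usage` helper and the workload are hypothetical, and `tracemalloc` only tracks memory allocated by Python itself, so treat this as an illustration rather than a replacement for system-level metrics.

```python
import time
import tracemalloc

def measure_usage(func):
    """Run func once and report its CPU time, wall time, and peak Python memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "cpu_seconds": cpu_seconds,
        "wall_seconds": wall_seconds,
        "peak_bytes": peak_bytes,
    }

def workload():
    # Hypothetical work: build a list and sum it.
    return sum([i * i for i in range(100_000)])

usage = measure_usage(workload)
print(usage)
```

Sampling values like these at a regular interval and sending them to your dashboard gives you the trend over time, which is more useful than any single reading.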

Once you know which metrics you need to track to implement Observability, you can use a dashboard tool to visualize them. It is important to log this information and keep it in a centralized location, so you can track and identify problems as they arise. Once you have these metrics in place, you will be able to determine how healthy your application really is and ensure it is operating at its full potential.


Written by: Gabriel Vasquez
General corrections and editing: Diego Woitasen