Observability in a Microservices World
A high-level overview of observability patterns in microservices and how observability differs from monitoring
Definition
Observability is the practice of understanding what happens in a distributed system by collecting, measuring and analyzing the signals its services emit. A single microservice isn’t difficult to debug: we can understand the requests flowing through it by looking at its logs. But when multiple microservices interact with each other, it becomes tedious to understand how a request flows through the system and, if there is an issue, how to debug it.
Pillars of Observability
Before diving into patterns, let’s quickly go over the main pillars of observability:
Logs: These give a detailed record of what’s going on inside a service as well as across the entire system.
Tracing: This helps in identifying what happened to a request by “tracing” it via a request ID (or any other identifier).
Metrics: These help in understanding what’s happening in the entire system at a macro scale.
Observability Patterns
Logging
Whenever an issue is reported with an app or service, it’s crucial to understand what was going on at the time. The best way to find out is to make sure the app writes down what it was doing. This is called logging.
Since we are in a microservices world, it becomes important to follow standard practices around logging such as using a request id that spans across several services so that it becomes easier to trace where the actual issue was.
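To make the request ID idea concrete, here is a minimal sketch using only the Python standard library. The names request_id_var, RequestIdFilter, build_logger and handle_request are hypothetical and chosen for illustration; a real service would usually read the ID from an incoming header and pass it along on every downstream call.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled ("-" when there is none).
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def build_logger(name, stream):
    """Create a logger whose lines always include the request ID."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(name)s request_id=%(request_id)s %(message)s")
    )
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, incoming_request_id=None):
    """Reuse the caller's request ID if one was sent, otherwise mint a new one."""
    request_id = incoming_request_id or uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        logger.info("order received")
        # The returned ID is what would be forwarded to downstream services.
        return request_id
    finally:
        request_id_var.reset(token)
```

If every service logs the same ID for the same request, searching the combined logs for that one ID reconstructs the request’s path across services.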
Monitoring
Apart from logging what’s going on inside a service, it also becomes crucial to understand how other dependencies are behaving by collecting metrics such as CPU utilization, memory utilization, DB read/write capacity etc. These help in identifying the overall application health.
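As a rough illustration of collecting metrics, the sketch below keeps counters and gauges in memory and records the CPU time a piece of work consumed. MetricsRegistry and sample_process_cpu are hypothetical names, not part of any library; a production setup would typically export such values to a dedicated metrics backend rather than hold them in process.

```python
import time
from collections import defaultdict


class MetricsRegistry:
    """A tiny in-memory store for counters and gauges."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.gauges = {}

    def increment(self, name, amount=1):
        # Counters only ever go up, e.g. the total number of calls served.
        self.counters[name] += amount

    def set_gauge(self, name, value):
        # Gauges hold the latest observed value, e.g. CPU time of the last call.
        self.gauges[name] = value

    def snapshot(self):
        """Return a plain-dict copy suitable for shipping to a collector."""
        return {"counters": dict(self.counters), "gauges": dict(self.gauges)}


def sample_process_cpu(registry, work):
    """Run work() and record how much process CPU time it consumed."""
    started = time.process_time()
    result = work()
    registry.set_gauge("cpu_seconds_last_call", time.process_time() - started)
    registry.increment("calls_total")
    return result
```

Calling sample_process_cpu(registry, some_function) around a handler and reading registry.snapshot() periodically gives a simple view of how much work the service is doing.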
Alerts
To react to problems in your service, you need proper alerting mechanisms. Once logging is set up and feeds the appropriate logs into the system, monitoring can analyze the various metrics and logs. After that analysis, we can define rules on the data received and, if any of those rules is breached, trigger alarms or alerts to notify the developers.
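The rule-and-alert flow described above can be sketched as follows. This is a simplified illustration under stated assumptions: AlertRule, evaluate_rules and notify are hypothetical names, rules are plain upper thresholds, and send stands in for whatever channel actually reaches the developers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str         # human-readable rule name used in the alert message
    metric: str       # key of the metric this rule watches
    threshold: float  # the rule is breached when the value exceeds this


def evaluate_rules(rules, metrics):
    """Return the names of all rules whose metric is above its threshold."""
    breached = []
    for rule in rules:
        value = metrics.get(rule.metric)
        # A metric that was never reported cannot breach a threshold here.
        if value is not None and value > rule.threshold:
            breached.append(rule.name)
    return breached


def notify(breached, send):
    """Send one alert message per breached rule through the given channel."""
    for name in breached:
        send(f"ALERT: {name}")
```

For example, with a rule AlertRule("high_cpu", "cpu_percent", 80.0) and metrics {"cpu_percent": 93.5}, evaluate_rules reports high_cpu as breached and notify passes the message to send.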
How Is Monitoring Different from Observability?
During an issue in a microservices world, any one of these scenarios can be true:
We know about the issue and why it has happened. These are facts.
We know the issue but aren’t sure why it happened. These are hypotheses.
We don’t know the issue but we can figure out why it happened. These are assumptions.
We know neither the issue nor its solution. These are plain discoveries.
Monitoring helps in confirming our hypotheses, whereas observability helps us discover new issues. We monitor everything that we know can happen. Observability helps in identifying the unknowns that we are not even aware of.
Conclusion
Observability plays a crucial role in building a robust microservice architecture because it surfaces issues that we, as developers, might not be aware of. Strong monitoring is one of the things we can do to detect known problems, whereas detecting anything new requires stronger logging, monitoring and alerting systems working together.
(The contents of this blog are my personal opinion, informed by my own reading of a number of articles, and are in no way influenced by my employer.)