
Lecture 260: How to achieve full observability in a microservices system

Module 5. Spring
Level 20, Lesson 9

Now let's combine our tools (Zipkin, ELK, Prometheus, Grafana) to build a cohesive system for observing microservices.

Logging requests with ELK

The ELK stack (Elasticsearch, Logstash, Kibana) centralizes log collection and makes logs available for analysis. Each microservice sends its logs to Logstash, which parses them and forwards them to Elasticsearch for storage and indexing. Kibana then lets us search and visualize that data.

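Example logging configuration in Spring Boot. Keep in mind that Spring Boot itself defines no logging.logstash.* properties: the keys below take effect only if a custom starter or your own logback-spring.xml reads them. The usual way to ship logs is a Logback appender such as LogstashTcpSocketAppender from the logstash-logback-encoder library, pointed at the Logstash TCP port.


# application.yml
logging:
  file:
    name: logs/microservice.log # local log file, kept alongside the Logstash feed
  level:
    root: INFO
  # Not a built-in Spring Boot section: it must be read by a starter or by logback-spring.xml
  logstash:
    appender:
      enabled: true
    destination: localhost:5000 # Logstash TCP port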

On the Logstash side, the configuration looks like this:

input {
  tcp {
    port => 5000
    codec => json_lines
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
  }
}
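
To make the json_lines codec above more concrete: it expects one complete JSON object per line, terminated by a newline. The following is only a minimal sketch of that wire format in plain Java; the class name JsonLineLog and the field names service, level, and message are assumptions made for illustration, and in a real Spring Boot service a Logback encoder would normally produce these lines for you.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of any library): renders one log event as a single JSON line.
final class JsonLineLog {

    // Escapes characters that would break a JSON string or split the event across lines.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds a flat JSON object with string values, followed by exactly one newline.
    static String toJsonLine(Instant timestamp, String service, String level, String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("@timestamp", timestamp.toString());
        fields.put("service", service);   // assumed field name
        fields.put("level", level);       // assumed field name
        fields.put("message", message);

        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}\n").toString();
    }
}
```

Writing the result of toJsonLine(...) to a TCP socket connected to localhost:5000 would hand Logstash one event per line. Newlines inside the message are escaped so that a multi-line stack trace still arrives as a single event.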

And here's the magic: all logs from all services end up in one place! Errors, requests, even the first signs that a service is starting to have a "headache".


Request tracing with Zipkin and Spring Cloud Sleuth

Distributed tracing makes services chatty: they now tell you who called them and whom they called in turn. With Zipkin and Spring Cloud Sleuth, each request gets a trace made of spans: timed units of work (an incoming HTTP request, an outgoing call) that together show how a single request was processed.

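Example Sleuth and Zipkin configuration (these properties apply to Spring Cloud Sleuth on Spring Boot 2.x; on Spring Boot 3, Micrometer Tracing replaces Sleuth and uses different property names):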

spring:
  sleuth:
    sampler:
      probability: 1.0  # 100% of requests will be traced
  zipkin:
    base-url: http://localhost:9411 # Zipkin service URL
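
What the sampler probability means can be sketched in a few lines of plain Java. This is an illustration of the idea only, not Sleuth's own implementation; the class name ProbabilitySampler is hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration of probability-based sampling; not a Sleuth class.
final class ProbabilitySampler {
    private final double probability;

    ProbabilitySampler(double probability) {
        if (Double.isNaN(probability) || probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0.0, 1.0]");
        }
        this.probability = probability;
    }

    // Intended to be called once, when a new trace starts.
    boolean shouldTrace() {
        // nextDouble() is in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
        return ThreadLocalRandom.current().nextDouble() < probability;
    }
}
```

A value of 1.0 is convenient while learning because every request shows up in Zipkin. On a busy production system a lower value such as 0.1 is a common trade-off between visibility and overhead.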

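Example of tracing in code. The RestTemplate must be injected as a Spring bean (not created with new inside the controller) so that Sleuth can instrument it and pass the trace context on to service B:


import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
@RequestMapping("/serviceA")
public class ServiceAController {
    private final RestTemplate restTemplate;

    // Single constructor: Spring injects the RestTemplate bean, @Autowired is optional here
    public ServiceAController(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @GetMapping("/callB")
    public String callServiceB() {
        // Outgoing call to service B; Sleuth adds the tracing headers to this request
        String response = restTemplate.getForObject("http://localhost:8081/serviceB/hello", String.class);
        return "ServiceA -> " + response;
    }
}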

Now, looking in Zipkin you'll see the call chain and how long each step took.
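
How the call chain is stitched together can be sketched without any framework. The idea: every span in a request shares one trace ID, each span has its own ID, and a child span remembers its parent's ID. Sleuth propagates these values between services in B3 HTTP headers by default. The class below is a simplified, hypothetical model written for illustration (SpanContext, newRoot, and newChild are not Sleuth's API), and it assumes the request is always sampled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical, simplified model of a trace context; Sleuth manages the real one for you.
final class SpanContext {
    final String traceId;      // shared by every span of one request
    final String spanId;       // unique per unit of work
    final String parentSpanId; // null for the root span

    private SpanContext(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    // 64-bit random ID rendered as 16 lower-case hex characters.
    private static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // Starts a new trace, e.g. when a request first enters service A.
    static SpanContext newRoot() {
        return new SpanContext(newId(), newId(), null);
    }

    // Creates the span for an outgoing call, e.g. service A calling service B.
    SpanContext newChild() {
        return new SpanContext(traceId, newId(), spanId);
    }

    // Headers the caller would attach to the outgoing HTTP request (B3 multi-header format).
    Map<String, String> toB3Headers() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        if (parentSpanId != null) {
            headers.put("X-B3-ParentSpanId", parentSpanId);
        }
        headers.put("X-B3-Sampled", "1"); // assumption: this request is sampled
        return headers;
    }
}
```

Because service B reads the trace ID from the incoming headers instead of generating a new one, Zipkin can group the spans reported by both services into a single timeline.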


Monitoring metrics with Prometheus and Grafana

On to metrics. They measure service health (CPU, memory, request counts) as well as business indicators (number of successful purchases, average session length).

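Spring Boot Actuator collects metrics through Micrometer, and with the micrometer-registry-prometheus dependency on the classpath it publishes them at /actuator/prometheus for Prometheus to scrape. Here's how to enable the endpoint:


management:
  endpoints:
    web:
      exposure:
        include: '*' # exposes every endpoint; in production list only what you need, e.g. health,info,prometheus
  metrics:
    export:
      prometheus:
        enabled: true # Spring Boot 2.x key; Boot 3.x uses management.prometheus.metrics.export.enabled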

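Then we add the service to the Prometheus configuration. Prometheus scrapes /metrics by default, so the Actuator path has to be set explicitly:


# prometheus.yml
scrape_configs:
  - job_name: 'service-a'
    metrics_path: '/actuator/prometheus' # Actuator's Prometheus endpoint
    static_configs:
      - targets: ['localhost:8080'] # host:port of the service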
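
It may help to see what Prometheus actually receives on each scrape: plain text, one sample per line, in the Prometheus text exposition format. The sketch below renders a single counter in that format. In a real Spring Boot service Micrometer generates this output for you; the class name RequestCounter, the metric name, and the service label are assumptions chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter that renders itself in the Prometheus text exposition format.
final class RequestCounter {
    private final String name;
    private final String service;
    private final AtomicLong count = new AtomicLong();

    RequestCounter(String name, String service) {
        this.name = name;
        this.service = service;
    }

    // Called once per handled request; counters only ever go up.
    void increment() {
        count.incrementAndGet();
    }

    // What a scrape of this one metric would return.
    String scrape() {
        return "# HELP " + name + " Total requests handled.\n"
             + "# TYPE " + name + " counter\n"
             + name + "{service=\"" + service + "\"} " + count.get() + "\n";
    }
}
```

After three calls to increment() on a counter named http_requests_total for service-a, scrape() returns a HELP line, a TYPE line, and the sample line http_requests_total{service="service-a"} 3. Prometheus stores each scraped value with a timestamp, which is what lets Grafana plot request rates over time.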

And with Grafana it's easy to build a dashboard that turns dry numbers into nice-looking graphs.


Putting it all together: ELK + Zipkin + Prometheus + Grafana

You've got logs, traces, metric charts — everything! So how does it all work together?

  1. Request arrives: a user sends a request to your microservice.
  2. Logging: each service logs its actions. Logs go to ELK.
  3. Tracing: Sleuth links calls into a single trace that is stored in Zipkin.
  4. Metrics monitoring: Prometheus collects performance data.
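  5. Visualization: Grafana shows metric dashboards, Kibana handles log searches, and the Zipkin UI displays traces. Grafana can also query Elasticsearch and Zipkin as data sources if you want a single entry point.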

Examples of using Observability in real systems

Big companies from Netflix to Uber use similar approaches for observability. For example:

  • Netflix: actively uses distributed tracing to analyze their complex microservices system.
  • Uber: implements metrics and tracing to optimize logistics.
  • Airbnb: mainly focuses on centralized logging and visualization.

These tools help them detect and diagnose problems quickly, which shortens downtime and limits the scale of failures.


Recommendations and best practices

  • Focus on key metrics, for example request success rate, latency, and resource usage.
  • Automate problem analysis. Metrics, traces, and logs should be linked, for example by the trace ID that Sleuth adds to every log line. If you see a drop in a metric, jump into the trace and find the culprit.
  • Set up alerts. Define alerting rules in Prometheus and route notifications through Alertmanager so you find out about problems immediately.
  • Use profiles. Set different logging levels for production and development with Spring profiles.
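
The second recommendation above (see a drop in a metric, then jump into the trace) can be sketched as a small triage routine. This is a simplified illustration under stated assumptions: the classes RequestSample and Triage are hypothetical, a request counts as failed when its HTTP status is 5xx, and the samples are assumed to have been reconstructed from logs that carry a trace ID.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of one handled request, e.g. reconstructed from a log line.
final class RequestSample {
    final String traceId;
    final int status;

    RequestSample(String traceId, int status) {
        this.traceId = traceId;
        this.status = status;
    }
}

final class Triage {
    // Fraction of requests that did not end in a 5xx response; 1.0 when there is no traffic.
    static double successRate(List<RequestSample> samples) {
        if (samples.isEmpty()) return 1.0;
        long ok = samples.stream().filter(s -> s.status < 500).count();
        return (double) ok / samples.size();
    }

    // If the success rate fell below the threshold, returns the trace IDs worth opening in Zipkin.
    static List<String> tracesToInspect(List<RequestSample> samples, double minSuccessRate) {
        List<String> traceIds = new ArrayList<>();
        if (successRate(samples) >= minSuccessRate) return traceIds; // metric is healthy
        for (RequestSample s : samples) {
            if (s.status >= 500) traceIds.add(s.traceId);
        }
        return traceIds;
    }
}
```

In practice the metric check would be a Prometheus alerting rule and the lookup would be a trace ID search in Zipkin or Kibana, but the link between the three signals is the same: the metric tells you that something is wrong, and the trace ID tells you where to look.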

Wrapping up

You now have a system where Zipkin, ELK, Prometheus, and Grafana work together to provide full observability. This isn't just a convenience — it's an essential tool when working with microservices.

You're ready to handle even the trickiest problems in your systems. No sudden request loop can fool you now: you can see right through it. Congrats, you've got X-ray vision for microservices!
