Monitoring & Observability

Monitoring and observability help you understand what your back-end is doing in production. By collecting metrics, logs, and traces, you can detect problems early, troubleshoot incidents, and make informed decisions about performance and reliability. An observable system exposes the right signals so that both humans and tools can see how it behaves internally.

For this product, you describe which signals you collect (logs, metrics, traces), how you collect them, and how you use them to monitor the health of your application.

Monitoring & Observability in Back-end Systems

Important aspects include:

Structured logging with useful context
Metrics (for example request counts, latency, error rates, resource usage)
Health checks and readiness checks
Tracing across services or components
Dashboards and alerts to visualise and react to issues

In your project, you choose a subset appropriate to your scale and tooling, and you document how each chosen item is wired into the application.
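As an illustration of structured logging with useful context, the following is a minimal sketch in Python that uses only the standard library. The names `JsonFormatter`, `build_logger`, `handle_request`, and `correlation_id`, as well as the JSON field names, are assumptions made for this example; your project may use a different language, a logging library, or a different log schema.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Hypothetical context variable that holds the ID of the request being handled.
correlation_id = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        if record.exc_info:
            # Include the stack trace when the record carries an exception.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name="app"):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, order_id):
    """Example request handler: set a fresh correlation ID, then log."""
    correlation_id.set(uuid.uuid4().hex)
    logger.info("order received: %s", order_id)
```

Because every line is a JSON object carrying the same `correlation_id`, all log entries that belong to one request can be filtered together in a log viewer.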

Quality indicators

When assessing this product, the following quality indicators will be considered:

The monitoring & observability document is self-contained, with an introduction/context and then a description of the chosen approach.
It describes which logs, metrics, and/or traces are collected and why.
It explains how logs are structured (for example log levels, JSON logs, correlation IDs).
It describes which metrics are used to track performance and reliability (for example latency, throughput, error rates).
It describes how health checks are implemented and exposed.
It includes examples of logging, metrics, or tracing code/configuration, with references to the code in GitLab.
It describes how dashboards or alerts are configured (even if only conceptually in a student project).
It includes a list of sources used to design and implement monitoring and observability.
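To make the indicators about metrics and health checks more concrete, here is a minimal sketch in Python that uses only the standard library. It records request counts and latencies in memory and exposes `/health` and `/ready` endpoints. All names (`record_request`, `error_rate`, `percentile`, `health_response`, `run_server`), the endpoint paths, and the port are assumptions for illustration; a real project would more likely use a metrics library and export the data to a monitoring backend.

```python
import json
import math
import time
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process metrics store (lost when the process restarts).
request_count = Counter()  # (path, status) -> number of requests
latencies_ms = []          # observed request durations in milliseconds


def record_request(path, status, duration_ms):
    """Count the request and remember how long it took."""
    request_count[(path, status)] += 1
    latencies_ms.append(duration_ms)


def error_rate():
    """Fraction of recorded requests that ended in a 5xx status."""
    total = sum(request_count.values())
    errors = sum(n for (_, status), n in request_count.items() if status >= 500)
    return errors / total if total else 0.0


def percentile(values, pct):
    """Nearest-rank percentile, for example pct=95 for p95 latency."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def health_response(path, dependencies_ok):
    """Return (status code, JSON body) for the health and readiness paths."""
    if path == "/health":  # liveness: the process is up
        return 200, {"status": "ok"}
    if path == "/ready":   # readiness: dependencies are reachable
        if dependencies_ok:
            return 200, {"status": "ready"}
        return 503, {"status": "not ready"}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        start = time.perf_counter()
        # dependencies_ok is hard-coded here; a real check would, for
        # example, ping the database.
        status, body = health_response(self.path, dependencies_ok=True)
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
        record_request(self.path, status, (time.perf_counter() - start) * 1000)


def run_server(port=8080):
    """Serve the health endpoints until the process is stopped."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping `health_response` separate from the HTTP handler is a deliberate choice in this sketch: the logic can be unit-tested without starting a server, and the same function could back a different web framework.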

Template

To document monitoring and observability in your own project, you can use the following template:

# Monitoring & Observability

In this section, describe in a few sentences why monitoring and observability are important for your application and what kind of failures or issues you want to detect. This is the main text of your document.

## Signals & Tools

Here you describe which signals you collect:
- Logs (what you log and at which levels)
- Metrics (which metrics and which tools)
- Traces (if applicable)

Mention which libraries or platforms you use.

## Implementation

Here you describe how monitoring and observability are implemented:
- How logging is integrated into the code
- How metrics are recorded (for example counters, histograms)
- How health checks are implemented and exposed

Add code/configuration examples and references to the code in GitLab.

## Dashboards & Alerts

Here you describe how you visualise and react to the collected data:
- Example dashboards or views
- Any alerts or manual procedures you use when something goes wrong

## Sources

List here the sources you used to design and implement monitoring and observability (documentation, articles, videos, books, etc.).
Also include sources that helped you write the code.