8. Logging & Observability
Engineering
Apr 2026 · 12 min read

How request correlation, structured log events, and environment-controlled verbosity come together to give you visibility into a running FastAPI service without leaking sensitive data.


Driptanil Datta, Software Developer

The Problem with Unstructured Logs

When something goes wrong in production, you have two tools: the logs and the request. If your logs are a stream of unformatted strings with no consistent fields, correlating a failure to a specific request is painful. You're grepping for timestamps, matching against partial URLs, and hoping you captured enough context.

This template solves that with structured logs and request correlation. Every log entry for a given request carries the same request_id, so you can filter by that ID and reconstruct the full picture of what happened.

Request Correlation

The logging middleware in server/api/src/utils/logging.py is attached to the app in create_app(). Every request that enters the server gets a request_id — either from the X-Request-ID header if the caller provided one, or freshly generated if not:

# server/api/src/utils/logging.py
request_id = request.headers.get(REQUEST_ID_HEADER, str(uuid4()))
request.state.request_id = request_id

This request_id is stored on request.state, so it's accessible anywhere in the request lifecycle. It's also echoed back in the response header:

response.headers[REQUEST_ID_HEADER] = request_id

If your client (another service, a test harness, a frontend) sends an X-Request-ID, that same ID comes back in the response. You can trace a request across service boundaries using the same ID throughout.
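The header-or-generate logic can be captured in a small pure function; here is a minimal sketch, where `resolve_request_id` is a hypothetical helper name and a plain dict stands in for Starlette's case-insensitive header mapping:

```python
from uuid import UUID, uuid4

REQUEST_ID_HEADER = "X-Request-ID"

def resolve_request_id(headers: dict) -> str:
    """Use the caller's X-Request-ID if present, else mint a fresh UUID4."""
    supplied = headers.get(REQUEST_ID_HEADER)
    return supplied if supplied else str(uuid4())

# A caller-supplied ID passes through unchanged...
echoed = resolve_request_id({"X-Request-ID": "req-123"})
# ...while a missing header yields a generated UUID string.
generated = resolve_request_id({})
```

In the middleware, the resolved value would then be stored on `request.state` and copied into the response header, exactly as the two snippets above show.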

What Gets Logged and When

After the request completes, the middleware logs a single structured event with the operational fields that actually matter for debugging:

logger.bind(
    event="http.request.completed",
    request_id=request_id,
    method=request.method,
    path=request.url.path,
    status_code=response.status_code,
    duration_ms=round(duration_ms, 3),
    client_ip=_client_ip(request),
).info("")

event, request_id, method, path, status_code, duration_ms, client_ip — these are the fields you'll actually query in a log aggregator. Named events (http.request.completed, http.request.failed) give you consistent filter keys that don't depend on parsing free-form strings.
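Because every entry is a flat JSON object carrying a `request_id`, reconstructing one request's history becomes a filter rather than a grep. A small sketch over two fabricated NDJSON lines (the `entries_for_request` helper is illustrative, not part of the template):

```python
import json

def entries_for_request(ndjson_lines, request_id):
    """Parse newline-delimited JSON log lines, keeping one request's entries."""
    for line in ndjson_lines:
        entry = json.loads(line)
        if entry.get("request_id") == request_id:
            yield entry

# Two hand-written log lines standing in for real aggregator output.
logs = [
    '{"event": "http.request.completed", "request_id": "abc", "status_code": 200}',
    '{"event": "http.request.failed", "request_id": "def", "status_code": 500}',
]
matches = list(entries_for_request(logs, "abc"))
```

This is the same operation a log aggregator performs when you filter on an indexed `request_id` field.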

If the request throws an unhandled exception, the failure path captures the error type and message alongside the same operational fields:

logger.bind(
    event="http.request.failed",
    request_id=request_id,
    method=request.method,
    path=request.url.path,
    status_code=500,
    duration_ms=round(duration_ms, 3),
    client_ip=_client_ip(request),
    error_type=exc.__class__.__name__,
    error=str(exc),
).exception("")

.exception("") attaches the full stack trace in addition to the bound fields, so you have both the high-level summary and the detailed traceback in the same log entry.

Controlling Log Behavior via Environment

Three environment variables control how the logger behaves:

LOG_LEVEL=INFO        # minimum level to emit (DEBUG, INFO, WARNING, ERROR)
LOG_JSON=false        # false for human-readable, true for JSON (use true in production)
LOG_DIAGNOSE=false    # true adds variable values to tracebacks (never use in production)

In development, LOG_JSON=false gives you readable output. In production, LOG_JSON=true emits newline-delimited JSON that log aggregators (Datadog, Loki, CloudWatch) can parse and index automatically.

LOG_DIAGNOSE=false is important. When true, Loguru includes the values of local variables in exception tracebacks — useful for debugging, but it will print passwords, tokens, and PII to your logs if any of those happen to be in scope. Keep it false in production.
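One plausible way to read those three variables into a config dict — `log_config` is a hypothetical helper, not the template's actual code, but the defaults match the values documented above:

```python
import os

def log_config(env=None):
    """Read the three logging env vars with their documented defaults."""
    env = os.environ if env is None else env
    return {
        "level": env.get("LOG_LEVEL", "INFO").upper(),
        "json": env.get("LOG_JSON", "false").lower() == "true",
        "diagnose": env.get("LOG_DIAGNOSE", "false").lower() == "true",
    }

# A production-style environment: JSON output on, diagnose left off.
cfg = log_config({"LOG_LEVEL": "warning", "LOG_JSON": "true"})
```

These values map directly onto Loguru's sink parameters: something like `logger.add(sys.stderr, level=cfg["level"], serialize=cfg["json"], diagnose=cfg["diagnose"])`, where `serialize=True` is Loguru's switch for newline-delimited JSON output.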

What Not to Log

The most dangerous logging mistake is logging authentication data:

# What not to do
logger.error(
    "Authorization=%s password=%s",
    request.headers.get("Authorization"),
    raw_password,
)

This writes JWTs and plaintext passwords to your log files, your log aggregator, and every downstream system that ingests logs. You can rotate the secret, but the historical logs still contain the old tokens.

The rule is: log operational context (IDs, paths, status codes, timing, error types) and never log security credentials, raw Authorization headers, or user-supplied field values that might contain PII.
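If you genuinely need headers in a debug log, redact the sensitive ones first. A sketch — `safe_headers` and the redaction set are illustrative, not part of the template:

```python
# Header names (lowercased) that must never reach the logs verbatim.
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}

def safe_headers(headers: dict) -> dict:
    """Copy headers, replacing sensitive values with a redaction marker."""
    return {
        k: ("[REDACTED]" if k.lower() in SENSITIVE_HEADERS else v)
        for k, v in headers.items()
    }

cleaned = safe_headers({
    "Authorization": "Bearer eyJhbGciOi...",
    "Accept": "application/json",
})
```

The operational header survives; the credential is gone before the dict ever reaches a log call.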

App Lifecycle Events

Startup and shutdown are also logged as structured events. In server/api/src/main.py, the lifespan context calls dedicated helpers:

@asynccontextmanager
async def lifespan(_app: FastAPI):
    log_app_startup(service=SERVICE_NAME, version=SERVICE_VERSION)
    yield
    log_app_shutdown(service=SERVICE_NAME, version=SERVICE_VERSION)

These emit app.startup and app.shutdown events with the service name and version. In a log aggregator, you can use these events to correlate deployment timing with changes in error rates — a restart followed immediately by elevated 500s usually means a bad deploy.
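The helpers themselves aren't shown in this section, but a plausible shape, assuming they simply bind the same three fields (names inferred from the event descriptions above; the `emit` parameter is a stand-in for the real `logger.bind(...).info("")` call):

```python
def _lifecycle_fields(event: str, service: str, version: str) -> dict:
    """Structured fields shared by app.startup and app.shutdown events."""
    return {"event": event, "service": service, "version": version}

def log_app_startup(service: str, version: str, emit=print) -> dict:
    """Sketch: real code would call logger.bind(**fields).info("")."""
    fields = _lifecycle_fields("app.startup", service, version)
    emit(fields)
    return fields

def log_app_shutdown(service: str, version: str, emit=print) -> dict:
    fields = _lifecycle_fields("app.shutdown", service, version)
    emit(fields)
    return fields

started = log_app_startup("api", "0.1.0", emit=lambda _f: None)
```

Keeping the field construction in one private helper guarantees startup and shutdown events stay queryable with the same keys.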
