Last quarter I spent a frustrating week chasing a latency regression in one of our payment routing services. The p99 grew from 12ms to 80ms after we doubled traffic, but CPU was idle and the database itself was bored. The culprit was a misconfigured pgxpool that I had inherited a
HTTP/2 flow control gives you per-stream window updates, but if you push a producer faster than the consumer can read, the server-side queue starts to grow before the window closes. Reactor and Akka Streams both expose a request-driven cursor that makes this explicit. Here is how I wired it for a streaming aggregation endpoint that backs a real-time dashboard.
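
To make the request-driven idea concrete without depending on Reactor or Akka Streams, here is a minimal Python sketch. It is not the wiring from the service itself: `DemandDrivenSource` and its `request` method are hypothetical names that only mimic the Reactive Streams `request(n)` contract, where the producer emits nothing until the consumer signals demand.

```python
from typing import Callable, Iterator


class DemandDrivenSource:
    """Hypothetical source that emits only against demand from the consumer."""

    def __init__(self, items: Iterator[int], on_next: Callable[[int], None]):
        self._items = items
        self._on_next = on_next
        self.emitted = 0  # total items delivered so far

    def request(self, n: int) -> int:
        """Consumer signals it can take up to n more items.

        Returns how many items were actually delivered, which is less
        than n only when the underlying iterator is exhausted.
        """
        delivered = 0
        for _ in range(n):
            try:
                item = next(self._items)
            except StopIteration:
                break
            self._on_next(item)
            delivered += 1
        self.emitted += delivered
        return delivered


received = []
source = DemandDrivenSource(iter(range(10)), received.append)
source.request(3)  # the consumer asks for three items and gets exactly three
```

Because nothing is produced ahead of demand, there is no intermediate queue to grow; the cost is that the consumer has to keep asking.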
Session, transaction and statement pooling each break a different subset of Postgres client features. After the third incident with a long-running advisory lock surviving a connection reuse, I went back through our connection inventory and re-tagged each consumer with the minimum pooling mode it can actually run in. Notes on the matrix below.
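
As a rough sketch of the tagging rule, the snippet below picks the least sticky pooling mode that still supports everything a consumer uses. The feature names and the `REQUIRES` mapping are illustrative assumptions based on PgBouncer-style pooling semantics, not the actual inventory, and they cover only a handful of features.

```python
from enum import IntEnum
from typing import Iterable


class PoolMode(IntEnum):
    # Ordered from least to most connection stickiness.
    STATEMENT = 1
    TRANSACTION = 2
    SESSION = 3


# Assumed, simplified mapping: the weakest mode in which each feature still works.
REQUIRES = {
    "multi_statement_transaction": PoolMode.TRANSACTION,
    "session_advisory_lock": PoolMode.SESSION,
    "listen_notify": PoolMode.SESSION,
    "session_set": PoolMode.SESSION,
}


def minimum_pool_mode(features: Iterable[str]) -> PoolMode:
    """Return the least sticky mode that supports every listed feature."""
    return max((REQUIRES[f] for f in features), default=PoolMode.STATEMENT)


mode = minimum_pool_mode(["multi_statement_transaction", "session_advisory_lock"])
```

A consumer that holds session-level advisory locks lands in session mode regardless of what else it does, which is the case that caused the incidents above.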
Head sampling at the SDK is cheap but blind: you decide whether to keep a trace before the trace finishes. Tail sampling at the collector keeps the whole trace in memory until the root span closes, then runs a policy. Memory budgeting and the trade-off between sample rate and policy latency are the interesting parts.
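
The buffering and memory budget can be sketched in a few lines of Python. This is a toy model, not a real collector: `Span`, `TailSampler`, the `max_traces` budget and the "keep if any span errored or the root was slow" policy are all assumptions chosen for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    trace_id: str
    name: str
    duration_ms: float
    is_root: bool = False
    error: bool = False


class TailSampler:
    """Buffers spans per trace and decides only when the root span closes."""

    def __init__(self, max_traces: int, slow_ms: float):
        self.max_traces = max_traces  # memory budget, counted in open traces
        self.slow_ms = slow_ms        # root spans at least this slow are kept
        self._buffer: "OrderedDict[str, List[Span]]" = OrderedDict()
        self.evicted = 0

    def on_span_end(self, span: Span) -> Optional[List[Span]]:
        spans = self._buffer.setdefault(span.trace_id, [])
        spans.append(span)
        if not span.is_root:
            # Over budget: drop the oldest open trace without a decision.
            if len(self._buffer) > self.max_traces:
                self._buffer.popitem(last=False)
                self.evicted += 1
            return None
        # Root closed: the trace is complete, so the policy can see all of it.
        del self._buffer[span.trace_id]
        keep = any(s.error for s in spans) or span.duration_ms >= self.slow_ms
        return spans if keep else None


sampler = TailSampler(max_traces=2, slow_ms=50)
sampler.on_span_end(Span("a", "db.query", 5, error=True))
kept = sampler.on_span_end(Span("a", "GET /pay", 10, is_root=True))
```

The trace is kept because one child span errored, even though the root was fast; a head sampler would have had to decide before that error existed. Evicted traces lose their already-buffered spans, which is where the memory budget turns into a data-quality trade-off.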
Rolling deploys with 40 consumer instances generated more than a minute of stop-the-world rebalance per release, which started eating into our SLO budget after we ramped throughput. Setting a stable group.instance.id and tuning session.timeout.ms cut that cost to a few seconds per restart.
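
A minimal sketch of what such a consumer configuration might look like, expressed as a Python dict of standard Kafka consumer properties. The helper name `static_member_config`, the broker address, the naming scheme and the 30-second timeout are all placeholders, not the values we run in production.

```python
def static_member_config(group_id: str, instance_name: str,
                         session_timeout_ms: int = 30_000) -> dict:
    """Build consumer properties for static group membership.

    instance_name must be stable across restarts (for example a pod
    ordinal), otherwise every restart looks like a new member joining.
    """
    return {
        "bootstrap.servers": "localhost:9092",  # placeholder address
        "group.id": group_id,
        # A stable identity lets the broker hand the same partitions back
        # to a restarted instance instead of rebalancing the whole group.
        "group.instance.id": f"{group_id}-{instance_name}",
        # Assumed value: it needs to exceed the time one restart takes.
        "session.timeout.ms": session_timeout_ms,
    }


config = static_member_config("payments-router", "consumer-3")
```

The timeout is the knob to be careful with: too short and a slow restart still triggers a rebalance, too long and a genuinely dead instance keeps its partitions idle until the session expires.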