Is the burst throughput/latency metric just a reflection of this same tail-latency phenomenon? Put another way: is "per-burst latency" another model we can use to motivate lowering tail latencies? (This is related to the more common example of tail latencies degrading UX on a web page that makes many requests to render a single view.)
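For intuition, here is a back-of-the-envelope version of the per-burst model (assuming a burst of $n$ independent, parallel requests with a common latency CDF $F$): the burst finishes only when its slowest request does, so

$$P(\text{burst completes by } t) = F(t)^{n}.$$

With $n = 100$, a per-request p99 of $t$ gives $0.99^{100} \approx 0.37$, i.e. the per-request p99 is only about the p37 of the burst. For bursty workloads, the tail of the per-request distribution effectively becomes the *typical* per-burst latency.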
Is there reason to expect that latencies should be distributed the way they are? I have no non-hand-wavy explanation for the far outliers. Maybe the tests here provide some insight (e.g. do they suggest poor scheduling in the RTS in some way?).
Just leaving this here for visibility.
When benchmarking constant loads we see Zipf-ish tail latencies, where the max seems to increase with the number of samples collected; a simulation sketch after the link below illustrates why that is expected for heavy tails.
https://hasura.io/blog/decreasing-latency-noise-and-maximizing-performance-during-end-to-end-benchmarking/
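If the tails really are Zipf/Pareto-like, a growing max is exactly what you'd expect: the sample maximum of a heavy-tailed distribution grows polynomially in the sample count (roughly $n^{1/\alpha}$ for tail index $\alpha$), rather than the $\log n$ growth of an exponential tail. A minimal simulation sketch (hypothetical parameters, assumes the `random` package):

```haskell
import Control.Monad (replicateM)
import System.Random (randomRIO)

-- Inverse-CDF sampling from a Pareto(alpha, xMin) distribution:
-- if u ~ Uniform(0,1], then xMin / u ** (1/alpha) is Pareto-distributed.
-- alpha and xMin here are made-up stand-ins for a heavy-tailed latency.
paretoSample :: Double -> Double -> IO Double
paretoSample alpha xMin = do
  u <- randomRIO (1e-12, 1.0)  -- avoid u == 0 (would blow up to infinity)
  pure (xMin / u ** (1 / alpha))

-- Maximum over n simulated "request latencies".
sampleMax :: Int -> IO Double
sampleMax n = maximum <$> replicateM n (paretoSample 1.5 1.0)

main :: IO ()
main =
  -- For a Pareto tail the sample maximum grows roughly like n^(1/alpha),
  -- so the observed max keeps climbing as more samples are collected,
  -- even though the underlying distribution never changed.
  mapM_
    (\n -> do
        m <- sampleMax n
        putStrLn (show n ++ " samples, observed max: " ++ show m))
    [100, 1000, 10000, 100000]
```

If the benchmark maxima grow much faster than a run like this suggests, that would point toward a separate mechanism (e.g. scheduler hiccups in the RTS) rather than the shape of the distribution alone.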