The Origins of Network Server Latency &
the Myth of Connection Scheduling


Authors:

Yaoping Ruan and Vivek Pai

Abstract:

We investigate the origins of server-induced latency to understand how
to improve latency optimization techniques. Using the
Flash Web server, we analyze latency behavior under
various loads. Despite latency profiles that suggest standard queuing
delays, we find that most latency actually originates from negative
interactions between the application and the locking and blocking
mechanisms in the kernel. Modifying the server and kernel to avoid
these problems yields both qualitative and quantitative changes in the
latency profiles: latency drops by more than an order of magnitude,
and the effective service discipline also improves.

We find our modifications also mitigate service burstiness in the
application, reducing the event queue lengths dramatically and
eliminating any benefit from application-level connection scheduling.
We identify one remaining source of unfairness, related to competition
in the networking stack. We show that adjusting the TCP congestion
window size addresses this problem, reducing latency by an additional
factor of three.


Paper:

Extended abstract (2 pages, 119KB)
Full paper (12 pages, 478KB)
The poster appeared at SIGMETRICS '04.