What is QJump?
QJump is a collection of modifications and configurations to end-hosts and switches that provide guaranteed latency in datacenter networks. QJump also minimizes interference between applications sharing a network and enables fast failure detection.
QJump combines priority mechanisms with proactive rate limiting, but requires no on-line distributed coordination of traffic sources. Using the known datacenter network topology to calculate the worst-case queueing that a packet may experience, QJump derives an upper bound on network latency. It then uses standard IEEE 802.1Q VLAN priorities to allow both bounded-latency (but strongly rate-limited) and best-effort (but full-rate) traffic on the network at the same time. A range of intermediate options in the latency-bandwidth trade-off space is also available.
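As a rough illustration of the priority mechanism, the sketch below builds the 4-byte IEEE 802.1Q tag that carries a frame's priority. The tag layout (a 0x8100 identifier followed by a 3-bit priority, a 1-bit drop-eligible flag and a 12-bit VLAN ID) comes from the standard. The helper names and the choice of which priority value goes with which traffic class are assumptions made for this example, not QJump's actual configuration.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier that marks an 802.1Q-tagged frame


def build_vlan_tag(pcp: int, vlan_id: int, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag for a priority (PCP) and VLAN ID.

    Hypothetical helper for illustration only.
    """
    if not 0 <= pcp <= 7:
        raise ValueError("PCP is a 3-bit field (0..7)")
    if dei not in (0, 1):
        raise ValueError("DEI is a 1-bit field (0 or 1)")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit field (0..4095)")
    # Tag control information: PCP in the top 3 bits, then DEI, then VLAN ID.
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def parse_pcp(tag: bytes) -> int:
    """Extract the 3-bit priority from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13


# Assumed mapping for this example: the highest priority for bounded-latency
# traffic, the lowest for best-effort traffic, both on VLAN 1.
bounded_latency_tag = build_vlan_tag(pcp=7, vlan_id=1)
best_effort_tag = build_vlan_tag(pcp=0, vlan_id=1)
```

With eight priority values available, the intermediate values leave room for the intermediate latency-bandwidth options mentioned above, although how they are assigned is a deployment decision.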
QJump is fully implemented, works on commodity Ethernet hardware and requires no modification to distributed applications. It provides a latency guarantee of 100μs for small messages on clusters of 1,000 hosts. It also offers a spectrum of higher throughput allocations with decreasing latency certainty.
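To give a feel for how cluster size relates to a latency bound of this order, here is a back-of-the-envelope sketch of worst-case fan-in queueing. It assumes a simplified model in which every host sends one packet at the same instant towards the same destination link, so the last packet waits for all the others to be serialized. The function names, the 64-byte packet size and the 10 Gb/s link rate are assumptions chosen for illustration; this is not presented as QJump's exact calculation.

```python
def fan_in_delay_seconds(hosts: int, packet_bytes: int, link_bps: float,
                         processing_s: float = 0.0) -> float:
    """Worst-case wait when `hosts` packets queue for one link (simplified model).

    Each packet takes packet_bytes * 8 / link_bps seconds to serialize, and the
    last packet in the queue waits for every packet, including itself.
    `processing_s` is an optional allowance for per-hop processing delays.
    """
    if hosts <= 0 or packet_bytes <= 0 or link_bps <= 0:
        raise ValueError("hosts, packet_bytes and link_bps must be positive")
    return hosts * packet_bytes * 8 / link_bps + processing_s


def per_host_rate_bps(packet_bytes: int, window_s: float) -> float:
    """Sending rate of a host limited to one packet per window of `window_s`."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return packet_bytes * 8 / window_s


# Assumed example parameters: 1,000 hosts, 64-byte packets, 10 Gb/s links.
delay = fan_in_delay_seconds(hosts=1000, packet_bytes=64, link_bps=10e9)
rate = per_host_rate_bps(packet_bytes=64, window_s=delay)
print(f"worst-case fan-in delay: {delay * 1e6:.1f} us")
print(f"per-host rate at one packet per window: {rate / 1e6:.1f} Mb/s")
```

Under this model the delay grows linearly with the number of hosts and the packet size, which is why a strict bound goes hand in hand with a strong rate limit: allowing hosts to send more per window raises throughput but weakens the certainty of the bound.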
State of the art
Many modern applications are supported by extensive distributed systems backends running in datacenters. Such applications must offer low end-to-end request latencies despite their complex internal structure. Similarly, data-intensive applications are working towards ever shorter deadlines, with the latest systems exchanging messages at microsecond granularity. Inevitably, such systems are increasingly sensitive to latencies.
Latency arises at many levels in the systems stack: in the application, due to process scheduling, in the network stack, or in the network itself. Sources arising within a single machine can be handled locally, but in-network latencies are notoriously difficult to deal with. QJump addresses in-network latency in an intuitive and easy-to-configure way.