Tests on Linux Real-Time Kernel

Introduction

We tested ØMQ on top of a standard Linux kernel (SUSE Linux Enterprise Server 10 SP2) and a real-time Linux kernel (SUSE Linux Enterprise Real Time 10 SP2). Our goal was to find out how much the real-time Linux kernel improves latency jitter and, specifically, how well it eliminates the latency peaks occasionally encountered with the standard Linux kernel.

Environment

Box 1:

8-core AMD Opteron 8356, 2.3GHz
Tigon3 BCM95721 (Broadcom NetXtreme BCM5721) NIC
SUSE Linux Enterprise Server 10 SP2 (kernel 2.6.16.60-0.21-smp)
SUSE Linux Enterprise Real Time 10 SP2 (kernel 2.6.22.19-0.14-rt)
ØMQ version 0.3.1

Box 2:

8-core Intel Xeon E5440, 2.83GHz
Intel PRO/1000 (631xESB/632xESB DPT) NIC
SUSE Linux Enterprise Server 10 SP2 (kernel 2.6.16.60-0.21-smp)
SUSE Linux Enterprise Real Time 10 SP2 (kernel 2.6.22.19-0.14-rt)
ØMQ version 0.3.1

Boxes were connected via non-switched 1Gb Ethernet.

Results

To measure latency, we ran a ping-pong test with 1-byte messages over 1,000,000 round trips. The round-trip time was measured for each iteration and divided by 2 to get the one-way end-to-end latency. When running the test on the real-time Linux kernel, we shielded 4 CPU cores (out of 8) on each box so that they were used exclusively by the test.
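The measurement loop described above can be sketched roughly as follows. This is only an illustration of the method, not the code used for the tests: it assumes a local socket pair in a single process in place of ØMQ and the two networked boxes, and the names echo and measure_one_way_latencies are hypothetical.

```python
import socket
import threading
import time


def echo(sock, round_trips):
    """Remote side of the ping-pong: send every 1-byte message straight back."""
    for _ in range(round_trips):
        data = sock.recv(1)
        if not data:
            break  # the other side closed the connection early
        sock.sendall(data)


def measure_one_way_latencies(round_trips):
    """Return one-way latency estimates in microseconds, one per round trip."""
    # Assumption: a local socket pair stands in for the link between the boxes.
    local, remote = socket.socketpair()
    peer = threading.Thread(target=echo, args=(remote, round_trips))
    peer.start()
    latencies_us = []
    try:
        for _ in range(round_trips):
            start = time.perf_counter_ns()
            local.sendall(b"x")    # ping: a 1-byte message
            reply = local.recv(1)  # pong: wait for the echoed byte
            elapsed_ns = time.perf_counter_ns() - start
            if not reply:
                raise ConnectionError("peer closed the connection")
            # Half of the round-trip time, converted from ns to us.
            latencies_us.append(elapsed_ns / 2 / 1000)
    finally:
        local.close()
        peer.join()
        remote.close()
    return latencies_us
```

Calling measure_one_way_latencies(1000000) would mirror the number of round trips used in the test. The absolute values from such a local sketch say nothing about the network latencies reported here.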

The following picture shows individual latencies using SUSE Linux Enterprise Server:

rt1.png

Our expectation was that with a real-time Linux kernel we would be able to get rid of the sporadic peaks of around 400 us shown in the graph. We ran the same test using SUSE Linux Enterprise Real Time and got the following results:

rt2.png

As the graph shows, there were no peaks above 75 us, which confirmed our expectation. The price for eliminating these peaks, however, was an increase in average latency.

To get a better understanding of this trade-off, we plotted the test results as a percentile graph.

Reading the graph is simple: say there is a point at 90% on the x-axis and 30 us on the y-axis. This means that 90% (900,000) of the latencies measured were below 30 microseconds and 10% (100,000) were above 30 microseconds.
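As a rough sketch of how a single point on such a graph can be computed from the measured latencies, the following uses the nearest-rank definition of a percentile. The choice of definition and the function name percentile are assumptions made for illustration, not the tooling used to produce the graph, and the sample values are made up.

```python
import math


def percentile(latencies_us, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies_us:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(latencies_us)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Made-up samples: nine typical latencies and one outlier, in microseconds.
samples = [23.0] * 9 + [445.0]
median = percentile(samples, 50)    # the y-value at 50% on the x-axis
maximum = percentile(samples, 100)  # the 100th percentile is the largest sample
```

Plotting percentile(samples, p) for p running from just above 0 to 100 gives a curve of the kind described above.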

The following image shows the percentile graph for the standard Linux kernel (black points) and real-time Linux kernel (red points):

rt3.png

The most important values are summarised in the following table:

                              standard kernel    real-time kernel
50th percentile (median)      23 us              33 us
99th percentile               24.5 us            49 us
100th percentile (maximum)    445 us             72.5 us

Conclusion

Our tests show that the real-time Linux kernel, specifically SUSE Linux Enterprise Real Time 10 SP2, eliminates the latency spikes seen with the standard kernel, at the cost of a higher median latency. We expect that the results would be even more favourable for the real-time Linux kernel if the tests were run on boxes loaded with other tasks rather than in a clean, idle test environment.