ØMQ (version 0.1) Tests

Introduction

Below are the results of latency and throughput tests for the ØMQ lightweight messaging kernel. Keep in mind that the results can be heavily influenced by the whole stack: network devices, NICs, processor, operating system, etc. The only way to get results relevant to your environment is to run the tests yourself.

Test configuration

The tests were run on the following configuration:

Producer box:

  • Athlon 64 X2 3800+
  • Debian GNU/Linux 4.0 (kernel 2.6.22.6)
  • Intel 82541PI (PRO/1000) 1000Base-T PCI NIC

Consumer box:

  • Pentium 4 3 GHz with hyper-threading
  • Debian GNU/Linux 4.0 (kernel 2.6.22.6)
  • Broadcom BCM5751 1000Base-T PCI-Express NIC

The 1Gb Ethernet network is switched by:

  • Linksys SR2024 1000Base-T switch

Results

Latency

Latency tests are performed for different message sizes, starting at 10 bytes and gradually growing to 2000 bytes. For each message size, a message of that size is bounced back and forth 10,000 times as fast as possible and the average latency is computed afterwards. Each test is run three times in a row to capture possible latency differences between runs, and the average of the three runs is presented as the result.
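
To make the procedure concrete, here is a minimal sketch of the ping-pong measurement loop, written against a raw TCP socket rather than the actual ØMQ test harness. The address, port and message size are placeholders, and one-way latency is taken here as half of the measured round-trip time:

    //  Ping-pong latency measurement, initiating side. The peer simply
    //  echoes every message back. Address, port and message size are
    //  placeholders for illustration only.
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    int main ()
    {
        const int roundtrips = 10000;    //  messages bounced back and forth
        const size_t msg_size = 10;      //  vary from 10 to 2000 bytes
        std::vector<char> buf (msg_size, 0);

        int s = socket (AF_INET, SOCK_STREAM, 0);
        int flag = 1;
        setsockopt (s, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);

        sockaddr_in addr;
        memset (&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons (5555);                         //  placeholder port
        inet_pton (AF_INET, "192.168.0.2", &addr.sin_addr);   //  placeholder address
        if (connect (s, (sockaddr*) &addr, sizeof addr) != 0)
            return 1;

        timeval start, end;
        gettimeofday (&start, NULL);

        for (int i = 0; i != roundtrips; i++) {
            //  Send the message and wait until the peer echoes it back.
            send (s, buf.data (), msg_size, 0);
            size_t got = 0;
            while (got < msg_size) {
                ssize_t rc = recv (s, buf.data () + got, msg_size - got, 0);
                if (rc <= 0)
                    return 1;
                got += rc;
            }
        }

        gettimeofday (&end, NULL);
        double elapsed_us = (end.tv_sec - start.tv_sec) * 1e6 +
            (end.tv_usec - start.tv_usec);

        //  One-way latency is taken as half of the average round-trip time.
        printf ("average latency: %.3f us\n", elapsed_us / roundtrips / 2);

        close (s);
        return 0;
    }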

First, have a look at the latency for the very small messages ØMQ is intended for. The black line represents the latency of the underlying transport layer (TCP), the red one the end-to-end ØMQ latency:

lat1.png

As can be seen, the difference between TCP and ØMQ latency is almost unmeasurable, dropping well below a single microsecond.

Now have a look at TCP and ØMQ latencies for message sizes up to 2000 bytes (once again, ØMQ latency is marked in red, TCP latency in black):

lat2.png

The picture shows that the underlying layers cause latency to grow in steps rather than gradually, which is perfectly OK. However, it also shows that ØMQ reaches these thresholds at smaller message sizes than raw TCP transfer does; this issue will be investigated in the future. As long as both the TCP and ØMQ transfers fall between the same two thresholds, the latency difference is almost zero. To get a better picture, here's a graph showing ØMQ's overhead over the underlying TCP transport (i.e. the ØMQ latency minus the TCP latency) for various message sizes:

lat3.png

Throughput

Throughput tests are performed for different message sizes (1, 2, 4, 8, 16, 32, 64, 128, 256, 512 and 1024 bytes). For each message size, 1,000,000 messages of that size are sent as fast as possible. Each test is repeated three times and the mean value is taken as the result.
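
Just to illustrate the procedure, here is a minimal sketch of the receiving side of such a throughput test over a raw TCP socket. The port and message size are placeholders and the actual ØMQ test harness is not shown; the first message is excluded from the timing so that the sender's start-up delay does not distort the result:

    //  Throughput measurement, receiving side. Messages are fixed-size,
    //  so the byte stream can be read in msg_size chunks. Port and
    //  message size are placeholders for illustration only.
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    //  Reads exactly one message from the socket; returns false on error.
    static bool recv_msg (int s, char *buf, size_t size)
    {
        size_t got = 0;
        while (got < size) {
            ssize_t rc = recv (s, buf + got, size - got, 0);
            if (rc <= 0)
                return false;
            got += rc;
        }
        return true;
    }

    int main ()
    {
        const long message_count = 1000000;  //  messages sent as fast as possible
        const size_t msg_size = 64;          //  vary from 1 to 1024 bytes

        //  Wait for the sender to connect on a placeholder port.
        int listener = socket (AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr;
        memset (&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons (5556);
        bind (listener, (sockaddr*) &addr, sizeof addr);
        listen (listener, 1);
        int s = accept (listener, NULL, NULL);

        std::vector<char> buf (msg_size);

        //  Read the first message outside the timed section so that the
        //  sender's start-up delay is not counted against the transport.
        if (!recv_msg (s, buf.data (), msg_size))
            return 1;

        timeval start, end;
        gettimeofday (&start, NULL);

        for (long i = 0; i != message_count - 1; i++)
            if (!recv_msg (s, buf.data (), msg_size))
                return 1;

        gettimeofday (&end, NULL);
        double elapsed = (end.tv_sec - start.tv_sec) +
            (end.tv_usec - start.tv_usec) / 1e6;

        //  Only the messages received inside the timed section count.
        double msgs = (message_count - 1) / elapsed;
        printf ("throughput: %.0f messages/s (%.1f Mb/s)\n",
            msgs, msgs * msg_size * 8 / 1e6);

        close (s);
        close (listener);
        return 0;
    }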

Throughput measured in messages per second:

th.png

Throughput measured in megabits per second (note that from a message size of 128 bytes onwards the throughput is constrained by the maximum bandwidth of 1Gb Ethernet):

bw.png
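
As a rough sanity check, ignoring Ethernet, IP and TCP framing overhead, a 1Gb/s link can carry at most about 1,000,000,000 / (128 × 8) ≈ 976,000 messages per second at a 128-byte message size, which is consistent with the throughput curve being bandwidth-limited from that size onwards.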

Conclusion

All in all, we've shown that we are able to deliver a solution with performance fulfilling the goals we had set: the latency overhead over the underlying transport is below 1 microsecond and the throughput exceeds 2 million messages a second for small messages.

In the next releases we will focus on the best combined latency/throughput trade-off rather than on separate best-latency and best-throughput scenarios.