100GbE Tests with ZeroMQ v4.3.2

Introduction

This test presents performance results of ØMQ/4.3.2 on 100Gb Ethernet. The graphs were produced by running the "libzmq/perf/generate_csv.sh" and "libzmq/perf/generate_graphs.py" scripts, together with the ZeroMQ benchmarking utilities shipped in that folder.

Environment

Box 1:

NUMA nodes: 2
Server CPUs: 64 cores, Intel Xeon Gold 6130 @ 2.10GHz (2 CPUs with HyperThreading enabled)
NIC: Mellanox MT27800 (ConnectX-5)
Linux/CentOS 7 (kernel 3.10.0-957.12.1.el7.x86_64)
ØMQ version 4.3.2, built with gcc 4.8.5

Box 2:

NUMA nodes: 2
Server CPUs: 40 cores, Intel Xeon E5-2680 @ 2.80GHz (2 CPUs with HyperThreading enabled)
NIC: Mellanox MT27800 (ConnectX-5)
Linux/CentOS 7 (kernel 3.10.0-957.12.1.el7.x86_64)
ØMQ version 4.3.2, built with gcc 4.8.5

The two boxes were connected by a direct 100Gbps fiber-optic cable.

Results for TCP transport

All the tests were run for message sizes of 8, 16, 32, … 65536, 131072 bytes.
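The sweep of message sizes can be reproduced with a short Python sketch. It assumes the elided part of the list follows the same doubling pattern as the values shown (every power of two from 8 to 131072); the variable names are illustrative and are not taken from the benchmark scripts.

```python
# Message sizes swept by the tests, assuming each size doubles the previous one.
MIN_EXP = 3   # 2**3  = 8 bytes (smallest size listed)
MAX_EXP = 17  # 2**17 = 131072 bytes (largest size listed)

message_sizes = [2 ** e for e in range(MIN_EXP, MAX_EXP + 1)]

# Number of sizes, smallest size, largest size
print(len(message_sizes), message_sizes[0], message_sizes[-1])
# prints: 15 8 131072
```

Under that assumption, each graph has 15 data points per curve.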

Throughput Results

The following graph plots both the achieved PPS and the achieved bandwidth against the ZeroMQ message size, as measured with the benchmark utilities.
The benchmark utility uses PUSH/PULL sockets.

pushpull_tcp_thr_results.png
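The two curves in a throughput graph are related by the message size. The following Python sketch shows the conversion, assuming that PPS counts ZeroMQ messages per second and that bandwidth counts payload bits only (no ZMTP, TCP/IP, or Ethernet framing). The function names and the sample numbers are hypothetical and are not measured results.

```python
LINK_RATE_BPS = 100e9  # nominal rate of the 100Gbps link, in bits per second


def bandwidth_gbps(pps, message_size_bytes):
    """Payload bandwidth in Gbps for a given message rate and message size."""
    return pps * message_size_bytes * 8 / 1e9


def payload_bound_pps(message_size_bytes, link_rate_bps=LINK_RATE_BPS):
    """Upper bound on messages/s if the link carried payload bytes only.

    Real traffic stays below this bound because of protocol overhead.
    """
    return link_rate_bps / (message_size_bytes * 8)


# Hypothetical example: 1,000,000 messages/s of 1024 bytes each.
print(bandwidth_gbps(1_000_000, 1024))  # 8.192 (Gbps)
```

This explains the usual shape of such graphs: for small messages the message rate is the limiting factor and bandwidth stays low, while for large messages the bandwidth approaches the link rate and the message rate drops.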

Latency Results

The following graph shows the achieved latency against the ZeroMQ message size, as measured with the benchmark utilities.
The benchmark utility uses REQ/REP sockets.

reqrep_tcp_lat_results.png
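With a REQ/REP pair, latency is measured from round trips: one side sends a message, the other echoes it back, and the total elapsed time is divided among the round trips. The Python sketch below assumes that the reported figure is the one-way latency, estimated as half of the mean round-trip time. The function name and the sample numbers are hypothetical and are not measured results.

```python
def one_way_latency_us(elapsed_us, roundtrip_count):
    """Estimate one-way latency in microseconds from a round-trip test.

    Assumes the path is symmetric, so each round trip costs two one-way hops.
    """
    if roundtrip_count <= 0:
        raise ValueError("roundtrip_count must be positive")
    return elapsed_us / (roundtrip_count * 2)


# Hypothetical example: 10,000 round trips completed in 500,000 microseconds.
print(one_way_latency_us(500_000, 10_000))  # 25.0 (microseconds)
```

Because the estimate averages over all round trips, it hides jitter and tail latency; only the mean is visible.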

Results for INPROC transport

For these results, only box 1 was used.

Throughput Results

The following graph plots both the achieved PPS and the achieved bandwidth against the ZeroMQ message size, as measured with the benchmark utilities.
The benchmark utility uses PUSH/PULL sockets.

pushpull_inproc_thr_results.png

The following graph shows the performance of a PUB -> ZMQ PROXY -> SUB chain in which every socket uses the INPROC transport:

pubsubproxy_inproc_thr_results.png