C API for ØMQ

Caution: This document refers to an old version of ØMQ. From version 0.3.1 onwards the C extension is an integral part of ØMQ, so it no longer has to be downloaded and built separately. Also note that the C API has changed since.

Introduction

This whitepaper describes the first version of the C extension for ØMQ. It is a simplified version of the ØMQ interface. The C extension is not yet part of the ØMQ package. You have to download it separately (see below) and build it by hand. Any feedback on the C extension is welcome on the ØMQ developers' mailing list.

Download

Download the C extension for ØMQ here.

Building it

Download and build the ØMQ package:

$ tar -xzf zmq-0.3.tar.gz
$ cd zmq-0.3
$ ./configure
$ make
$ sudo make install

Unpack and build the C extension:

$ tar -xzf czmq.tar.gz
$ cd czmq
$ g++ -c -fPIC czmq.cpp
$ g++ -shared -pthread -o libczmq.so czmq.o libzmq.so

Build the test programs:

$ gcc -o local_lat local_lat.c libczmq.so
$ gcc -o remote_lat remote_lat.c libczmq.so
$ gcc -o local_thr local_thr.c libczmq.so
$ gcc -o remote_thr remote_thr.c libczmq.so

Using it

The C extension's API is currently much simpler than the original C++ API. The difference is that the C extension doesn't allow full control of ØMQ threading as C++ does. Instead, the C extension creates a single I/O thread that can be accessed from a single application thread. This doesn't allow for seamless scaling on multicore boxes. However, it is our intent to expose the full ØMQ API via C in the future.

To instantiate ØMQ:

void *handle;
handle = czmq_create (host);

Here host is the name or IP address of the box where zmq_server is running. The returned handle is used in all the other functions to identify this particular instance of ØMQ.

To create wiring, use the czmq_create_exchange, czmq_create_queue and czmq_bind functions. For a detailed description of how the wiring mechanism works, have a look here.

int eid;
eid = czmq_create_exchange (handle, "E", CZMQ_SCOPE_GLOBAL, "10.0.0.1:5555");
czmq_create_queue (handle, "Q", CZMQ_SCOPE_GLOBAL, "10.0.0.1:5556");
czmq_bind (handle, "E", "Q");

To send a message, you have to supply a buffer, its size and the function to be used to deallocate the buffer once it is no longer needed:

void *buf;
buf = malloc (10);
memset (buf, 0, 10);
czmq_send (handle, eid, buf, 10, free);
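
The ownership hand-off described above can be sketched in isolation. The block below does not use the C extension at all: stub_send is a hypothetical stand-in (not the real czmq_send), free_fn is a local typedef that assumes the callback takes the buffer and returns nothing, and counting_free, no_free and make_message are illustrative helpers. It only shows how a caller-supplied deallocation function lets the sender release heap buffers and static buffers through the same interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed shape of a deallocation callback: takes the buffer, returns nothing. */
typedef void (free_fn) (void *);

/* Counts how many heap buffers have been released; for illustration only. */
static int released_count = 0;

/* Custom deallocator: frees the buffer and records that it happened. */
static void counting_free (void *buf)
{
    free (buf);
    released_count++;
}

/* Deallocator for static or stack buffers: there is nothing to release. */
static void no_free (void *buf)
{
    (void) buf;
}

/*
 * Hypothetical stand-in for a send call (NOT the real czmq_send).
 * It takes ownership of buf, "transmits" it by summing the bytes,
 * then releases it with the supplied function.
 */
static unsigned long stub_send (void *buf, size_t size, free_fn *ffn)
{
    unsigned long checksum = 0;
    const unsigned char *p = buf;
    size_t i;

    assert (buf != NULL || size == 0);
    for (i = 0; i < size; i++)
        checksum += p [i];
    if (ffn)
        ffn (buf);   /* the caller must not touch buf after this point */
    return checksum;
}

/* Allocate a message of the given size, filled with one byte value. */
static void *make_message (size_t size, unsigned char fill)
{
    void *buf = malloc (size);
    if (buf)
        memset (buf, fill, size);
    return buf;
}
```

Calling stub_send (make_message (10, 1), 10, counting_free) releases the heap buffer exactly once, while passing a static array together with no_free leaves the array untouched. The point of the design is that the sender never needs to know how the buffer was allocated.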

Receiving a message hands you a buffer, its size and the function you should use to deallocate the buffer:

void *buf;
size_t size;
czmq_free_fn *ffn;
czmq_receive (handle, &buf, &size, &ffn);
if (ffn)
    ffn (buf);
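
A common way to handle the received triple (buffer, size, deallocation function) is to copy the payload out and release the buffer straight away. The helper below is a minimal sketch of that idea and is independent of the C extension: take_message is a hypothetical name, and free_fn is a local typedef that assumes czmq_free_fn has the shape "takes the buffer, returns nothing". A NULL function is treated as "nothing to release", mirroring the check in the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed shape of the deallocation callback handed over with a message. */
typedef void (free_fn) (void *);

/*
 * Copy at most cap bytes of a received payload into out, then release the
 * received buffer with the function that came with it (if any).
 * Returns the number of bytes copied.
 */
static size_t take_message (void *buf, size_t size, free_fn *ffn,
                            void *out, size_t cap)
{
    size_t n = size < cap ? size : cap;

    assert (out != NULL || cap == 0);
    if (n > 0)
        memcpy (out, buf, n);
    if (ffn)
        ffn (buf);   /* buf is invalid from here on */
    return n;
}
```

With a real receive call, the buf, size and ffn values it returns would be passed directly to such a helper. A payload longer than cap is truncated, so a caller that needs the whole message should size out from the reported size first.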

To shut down the ØMQ infrastructure, use the following function:

czmq_destroy (handle);

Test results

Tests were performed on two quad-core boxes (Intel Xeon E5440 CPU, 2.83 GHz) connected via a direct 1Gb Ethernet link (Intel PRO/1000, PCI Express:2.5GB/s:Width x4). The operating system was Debian Linux 4.0 (kernel version 2.6.24.7, CONFIG_PREEMPT_VOLUNTARY=y, CONFIG_PREEMPT_BKL=y, CONFIG_HZ=1000).

Latency

End-to-end latency, as measured by local_lat and remote_lat, is only slightly higher than with C++:

Message size    C++          C
1 B             32.7 us      35.43 us
16 B            34.54 us     36.26 us
256 B           42.21 us     44.04 us
4096 B          85.63 us     88.64 us
65536 B         612.99 us    651.83 us

The same data can be seen on the following graph (black line is C++, red line is C):

c_lat.png
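
To put "slightly higher" into numbers, the relative overhead of C over C++ can be computed from any row of the latency table. The function below is a small illustrative helper (overhead_percent is a name chosen here, not part of any ØMQ API).

```c
#include <assert.h>

/* Relative latency overhead of the C extension over C++, in percent. */
static double overhead_percent (double cpp_us, double c_us)
{
    assert (cpp_us > 0.0);
    return (c_us - cpp_us) / cpp_us * 100.0;
}
```

For the 1 B row, overhead_percent (32.7, 35.43) gives roughly 8.3%; for the 4096 B row, overhead_percent (85.63, 88.64) gives roughly 3.5%.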

Throughput

The C extension is somewhat less efficient than raw C++ code, possibly due to the omission of the "VSM" optimisation from the C extension. Until the network limit (1Gb/sec) is reached, the C throughput is roughly half to two-thirds of the C++ throughput (about 67% at 1 byte and 48% at 16 bytes). However, once the messages are large enough to exhaust the network (~256 bytes), the throughputs of C++ and C are practically identical.

Message size    C++                   C
1 B             2,435,820 msgs/sec    1,624,139 msgs/sec
16 B            2,976,623 msgs/sec    1,428,036 msgs/sec
256 B           447,126 msgs/sec      447,534 msgs/sec
4096 B          28,896 msgs/sec       28,893 msgs/sec
65536 B         1,810 msgs/sec        1,810 msgs/sec

c_thr.png
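
The claim that the network is exhausted at around 256 bytes can be checked by converting message rates into payload bandwidth. The helpers below are an illustrative sketch (payload_mbps and c_to_cpp_ratio are names chosen here). They count payload bytes only and take 1 Mb as 10^6 bits, so protocol and framing overhead on the wire is not included.

```c
#include <assert.h>

/* Payload bandwidth in megabits per second (1 Mb = 10^6 bits). */
static double payload_mbps (double msgs_per_sec, double msg_size_bytes)
{
    assert (msgs_per_sec >= 0.0 && msg_size_bytes >= 0.0);
    return msgs_per_sec * msg_size_bytes * 8.0 / 1e6;
}

/* Throughput of C expressed as a fraction of the C++ throughput. */
static double c_to_cpp_ratio (double cpp_msgs, double c_msgs)
{
    assert (cpp_msgs > 0.0);
    return c_msgs / cpp_msgs;
}
```

For the 256 B C++ row, payload_mbps (447126, 256) comes to roughly 916 Mb/s of payload, which is close to what a 1Gb link can carry. For the 16 B C++ row the same calculation gives roughly 381 Mb/s, so at that size the limit is message processing rather than the network.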

Conclusion

Although the C extension doesn't provide full ØMQ functionality at the moment, the performance figures are quite convincing. Latency is only a few microseconds above the C++ latency (~35 us) and throughput, although lower than in C++, would still allow for decent handling of the OPRA feed.
