Caution: This document refers to an old version of ØMQ. From version 0.3.1 onwards the Java extension is an integral part of ØMQ, so it no longer has to be downloaded and built separately. Also note that the Java API has changed since then.
Introduction
This whitepaper describes the first version of the Java extension for ØMQ. It is a simplified version of the ØMQ interface, exposed in the form of a Java object. The Java extension is not yet part of the ØMQ package; you have to download it separately (see below) and build it by hand. Any feedback on the Java extension is welcome on the ØMQ developers' mailing list.
Download
Download Java extension for ØMQ here.
Building it
Download and build the ØMQ package:
$ tar -xzf zmq-0.3.tar.gz
$ cd zmq-0.3
$ ./configure
$ make
$ sudo make install
Download and unpack the Java extension for ØMQ:
$ tar -xzf Jzmq.tar.gz
$ cd Jzmq
Compile the Jzmq class:
$ javac Jzmq.java
Generate JNI headers for the class:
$ javah Jzmq
Compile the extension:
$ g++ -c -fPIC Jzmq.cpp
$ g++ -shared -pthread -o libJzmq.so Jzmq.o libzmq.so
Copy the shared library onto the library path:
$ cp libJzmq.so /usr/lib
Compile the test programs:
$ javac LocalLat.java
$ javac RemoteLat.java
$ javac LocalThr.java
$ javac RemoteThr.java
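The step that copies libJzmq.so onto the library path matters because the JVM resolves native libraries by a logical name and a search path. The following is a minimal sketch of that lookup, assuming (the source does not confirm it) that the Jzmq class loads its native part with System.loadLibrary ("Jzmq"). The class name LibraryLookup is hypothetical.

```java
// Sketch: shows which file name and which directories the JVM would use
// when a class calls System.loadLibrary("Jzmq"). Whether Jzmq actually
// loads its library this way is an assumption.
final class LibraryLookup {
    // Maps a logical library name to the platform-specific file name,
    // for example a "lib" prefix and ".so" suffix on Linux.
    static String fileNameFor(String logicalName) {
        return System.mapLibraryName(logicalName);
    }

    public static void main(String[] args) {
        System.out.println("File name searched for: " + fileNameFor("Jzmq"));
        // Directories searched by System.loadLibrary; on Linux this
        // typically includes /usr/lib.
        System.out.println("java.library.path: "
                + System.getProperty("java.library.path"));
    }
}
```

If the library lives somewhere other than a default directory, the search path can be set when starting the JVM with the standard -Djava.library.path option.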
Using it
The Java extension's API is currently much simpler than the original C++ API. The main difference is that the Java extension doesn't allow full control of ØMQ threading as C++ does. Instead, it creates a single I/O thread that can be accessed from a single application thread, which prevents seamless scaling on multicore boxes. However, we intend to expose the full ØMQ API via Java in the future.
To instantiate ØMQ:
Jzmq obj = new Jzmq (hostname);
Where hostname is the name or IP address of the box where zmq_server is running.
To create the wiring, use the createExchange, createQueue and bind functions. For a detailed description of how the wiring mechanism works, have a look here.
int eid = obj.createExchange ("E", Jzmq.SCOPE_GLOBAL, "10.0.0.1:5555");
obj.createQueue ("Q", Jzmq.SCOPE_GLOBAL, "10.0.0.1:5556");
obj.bind ("E", "Q");
Sending a message is straightforward. The message is supplied as a byte array:
byte msg [] = {1, 2, 3, 4, 5, 6};
obj.send (eid, msg);
Receiving a message is even simpler:
byte [] msg = obj.receive ();
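The calls above can be put together in the order they are meant to be used: create the exchange, create the queue, bind them, then send and receive. The sketch below is only an illustration of that order. FakeJzmq is a hypothetical in-memory stand-in that mirrors the method names shown above so the example runs without the native library; it is not the real extension, it ignores the scope and address arguments, and its receive returns null instead of waiting when nothing is queued. The value of SCOPE_GLOBAL is a placeholder.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Walks through the wiring steps in order, using the stand-in below.
final class WiringSketch {
    static byte[] roundTrip(byte[] payload) {
        FakeJzmq obj = new FakeJzmq();
        int eid = obj.createExchange("E", FakeJzmq.SCOPE_GLOBAL, "10.0.0.1:5555");
        obj.createQueue("Q", FakeJzmq.SCOPE_GLOBAL, "10.0.0.1:5556");
        obj.bind("E", "Q");
        obj.send(eid, payload);
        return obj.receive();
    }

    public static void main(String[] args) {
        byte[] msg = {1, 2, 3, 4, 5, 6};
        System.out.println(Arrays.toString(roundTrip(msg)));
    }
}

// Hypothetical in-memory stand-in for Jzmq. NOT the real extension:
// no networking, no zmq_server, scope and address are ignored.
final class FakeJzmq {
    static final int SCOPE_GLOBAL = 1; // placeholder value (assumed)

    private final Map<String, Integer> exchangeIds = new HashMap<>();
    private final Map<String, Queue<byte[]>> queues = new HashMap<>();
    private final Map<Integer, Queue<byte[]>> bound = new HashMap<>();

    // Registers an exchange and returns the id later passed to send.
    int createExchange(String name, int scope, String address) {
        int id = exchangeIds.size();
        exchangeIds.put(name, id);
        return id;
    }

    void createQueue(String name, int scope, String address) {
        queues.put(name, new ArrayDeque<>());
    }

    // Routes messages sent to the named exchange into the named queue.
    void bind(String exchangeName, String queueName) {
        Integer id = exchangeIds.get(exchangeName);
        Queue<byte[]> queue = queues.get(queueName);
        if (id == null || queue == null) {
            throw new IllegalArgumentException("unknown exchange or queue");
        }
        bound.put(id, queue);
    }

    void send(int exchangeId, byte[] msg) {
        Queue<byte[]> queue = bound.get(exchangeId);
        if (queue == null) {
            throw new IllegalStateException("exchange is not bound to a queue");
        }
        queue.add(msg.clone());
    }

    // Returns the next queued message, or null if none is waiting.
    byte[] receive() {
        for (Queue<byte[]> queue : queues.values()) {
            byte[] msg = queue.poll();
            if (msg != null) {
                return msg;
            }
        }
        return null;
    }
}
```

With the real extension, the same sequence would be written against a Jzmq object created with the hostname of the box running zmq_server.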
Test results
Tests were performed on two quad-core boxes (Intel Xeon CPU, E5440, 2.83 GHz) connected via a direct 1Gb Ethernet link (Intel PRO/1000, PCI Express:2.5GB/s:Width x4). The operating system was Debian Linux 4.0 (kernel version 2.6.24.7, CONFIG_PREEMPT_VOLUNTARY=y, CONFIG_PREEMPT_BKL=y, CONFIG_HZ=1000).
Latency
End-to-end latency, as measured by LocalLat and RemoteLat, is quite good. For small messages Java is just a couple of microseconds slower than the raw C++ program:
Message size | C++ | Java |
---|---|---|
1 B | 32.7 us | 35.62 us |
16 B | 34.54 us | 37.17 us |
256 B | 42.21 us | 43.37 us |
4096 B | 85.63 us | 102.31 us |
65536 B | 612.99 us | 769.7 us |
Same values charted on the graph (black line is C++, red line is Java):

The main performance bottleneck in the Java extension is that message data has to be physically copied between the Java heap and the JNI heap; the copying happens on both the send and the receive side. As far as we are aware there is no way to avoid it. In any case, the bottleneck becomes significant only for large messages (i.e. messages over 512 bytes long). For smaller messages you don't have to worry: the copying overhead is almost unmeasurable.
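The two copies described above can be illustrated in plain Java. This is a minimal sketch, not the extension's actual code: a direct ByteBuffer is used as a stand-in for memory outside the Java heap, and the names CopySketch, toNative and toHeap are hypothetical. It shows that each message is copied once on the way out and once on the way in, so the number of bytes copied grows linearly with message size.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch of the send-side and receive-side copies, using a direct
// ByteBuffer as a stand-in for native (off-heap) memory.
final class CopySketch {
    // Send side: copy the payload from a Java heap array into off-heap memory.
    static ByteBuffer toNative(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocateDirect(payload.length);
        buf.put(payload);
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }

    // Receive side: copy the bytes from off-heap memory into a new heap array.
    static byte[] toHeap(ByteBuffer nativeBuf) {
        byte[] out = new byte[nativeBuf.remaining()];
        nativeBuf.duplicate().get(out); // duplicate keeps the caller's position intact
        return out;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 16, 256, 4096, 65536}) {
            byte[] payload = new byte[size];
            Arrays.fill(payload, (byte) 7);
            byte[] back = toHeap(toNative(payload));
            System.out.println(size + " B message: " + (2L * size)
                    + " bytes copied in total, intact=" + Arrays.equals(payload, back));
        }
    }
}
```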
Throughput
As expected, Java is somewhat less efficient than raw C++. Until the network limit (1Gb/sec) is reached, Java throughput is roughly 40% to 60% of the C++ throughput. However, once the messages are large enough to exhaust the network (~256 bytes), the throughputs of C++ and Java are practically the same.
Message size | C++ | Java |
---|---|---|
1 B | 2,435,820 msgs/sec | 1,453,288 msgs/sec |
16 B | 2,976,623 msgs/sec | 1,262,061 msgs/sec |
256 B | 447,126 msgs/sec | 447,494 msgs/sec |
4096 B | 28,896 msgs/sec | 28,907 msgs/sec |
65536 B | 1,810 msgs/sec | 1,810 msgs/sec |
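
The point at which the network becomes the limit can be checked with simple arithmetic: payload bandwidth is message size times 8 bits times messages per second. The sketch below applies that formula to the figures from the table. It counts payload bytes only, so framing and protocol headers (which add to the real load on the wire) are not included; the class and method names are hypothetical.

```java
// Sketch: converts the measured message rates into payload bandwidth
// to show where the 1Gb/sec link becomes the bottleneck.
final class WireRate {
    // Payload bandwidth in megabits per second (1 Mb = 1,000,000 bits).
    static double payloadMbps(int messageBytes, long messagesPerSecond) {
        return messageBytes * 8.0 * messagesPerSecond / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Figures copied from the throughput table above.
        int[] sizes = {1, 16, 256, 4096, 65536};
        long[] cpp = {2_435_820L, 2_976_623L, 447_126L, 28_896L, 1_810L};
        long[] java = {1_453_288L, 1_262_061L, 447_494L, 28_907L, 1_810L};

        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("%6d B  C++ %7.1f Mb/s  Java %7.1f Mb/s  Java/C++ %.2f%n",
                    sizes[i],
                    payloadMbps(sizes[i], cpp[i]),
                    payloadMbps(sizes[i], java[i]),
                    (double) java[i] / cpp[i]);
        }
    }
}
```

For 256 B messages and larger, the payload alone comes to more than 900 Mb/s for both C++ and Java, which is why the two columns converge there.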

Conclusion
Although the Java extension doesn't provide full ØMQ functionality at the moment, the performance figures are quite convincing. Small-message latency (~36 us) is only a few microseconds above the C++ latency (~33 us), and throughput, although lower than in C++ for small messages, would still allow for decent handling of the OPRA feed.