Traffic Monitoring

Introduction

This article describes how to monitor network traffic based on business criteria. It first describes the underlying principle, then introduces some example business logic, and finally monitors the whole application using the Wireshark network protocol analyser.

Background

The basic idea of fine-grained, business-oriented network traffic monitoring is to pair specific business logic with a specific network address. In this example we'll pair individual business feeds with specific TCP ports.

Let's say we want to distribute messages from a stock exchange to the traders. We'll consider only three distinct message feeds. Stock quotes are messages that inform the trader about current stock prices. Order confirmations let the trader know that their orders (to buy or sell stock) were accepted by the exchange. Trades are notifications that orders were executed (i.e. the stock was either bought or sold).

Each of these message feeds will be associated with a specific TCP port number. In our example code stock quotes are transferred on port 33333, order confirmations on port 33334 and trades on port 33335.

The port number, being part of the TCP packet header, is easy to monitor even at the lowest layers of the stack (either hardware or software). A monitoring program (or hardware device) simply checks two bytes at a fixed offset in each TCP header (the "source port" field) and updates the statistics accordingly.
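
The two-byte check described above can be sketched roughly as follows. This is only an illustration, not code from any real monitoring tool: tcp_source_port and port_stats_t are hypothetical names, and the sketch assumes the caller has already located the start of the TCP header within the captured packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

//  Hypothetical helper: the "source port" field occupies the first two
//  bytes of the TCP header, stored in network (big-endian) byte order.
inline std::uint16_t tcp_source_port (const unsigned char *tcp_header,
    std::size_t size)
{
    assert (size >= 2);
    return static_cast<std::uint16_t> ((tcp_header [0] << 8) | tcp_header [1]);
}

//  Hypothetical statistics: number of bytes seen per source port,
//  i.e. per business feed.
struct port_stats_t
{
    std::map<std::uint16_t, std::uint64_t> bytes;

    //  Account one TCP segment (header plus payload) to its source port.
    void update (const unsigned char *tcp_header, std::size_t segment_size)
    {
        bytes [tcp_source_port (tcp_header, segment_size)] += segment_size;
    }
};
```

With the port numbers used in this article, a segment whose header starts with the bytes 0x82 0x35 would be counted under port 33333, the stock quote feed.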

While this approach seems simple and self-evident, most business messaging solutions are incapable of it. For a discussion of the traditional vs. the ØMQ style of handling concurrent message feeds, have a look at the conclusion of this article.

Business logic

In this example we will use two simple test programs that simulate the communication between a stock exchange and a trader. The monsend program plays the role of the stock exchange and publishes three distinct message feeds. monrecv acts as a trader, receiving all the feeds. Several instances of monrecv (stock traders) can run in parallel:

monitor3.png

What follows is the monsend code. All messages are 10 bytes long. We don't bother filling in the message body; this is just an example, so the exact content is irrelevant. Messages are sent at random, but the program is tuned to send ~75% stock quotes, ~20% order confirmations and ~5% trades. Also note the 1 ms sleep after each message. The intent is to get decent data flows to monitor rather than a completely congested environment.

#include <unistd.h>
#include <stdlib.h>
#include <zmq.hpp>
 
int main ()
{
    //  Initialise 0MQ infrastructure.
    zmq::context_t ctx (1);
    zmq::socket_t quote_socket (ctx, ZMQ_PUB);
    quote_socket.bind ("tcp://lo:33333");
    zmq::socket_t confirmation_socket (ctx, ZMQ_PUB);
    confirmation_socket.bind ("tcp://lo:33334");
    zmq::socket_t trade_socket (ctx, ZMQ_PUB);
    trade_socket.bind ("tcp://lo:33335");
 
    while (true) {
 
        //  Send messages to different feeds based on 75%/20%/5% ratio.
        int r = rand () % 100;
        zmq::message_t msg (10);
        if (r < 75)
            quote_socket.send (msg);
        else if (r < 95)
            confirmation_socket.send (msg);
        else if (r < 100)
            trade_socket.send (msg);
 
        //  Wait 1ms not to get into congestion.
        usleep (1000);
    }
 
    return 0;
}
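
The branching in the loop above maps the 100 possible values of rand () % 100 onto the three feeds. A minimal sketch that isolates this mapping is shown below; feed_t and pick_feed are hypothetical names introduced only for illustration and are not part of monsend.

```cpp
//  Hypothetical names for the three feeds.
enum feed_t
{
    feed_quotes,
    feed_confirmations,
    feed_trades
};

//  Map a value in the range 0..99 to a feed using the same thresholds as
//  monsend: 0-74 are quotes, 75-94 are confirmations, 95-99 are trades.
inline feed_t pick_feed (int r)
{
    if (r < 75)
        return feed_quotes;
    if (r < 95)
        return feed_confirmations;
    return feed_trades;
}
```

Running pick_feed over every value from 0 to 99 selects quotes 75 times, confirmations 20 times and trades 5 times, which is where the 75%/20%/5% ratio comes from (assuming rand () is close to uniform).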

The monrecv program is even simpler. It connects to all three message feeds and subscribes to every message. It then retrieves the messages and drops them as soon as they arrive:

#include <stdlib.h>
#include <zmq.hpp>
 
int main ()
{
    //  Initialise 0MQ infrastructure.
    zmq::context_t ctx (1);
    zmq::socket_t s (ctx, ZMQ_SUB);
    s.connect ("tcp://localhost:33333");
    s.connect ("tcp://localhost:33334");
    s.connect ("tcp://localhost:33335");
    s.setsockopt (ZMQ_SUBSCRIBE, "", 0);
 
    //  Receive messages. No processing is done.
    while (true) {
        zmq::message_t msg;
        s.recv (&msg);
    }
 
    return 0;
}

Monitoring

First, run the monsend application. Then run the monrecv application. The code above uses the loopback interface, so both applications have to run on the same box. (Modifying the bind/connect strings to make the example run on separate boxes is trivial, though.) At this point the application is running, passing messages from monsend to monrecv, and we can start monitoring the traffic.

Run the Wireshark monitoring tool and start capturing packets on the loopback interface ("lo").

Open the statistics window (Statistics | IO Graphs). Fill in appropriate filters to show the three feeds we are interested in: stock quotes on port 33333, order confirmations on port 33334 and trades on port 33335:

monitor1.png

The red line represents stock quotes, the green line order confirmations and the blue line trades. The statistics are charted in bytes per 0.1 second on the y-axis (see the drop-down box in the bottom right corner of the window).
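
To make the unit on the y-axis concrete, here is a rough sketch of the kind of aggregation such a chart represents: bytes are summed per port and per 0.1 second interval. It is not how Wireshark is implemented; io_graph_t and its members are hypothetical names, and timestamps are assumed to be in microseconds.

```cpp
#include <cstdint>
#include <map>
#include <utility>

//  Hypothetical aggregation: bytes per (port, 0.1 second interval).
struct io_graph_t
{
    //  Length of one interval: 0.1 second expressed in microseconds.
    static const std::uint64_t interval_us = 100000;

    std::map<std::pair<std::uint16_t, std::uint64_t>, std::uint64_t> bins;

    //  Account a packet of the given size seen at the given timestamp.
    void add (std::uint16_t port, std::uint64_t timestamp_us,
        std::uint64_t size)
    {
        bins [std::make_pair (port, timestamp_us / interval_us)] += size;
    }

    //  Bytes recorded for the port in the given interval (0 if none).
    std::uint64_t bytes (std::uint16_t port, std::uint64_t interval) const
    {
        auto it = bins.find (std::make_pair (port, interval));
        return it == bins.end () ? 0 : it->second;
    }
};
```

Each line in the chart then corresponds to the sequence of bin values for one port.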

Now, let's start two more instances of monrecv.

The I/O graph produced by Wireshark clearly shows that the second instance was started at 10:17:53.8, while the third instance began running at 10:17:57.8:

monitor2.png

The monitoring results can be used simply to learn the actual bandwidth requirements of the application. However, you can also use them to analyse and improve the overall design of the application's data flow. For example, we may be concerned about the bandwidth consumed by stock quotes (red line). The results indicate that each running instance of monrecv requires approximately 50 kB per 0.1 second of bandwidth just to handle stock quotes. This may lead us to consider using the PGM reliable multicast protocol instead of TCP for stock quotes: multicast would use a constant amount of network bandwidth no matter how many instances of monrecv are running.

If you are concerned about the latency impact of monitoring with Wireshark, the "business feed as network address" principle above makes it easy to use hardware-based monitoring solutions, such as those offered by Endace and other vendors.
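
The comparison between TCP and multicast can be put into numbers with a back-of-the-envelope sketch. The figures below rest on assumptions: the chart reading of roughly 50 kB per 0.1 second per receiver, 1 kB taken as 1000 bytes, and a multicast stream costing about the same as a single TCP stream. The names are hypothetical.

```cpp
#include <cstdint>

//  Approximate stock quote rate per receiver, read off the chart:
//  50 kB per 0.1 second, i.e. 500,000 bytes per second (assumed figures).
const std::uint64_t quote_bytes_per_sec = 50000 * 10;

//  With TCP every receiver gets its own copy of the stream, so the
//  sender's bandwidth grows linearly with the number of receivers.
inline std::uint64_t tcp_bandwidth (unsigned receivers)
{
    return quote_bytes_per_sec * receivers;
}

//  With multicast the sender emits a single stream, so the bandwidth is
//  assumed not to depend on the number of receivers.
inline std::uint64_t multicast_bandwidth (unsigned)
{
    return quote_bytes_per_sec;
}
```

Under these assumptions, three instances of monrecv cost about 1.5 MB/s over TCP but still about 0.5 MB/s over multicast.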

QoS and traffic shaping

The same principle applies to QoS and traffic shaping. Pairing business logic with ports gives you the opportunity to shape traffic based on business criteria. The big advantage is that it's done at the network level. Implementing QoS at the middleware level (as most messaging systems do) proves inadequate once physical network issues are taken into account.

In our case we can limit the amount of bandwidth assigned to stock quotes to 5120 kb/sec by configuring a network router (Cisco IOS):

access-list 100 permit tcp any any eq 33333
class-map match-any port33333
  match access-group 100
policy-map port33333
  class port33333
   bandwidth 5120
interface Ethernet0
 service-policy output port33333

This way stock quotes won't be able to overflow the network. Even during a trading peak, the router itself will apply the QoS and prevent the rest of your system from experiencing network outages.

Conclusion

The ability to pair business logic with a network address gives you a powerful tool to monitor and analyse your business traffic.

Unfortunately, most traditional business messaging solutions don't offer this feature. The problem is that in the traditional design all applications speak to the message broker over a single port. For example, AMQP-based systems use the predefined port number 5672 for all traffic. While implementations may be tweaked to use different ports, the tweaking isn't easy. It requires the broker to listen on several ports and manage the listening ports dynamically, it requires clients to open several connections to the broker, and so on. With ØMQ all this is an inherent part of the design and comes as a bonus.

Wireshark is available on most OS platforms. If it is not available on yours, you should be able to get an equivalent (though maybe less powerful) tool elsewhere.
