Package SSF.OS.NetFlow

Classes supporting the monitoring and filtering of IP packets processed by the IP protocol (SSF.OS.IP), for the collection of IP flow data from the routers in the spirit of Cisco's FlowCollector.


Class Summary
BytesUtil Provides functions to convert short, int, string and float values to and from byte[].
IpFlowCollector Monitor that collects flows of record type "SSF.OS.NetFlow". Revised from IpFlowCollector, gives "domain information" in flow records.
IpFlowCollectorWD IpFlowCollector With Domain-support
IpFlowTable A shrinkable-hash table of IpNetFlows.
IpNetFlow  
IpNetFlowWD IpNetFlow With Domain information
NetFlow This is an abstract class that includes only a few implemented functions that are helpful to most kinds of NetFlows.
ShrinkableHashMap User defined HashMap.
 

Package SSF.OS.NetFlow Description

Classes supporting the monitoring and filtering of IP packets processed by the IP protocol (SSF.OS.IP), for the collection of IP flow data from the routers in the spirit of Cisco's FlowCollector. In 1.2.8, NetFlow/Filter/FlowFilter.java was removed because it caused problems with the JDK 1.4.0 beta version.

SSF Implementation of NetFlow - Ver1.2.8


Contents

Updates
Background of NetFlow
NetFlow implementation in SSF
NetFlow data format
Configuring monitors in DML
Examples and tests
Download
References

Updates

For version 1.2.5:

Background of NetFlow

"NetFlow" here refers to the continuous traffic between two entities on the Internet. More specifically, continuity means that the time interval between any two consecutive packets is smaller than some threshold, and "entities" may be hosts, routers, or even subnets. The data that characterizes this traffic is called NetFlow data. Collecting and analyzing NetFlow data provides important information for network planning, monitoring, and many other tasks.
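
The threshold-based definition of a flow can be illustrated with a short Java sketch. It is not taken from the SSF sources: the class FlowSplitter, its split method, and the parameter idleThreshold are hypothetical names, and the input is assumed to be the sorted arrival times of packets between one pair of entities.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of SSF): splits packet arrival times into flows. */
final class FlowSplitter {
    /**
     * @param arrivalTimes  packet arrival times in seconds, sorted ascending
     * @param idleThreshold gap (seconds) at or above which a new flow starts
     * @return one list of arrival times per flow
     */
    static List<List<Double>> split(double[] arrivalTimes, double idleThreshold) {
        List<List<Double>> flows = new ArrayList<>();
        List<Double> current = null;
        double previous = 0.0;
        for (double t : arrivalTimes) {
            // Start a new flow on the first packet, or when the gap to the
            // previous packet is no longer smaller than the threshold.
            if (current == null || t - previous >= idleThreshold) {
                current = new ArrayList<>();
                flows.add(current);
            }
            current.add(t);
            previous = t;
        }
        return flows;
    }

    public static void main(String[] args) {
        double[] times = {0.0, 0.5, 1.0, 10.0, 10.2};
        System.out.println(split(times, 5.0).size() + " flows"); // prints "2 flows"
    }
}
```

With a threshold of 5 seconds, the 9-second gap between the third and fourth packets closes the first flow and opens a second one.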

NetFlow implementation in SSF

In the blueprint of NetFlow work in SSF, there are two main parts: collecting NetFlow data from the simulated network, and analyzing that data. This document is mainly about data collection.

The main idea is to set up multiple monitors on the simulated network to collect NetFlow data. The monitors are specified with DML (more detail in the "Configuring monitors in DML" section). There are "pipes" into which these monitors dump NetFlow data; monitors in the same timeline share the same pipe. The other end of a pipe can be a file or a remote machine, where the NetFlow data is stored.

Take a closer look at how the NetFlow data is collected on a "host". In the protocol graph of a host, there is a new protocol session called "ProbeSession". It provides the interface through which the "monitor" dumps data. The NetFlow data is collected in an event-driven manner. The protocol session being monitored holds a handle to the monitor and calls the monitor when events of interest happen. The monitor extracts the information it needs and stores it in a "conOpenTable" (Connection-Open-Table). Periodically, the monitor moves the data of finished flows to another table, from where it is dumped to the pipe provided by "ProbeSession".
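
The two-table bookkeeping described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the SSF implementation: FlowTablesSketch, Flow, onPacket, sweep, drainFinished and idleTimeout are hypothetical names, a flow is assumed to be "finished" once it has been idle for idleTimeout, and returning a list stands in for dumping to the ProbeSession pipe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in for a monitor's two tables; not the SSF classes. */
final class FlowTablesSketch {
    /** Minimal per-flow state: time of the last packet and a packet count. */
    static final class Flow {
        final String key;
        double last;
        long packets;
        Flow(String key) { this.key = key; }
    }

    private final Map<String, Flow> conOpenTable = new HashMap<>();
    private final List<Flow> finished = new ArrayList<>();
    private final double idleTimeout; // assumed criterion for a finished flow

    FlowTablesSketch(double idleTimeout) { this.idleTimeout = idleTimeout; }

    /** Event callback: invoked by the monitored session for each packet. */
    void onPacket(String flowKey, double now) {
        Flow f = conOpenTable.computeIfAbsent(flowKey, Flow::new);
        f.last = now;
        f.packets++;
    }

    /** Periodic sweep: moves flows idle for at least idleTimeout out of the open table. */
    void sweep(double now) {
        Iterator<Flow> it = conOpenTable.values().iterator();
        while (it.hasNext()) {
            Flow f = it.next();
            if (now - f.last >= idleTimeout) {
                it.remove();
                finished.add(f);
            }
        }
    }

    /** Returns and clears the finished flows (stands in for the dump to the pipe). */
    List<Flow> drainFinished() {
        List<Flow> out = new ArrayList<>(finished);
        finished.clear();
        return out;
    }

    int openCount() { return conOpenTable.size(); }
}
```

Keeping the open table small by sweeping finished flows out of it is what keeps the per-packet lookup cheap. One simplification to note: a packet that arrives after a long gap but before the next sweep extends the old flow in this sketch.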

This is in fact a general method for network monitoring. There is no restriction on what the monitor extracts or what it does with the data. By writing a customized monitor component, the user can collect whatever data is of interest.

To write a customized monitor, the user must understand how monitor components are plugged into the protocol graph of a host/router. The explanation below uses only IP and ProtocolMonitor as examples. All the monitors for the same protocol have the same interface. For IP, the interface is as follows:

public interface ProtocolMonitor
{
    /** Called to let the monitor inspect (filter) a packet passing between two sessions. */
    void receive(ProtocolMessage packet, ProtocolSession fromSession, ProtocolSession toSession);

    /** Configures this monitor from its DML configuration. */
    void config(ProtocolMonitor owner, Configuration cfg) throws configException;

    /** Other initialization work, done in the "init" phase. */
    void init();
}

IP is slightly revised to support its monitor option, and the DML configuration of the revised IP is shown below in the "Configuring monitors in DML" section. The config() function of the revised IP calls a new member function, createMonitor(), when it finds that the Monitor option is activated in the DML, and saves a handle to the ProtocolMonitor that is created. When the revised IP finishes its routine initialization, it also calls the init() of the ProtocolMonitor to complete the monitor's initialization. The other change to IP is in the push function, which wakes up the ProtocolMonitor when appropriate events happen. In the current implementation, the receive function of the monitor is not invoked when a packet exhausts its TTL and is dropped. (This may change over time.)
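
The life cycle described above (config, createMonitor, init, push) can be sketched as follows. To keep the example compilable on its own, it uses simplified stand-in types rather than the real SSF classes: Monitor, CountingMonitor, IpSketch, and the reduction of a packet to its TTL value are all assumptions made for illustration, and the TTL test is only a rough placeholder for IP's actual drop logic.

```java
/** Stand-in types so the sketch compiles alone; the real ones live in SSF. */
final class MonitorLifecycleSketch {
    /** Simplified stand-in for the monitor interface (a packet is reduced to its TTL). */
    interface Monitor {
        void init();
        void receive(int ttl);
    }

    /** Example monitor that only counts the packets it is shown. */
    static final class CountingMonitor implements Monitor {
        boolean initialized;
        int seen;
        public void init() { initialized = true; }
        public void receive(int ttl) { seen++; }
    }

    /** Stand-in for the revised IP session. */
    static final class IpSketch {
        private Monitor monitor; // handle saved by config()

        /** Mirrors config(): create the monitor only if the option is enabled. */
        void config(boolean monitorEnabled) {
            if (monitorEnabled) monitor = createMonitor();
        }

        Monitor createMonitor() { return new CountingMonitor(); }

        /** Mirrors init(): after IP's own setup, initialize the monitor. */
        void init() {
            if (monitor != null) monitor.init();
        }

        /** Mirrors push(): packets dropped for TTL exhaustion are not reported. */
        void push(int ttl) {
            if (ttl <= 0) return; // dropped; the monitor is not called
            if (monitor != null) monitor.receive(ttl);
        }

        Monitor monitor() { return monitor; }
    }
}
```

A customized monitor would take the place of CountingMonitor here; the monitored session itself does not need to know what the monitor does with each packet.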

To minimize the impact of introducing monitors into the simulation, it is critical that inserting, deleting, and retrieving flow data in the flow table be fast. ShrinkableHashMap is implemented for this purpose. It has the following characteristics:

NetFlow data format

In the real world, Cisco routers also provide a service for collecting and analyzing NetFlow data. The user specifies a machine to store the data, and the routers collecting the NetFlow data send it to that machine over UDP. There are several versions of the Cisco NetFlow data format [1]; below is a comparison between the version 7 Cisco NetFlow data format and the data format in the SSF implementation.

Common fields

src_addr Source IP address of the NetFlow (If there is src_mask, then it's the address after applying the mask.)
dst_addr Destination IP address of the NetFlow (If there is dst_mask, then it's the address after applying the mask.)
input The NIC on which the traffic comes in. In Cisco NetFlow ver7.0, it's always set to 0.
output The NIC on which the traffic goes out.
nextHop The next hop IP. In Cisco NetFlow ver7.0, it's always set to 0.
First The system time of receiving the first packet of this flow.
Last The system time of receiving the last packet of this flow.
dPkts Number of packets in this flow so far.
dOctets Number of bytes of data (above layer 3) in this flow so far.
tcp_flag The cumulative OR of tcp flags, if any.
protocol The protocol type of the traffic that this netflow is about.
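
How a record accumulates some of the common fields above can be sketched in Java. The class FlowRecordSketch and its update method are hypothetical, chosen only to mirror the field names in the list; they are not the IpNetFlow class from this package. The flag values used in the usage note are the standard TCP header bits (FIN 0x01, SYN 0x02, ACK 0x10).

```java
/** Hypothetical record holding a few of the common fields; not SSF's IpNetFlow. */
final class FlowRecordSketch {
    double first = -1; // time of the first packet; -1 until one is seen
    double last;       // time of the most recent packet
    long dPkts;        // packets seen so far
    long dOctets;      // payload bytes seen so far
    int tcpFlags;      // cumulative OR of the TCP flags seen so far

    /** Folds one packet into the record; packetTcpFlags is 0 for non-TCP traffic. */
    void update(double now, int payloadBytes, int packetTcpFlags) {
        if (dPkts == 0) first = now;
        last = now;
        dPkts++;
        dOctets += payloadBytes;
        tcpFlags |= packetTcpFlags;
    }
}
```

For example, feeding the record a SYN packet (0x02), an ACK packet (0x10) and a FIN+ACK packet (0x11) leaves tcpFlags at 0x13, so the record shows that the flow contained SYN, ACK and FIN at some point without storing each packet.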

Different fields

In Cisco ver7.0

srcport TCP/UDP source port number, set to 0 if the flow mask is destination-only or source-destination.
dstport TCP/UDP destination port number, set to 0 if the flow mask is destination-only or source-destination.
flags Flags indicating, among other things, what fields are invalid
tos IP type of service; set to the ToS of the first packet of this flow.
src_as Source autonomous system number, either origin or peer, always set to 0.
dst_as Destination autonomous system number, either origin or peer, always set to 0.
src_mask Source address prefix mask, always set to be zero.
dst_mask Destination address prefix mask, always set to be zero. 
router_sc IP address of the router that is bypassed by the Catalyst 5000 series switch. This is the same address the router uses when it sends NetFlow export packets. 

In SSF implementation

nhi The NHI address of the router that collects the flow data. Functionally it's the same as "router_sc" in Cisco ver7.0.
src_mask Source address suffix mask. It's used so that some pre-aggregation can be done in the SSF simulation.
dst_mask Destination address suffix mask. Similar to src_mask, it's used to help pre-aggregation.
inputType Type of the input port: flow came in from an external host or an internal one. (for IpFlowCollectorWD)
outputType Type of the output port: flow forwarded to an external host or an internal one. (for IpFlowCollectorWD)

Configuring monitors in DML

This section is a tutorial on how to specify monitors with DML in the SSF network configuration file. Each protocol (or NIC) can have its own monitor specified with DML. For the purposes of explanation, IP and ProtocolMonitor are used as examples below.

An attribute "Monitor" is added to the IP ProtocolSession. The following is an example of configuring IP with DML.

 ProtocolSession [
        name ip use SSF.OS.IP                    # REVISED IP, in release 1.2
        monitor [
              use SSF.OS.Filters.IpFlowCollector
              debug true
              ... ...                            # other attributes of the monitor
              ... ...
        ]
        ... ...                                  # other attributes of IP
        ... ...
 ]

Besides that, a configuration for the "probe" ProtocolSession is also needed.

 ProtocolSession [
        name probe use SSF.OS.ProbeSession       # in release 1.2
        file "flow_data/flows.dat"               # specify the filename prefix of dumped data
        stream ipnetflow                         # stream name
 ]

One more point: once the stream name is specified, the ProbeSession takes it and appends the "alignment" string to it. In the NetFlow data file, the name of the stream will therefore be "ipnetflow.0" instead of the "ipnetflow" specified with "stream" (assuming that the alignment is "0").

Examples and tests

More examples and test documents will be added soon.

Download

The source file of SSF.OS.NetFlow (including SSF.OS.NetFlow.Filter). (The name of the package is subject to change.) It should be extracted under SSF.OS, where it will generate the package SSF.OS.NetFlow.

The test subdirectory has several tests and examples. Change into the tests directory and type "make" to see all the targets.

References

[1] Cisco NetFlow Export Datagram Format