Package SSF.OS.NetFlow.Filter

The Filter Package: SSF.OS.NetFlow.Filter


Interface Summary
BasicFilter the interface of a basic filter
Range The range interface. A range is configured either with equals, for an exact match (ExactRange), or with min and max, for a numerical range.

Class Summary
Decoder Decoder base class for all kinds of data records connected using monitors.
ExactRange A range that requires an exact match.
Filter A filter composed of one or more terms.
FilterData A simple wrapper data structure.
FilterPlayer Reads records from the NetFlow data file and passes all of them through a filter.
NumericRange Given a number, checks whether it lies within a numerical range.
RERange A range specified by a regular expression.
SetRange A range that uses a set as its reference.
Term Like Filter, but composed of one or more AND phrases (factors).

Package SSF.OS.NetFlow.Filter Description



SSF.OS.NetFlow.Filter Implementation
Config Filters in DML
Examples and tests


The SSF.OS.NetFlow.Filter package is still under development, and further changes and updates are expected. The idea is straightforward: once NetFlow data has been collected and saved, the natural next step is to make use of it. To do that efficiently, users need a way to extract only the NetFlow records that are relevant to a specific problem. The Filter package is meant to let users specify filters of their own and apply them to the NetFlow data. As in the rest of SSFNet, filters are configured using DML, as shown later below.

A traditional relational database is deliberately not used here. For NetFlow data and the applications that use it, sequential operations make up the overwhelming majority of accesses, and random access is rarely needed. Most relational databases are optimized for random access and provide many other features that are of little help here. Stream databases have been described in the literature, but we have not yet come across a sufficiently mature product.

SSF.OS.NetFlow.Filter Implementation

The following few lines should look familiar to readers who know BNF, grammars, or compilers.
<filter> = <term> OR <filter> | EMPTY
<term> = <factor> AND <term> | <factor>
<factor> = ( <filter> ) | BASECASE | NOT BASECASE

The lines above are not exactly how the Filter is implemented, but they are the concepts behind the implementation. In general, a filter is composed of several terms connected by logical OR; each term is composed of a few factors connected by logical AND; and a factor can apply a NOT operation or contain another filter. With this general structure, the expressive power of a filter depends entirely on the BASECASE, which is explained in more detail below. Together with the inheritance and replacement features provided by DML, users can write very complicated filters of their own.
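The OR/AND/NOT structure described above can be sketched in a few lines of Java. This is only an illustration of the concept, not the package's actual classes: the Rec record type, the Factor interface, the base and not helpers, and the example filter are all hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FilterSketch {
    /** Hypothetical record type, used only in this sketch. */
    static final class Rec {
        final int port;
        final String proto;
        Rec(int port, String proto) { this.port = port; this.proto = proto; }
    }

    /** factor: a base case, a negated base case, or a nested filter. */
    interface Factor { boolean accepts(Rec r); }

    /** Wraps a base-case check as a factor. */
    static Factor base(Predicate<Rec> check) { return check::test; }

    /** Wraps a base-case check as a negated factor (NOT BASECASE). */
    static Factor not(Predicate<Rec> check) { return r -> !check.test(r); }

    /** term: factors joined by logical AND. */
    static final class Term {
        private final List<Factor> factors;
        Term(Factor... factors) { this.factors = Arrays.asList(factors); }
        boolean accepts(Rec r) {
            for (Factor f : factors) {
                if (!f.accepts(r)) return false;
            }
            return true;
        }
    }

    /** filter: terms joined by logical OR; also a Factor, so it can be nested. */
    static final class Filter implements Factor {
        private final List<Term> terms;
        Filter(Term... terms) { this.terms = Arrays.asList(terms); }
        @Override public boolean accepts(Rec r) {
            for (Term t : terms) {
                if (t.accepts(r)) return true;
            }
            return false; // in this sketch, an empty filter accepts nothing
        }
    }

    /** (port >= 1024 AND NOT port == 8080) OR (proto == "udp") */
    static Filter example() {
        return new Filter(
            new Term(base(r -> r.port >= 1024), not(r -> r.port == 8080)),
            new Term(base(r -> r.proto.equals("udp"))));
    }

    public static void main(String[] args) {
        Filter f = example();
        System.out.println(f.accepts(new Rec(5000, "tcp"))); // true
        System.out.println(f.accepts(new Rec(8080, "tcp"))); // false
        System.out.println(f.accepts(new Rec(53, "udp")));   // true
    }
}
```

Because Filter itself implements Factor in this sketch, a filter can be passed wherever a factor is expected, which mirrors the "( <filter> )" alternative in the grammar.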

As for the base case, first consider what the simplest filter does: given a record, it checks whether a value lies within some range and returns a yes/no answer. The capability of the base case lies entirely in how the range can be described and how the check is performed. The current package specifies a general interface for ranges:

public interface Range {
    // Describe the range from its configuration.
    public void config(Configuration cfg, byte dataType)
               throws configException;
    // Check whether the given data is in the range.
    public boolean inRange(Object data, byte dataType);
}
Two basic ranges are currently provided in the package. (A third is coming very soon and should be included in the release of the SSF.OS.NetFlow.Filter package as well.) The first is NumericRange, which is specified by a minimum and a maximum numerical value. The second is RERange (Regular Expression Range), which is specified by a regular expression and uses the gnu.regexp package by Wes Biggs. The third, upcoming one is SetRange, which is specified by a given set. User-defined ranges are also supported, so users can write their own range class and plug it in.
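To make the three kinds of range more concrete, here is a minimal standalone sketch. It does not use the package's Range interface, because that interface depends on the DML Configuration class; instead it assumes a simplified, hypothetical SimpleRange interface with a single inRange method. The class names MinMaxRange, PatternRange and MemberRange are likewise invented for illustration, java.util.regex is swapped in for gnu.regexp, and treating min and max as inclusive bounds is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class RangeSketch {
    /** Hypothetical, simplified stand-in for the package's Range interface. */
    interface SimpleRange { boolean inRange(Object data); }

    /** Numeric range; bounds are assumed to be inclusive in this sketch. */
    static final class MinMaxRange implements SimpleRange {
        private final long min;
        private final long max;
        MinMaxRange(long min, long max) { this.min = min; this.max = max; }
        @Override public boolean inRange(Object data) {
            if (!(data instanceof Number)) return false;
            long v = ((Number) data).longValue();
            return v >= min && v <= max;
        }
    }

    /** Regular expression range; the whole string must match the pattern. */
    static final class PatternRange implements SimpleRange {
        private final Pattern pattern;
        PatternRange(String regex) { this.pattern = Pattern.compile(regex); }
        @Override public boolean inRange(Object data) {
            return data instanceof String && pattern.matcher((String) data).matches();
        }
    }

    /** Set range; data is in range if it equals one of the given members. */
    static final class MemberRange implements SimpleRange {
        private final Set<Object> members;
        MemberRange(Object... members) {
            this.members = new HashSet<>(Arrays.asList(members));
        }
        @Override public boolean inRange(Object data) { return members.contains(data); }
    }

    public static void main(String[] args) {
        SimpleRange ports = new MinMaxRange(1024, 65535);
        SimpleRange subnet = new PatternRange("10\\.0\\..*");
        SimpleRange wellKnown = new MemberRange(80, 443);
        System.out.println(ports.inRange(8080));         // true
        System.out.println(subnet.inRange("10.0.3.7"));  // true
        System.out.println(wellKnown.inRange(22));       // false
    }
}
```

A user-defined range for the real package would follow the same pattern, except that it would implement Range and read its parameters from the Configuration passed to config.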

The basic process of the filter is to extract the data of the relevant field and check whether it satisfies the requirement. Data extraction is done with the help of decoders. Each record type has its own decoder. Given a field name, the decoder returns the index of the field data in the byte array received by the filter. All decoders share the following abstract base class:

public abstract class Decoder {
    public final static int FIELD_NOT_FOUND = -1;

    /**
     * Returns the index of the named field in the record's byte array.
     * The fieldType is also passed in so that the decoder can check
     * whether it matches the type of the field retrieved by fieldName.
     * @return the index of the field, or FIELD_NOT_FOUND if it does not exist
     */
    public abstract int getFieldIndex(String fieldName, byte fieldType);
}
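As an illustration of how a concrete decoder might look, the sketch below implements getFieldIndex with a simple lookup table. Everything specific in it is an assumption made for the example: the TableDecoder class, the field names and byte offsets, the TYPE_INT and TYPE_SHORT codes, and the choice to return FIELD_NOT_FOUND when the requested type does not match. The Decoder base class is repeated locally only so that the sketch compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

public class DecoderSketch {
    /** Local copy of the base class shown above, so the sketch is standalone. */
    static abstract class Decoder {
        public final static int FIELD_NOT_FOUND = -1;
        public abstract int getFieldIndex(String fieldName, byte fieldType);
    }

    /** Hypothetical decoder for a made-up record layout. */
    static final class TableDecoder extends Decoder {
        static final byte TYPE_INT = 1;   // hypothetical type code
        static final byte TYPE_SHORT = 2; // hypothetical type code

        // field name -> { byte offset in the record, type code }
        private final Map<String, int[]> fields = new HashMap<>();

        TableDecoder() {
            fields.put("src_ip",   new int[] {0,  TYPE_INT});
            fields.put("dst_ip",   new int[] {4,  TYPE_INT});
            fields.put("src_port", new int[] {8,  TYPE_SHORT});
            fields.put("dst_port", new int[] {10, TYPE_SHORT});
        }

        @Override
        public int getFieldIndex(String fieldName, byte fieldType) {
            int[] entry = fields.get(fieldName);
            // Assumption: an unknown field and a type mismatch are treated alike.
            if (entry == null || entry[1] != fieldType) {
                return FIELD_NOT_FOUND;
            }
            return entry[0];
        }
    }

    public static void main(String[] args) {
        Decoder d = new TableDecoder();
        System.out.println(d.getFieldIndex("src_port", TableDecoder.TYPE_SHORT)); // 8
        System.out.println(d.getFieldIndex("src_port", TableDecoder.TYPE_INT));   // -1
        System.out.println(d.getFieldIndex("bytes", TableDecoder.TYPE_INT));      // -1
    }
}
```

Keeping the layout in a table means the filter only needs the field name and type from its configuration; the byte-level details stay inside the decoder for that record type.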

To demonstrate how the Filter class can be used, the package also contains a FilterPlayer class. (The Filter needs to work with a BasicPlayer object or one of its descendants.) FilterPlayer is an extended version of SSF.Util.Streams.BasicPlayer, and users can write their own player by extending it. As its name suggests, this player "shows" the data records that pass the given filter, and the filter is configured using a DML file. The following section describes in detail how to configure the filter with DML.

Config Filters in DML

The configuration of a Filter has two main parts. The first part lists the decoders used by this filter, and the second is the logic part. Both are explained in the template below.
filter [
     decoder [
         name ......        # the record type name
         use ......         # the decoder class that should be used
     ]
     ... ...
     #------------------------------Other decoders
     ... ...
     #------------------------------Logic part
     term [
         factor [
             action ......          # deny or permit; the DEFAULT is permit. Logically, deny is the "NOT" operation.
             field_name ......      # the field from which the data should be extracted
             field_type ......      # the data type of this field
             range [
                 # range parameters
                 # for a Numeric Range:
                 min ......
                 max ......
                 # for a Regular Expression Range, use the attribute "reg_exp"
                 # for a user-defined range, use the attribute "use" to specify the class that should be used
                 # example: use SSF.OS.NetFlow.Filter.SetRange
             ]
         ]
         # other factors of this term
     ]
     # other terms of this filter
]

Examples and tests

The documentation for the examples and tests is provided together with the examples of the SSF.OS.NetFlow packages.


The SSF.OS.NetFlow package includes the Filter package and should be extracted under $SSF_HOME/src/SSF/OS/. Note that FilterPlayer is currently not a direct subclass of SSF.Util.Streams.BasicPlayer; most of its code is copied from that class, and the two will be integrated in the future.

Developed and maintained by Yougu Yuan.