Package SSF.App.Worm

This package models the spread of a worm in a network using a macroscopic epidemic model.


Class Summary
AS An (abstract representation of an) Autonomous System.
ASGraph The Internet at the AS (interdomain) level.
CodeRedGammaFunction Models removal observed during Code Red v2.
CodeRedWormEpidemicInitializer Initialize worm epidemic using empirical distributions based on routing table data and published data on Code Red II infections in August 2001.
CR2SuscFracDistr Distribution of the fraction of an AS's announced IP space populated by hosts that were susceptible to the Code Red II worm in Sept 2001.
CR2SuscFracDistr.TestFixture Unit test code.
DeterministicWormEpidemic Worm Epidemiological model.
DeterministicWormEpidemicState Deterministic epidemic model implementation.
GammaFunction Abstract base class for epidemic model gamma function (removals).
GatewayProtocolSession Pseudoprotocol for border routers (gateways).
IPSpaceDistr Distribution of announced IP space sizes for ASs.
IPSpaceDistr.TestFixture Unit test code.
MacroscopicModel Macroscopic model of the Internet, for modeling worm epidemics etc.
MacroscopicModelConfigurator Common code for configuring options to the macroscopic model.
MeanRateWormTraffic Mean scan rate model of worm induced scan traffic, global object.
MeanRateWormTrafficState Worm induced traffic model.
StochasticWormEpidemic Worm Epidemiological model.
StochasticWormEpidemicState Stochastic epidemic model implementation.
UniformWormEpidemicInitializer Initialize worm epidemic with a 'uniform distribution', i.e.
WormEpidemic Worm Epidemiological model for one AS.
WormEpidemicInitializer Abstract base class for initialization code for the epidemic model.
WormEpidemicState Worm epidemiological state/model of one AS.
WormProtocolSession Models the worm infestation in a single host.
WormRecorder  
WormTraffic Abstract base class for model of worm induced scan traffic, global object.
WormTrafficState Abstract base class for local state for model of worm induced scan traffic.
ZeroGammaFunction Model _no_ removals.
 

Package SSF.App.Worm Description

This package models the spread of a worm in a network using a macroscopic epidemic model. Since the epidemic may encompass hundreds of thousands of hosts, the idea is to partially decouple the worm spread from the packet-level network model. Thus, we model the "whole Internet" at a coarse macroscopic level (for the worm spread) and some part of it in more detail (the "microscopic" or network level) using SSFNet entities such as routers and hosts. This is illustrated in Figure 1.



Figure 1: Mixed abstraction level model.

The rationale behind this mixed abstraction level model and the benefits and trade-offs involved are described in the paper [Liljenstam et al., 2002].

Package Specification

Version 0.5.1 supports the following features:

Planned features for future releases include:

Please see [Liljenstam et al., 2002] for more details on the models and assumptions.

The link between the network and epidemic models is provided by two pseudo-protocol-sessions: WormProtocolSession and GatewayProtocolSession. The first instance created of these sessions will create the global epidemic model entity and a timer object to drive it forward. Hence, every model that uses the worm package must include at least one instance of either WormProtocolSession or GatewayProtocolSession.

All WormProtocolSession instances register with the epidemic model to signal that each of them represents a vulnerable host. As the infection progresses, the epidemic model picks hosts at random for infection and may thus pick hosts that are modeled at the network level. When this happens, it calls the method WormProtocolSession.becomeInfected(), which is an empty placeholder for the events that should happen when a host becomes infected. Of course, the network level may also contain any number of hosts that are not vulnerable to the worm. A host that does not run the WormProtocolSession is not mapped to the macroscopic level and will never be picked for infection.
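
The registration and callback scheme described above can be pictured with a small Java sketch. Everything in it is hypothetical: the HostRegistrySketch class, the InfectionListener interface, and the rule that a registered host is hit with probability (registered hosts)/(susceptible hosts) are illustrative assumptions, not the package's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; NOT the SSF.App.Worm implementation.
final class HostRegistrySketch {
    // Stand-in for a network-level vulnerable host (cf. becomeInfected()).
    interface InfectionListener {
        void becomeInfected();
    }

    private final List<InfectionListener> uninfected = new ArrayList<>();
    private final Random rng;

    HostRegistrySketch(long seed) {
        this.rng = new Random(seed);
    }

    // A vulnerable network-level host announces itself to the macroscopic model.
    void register(InfectionListener host) {
        uninfected.add(host);
    }

    int registeredUninfected() {
        return uninfected.size();
    }

    // Called once per new infection in the macroscopic model.
    // 'susceptible' is the global number of susceptible hosts before this infection.
    // Assumption: the victim is uniform among all susceptibles, so a registered
    // host is hit with probability registered/susceptible.
    boolean onNewInfection(long susceptible) {
        if (uninfected.isEmpty() || susceptible <= 0) {
            return false;
        }
        double p = (double) uninfected.size() / susceptible;
        if (rng.nextDouble() >= p) {
            return false; // the infection hit a host modeled only macroscopically
        }
        InfectionListener victim = uninfected.remove(rng.nextInt(uninfected.size()));
        victim.becomeInfected();
        return true;
    }

    public static void main(String[] args) {
        HostRegistrySketch registry = new HostRegistrySketch(7L);
        registry.register(() -> System.out.println("network-level host infected"));
        long susceptible = 1000;
        while (registry.registeredUninfected() > 0 && susceptible > 0) {
            registry.onNewInfection(susceptible);
            susceptible--;
        }
    }
}
```

A host that never calls register() can never be selected, which mirrors the statement that hosts not running the WormProtocolSession are never picked for infection.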

Instances of the GatewayProtocolSession also register at the macroscopic level. If the scan traffic model is configured (and thus invoked), these registered gateways receive information on the worm scan traffic rate hitting the router. This can be used, for instance, to inject packets at the network level or to model router stress. Note that traffic modeling only makes sense for a stratified macroscopic model.

In a homogeneous model this mapping of network-level hosts to the macroscopic level is quite straightforward: all hosts are mapped into one big 'cloud' at the macroscopic level. More interesting is the case of a stratified model (in this package we stratify by AS). In this case, network entities (hosts and routers) belonging to a certain AS at the network level are mapped to the corresponding AS at the macroscopic level, as illustrated in Figure 2. Here, Net attributes that represent AS boundaries are required to contain the AS_num attribute, where the user specifies the real AS number (ASN) that this network should be mapped to at the macroscopic level. Hence, hosts and routers belonging to this Net map to the AS given by the ASN at the macroscopic level. Note that there will effectively be one interdomain topology at the network level and one interdomain topology at the macroscopic level. Also note that there is currently no check that these topologies are consistent or make any sense with respect to each other.



Figure 2: Mapping between abstraction levels.

See the configGlobalOptions() method of MacroscopicModelConfigurator (and examples) for more information on how to configure the parameters for the macroscopic model, both epidemic and traffic.

Installation:

Unpack the package tarball in the ssfnet (root) directory. Then refer to the package README file for instructions on compiling the package, generating javadoc, and running 'validation' tests.

Examples:

We start with a simple example to describe the overall capabilities of the package. Then we proceed with smaller examples to demonstrate the DML configuration options. All of the examples described here are in the test subdirectory and many of them are used as regression tests for the package.

Campus Network Under Attack

The first example requires a system with Perl and gnuplot installed.

Go to the test subdirectory.
The Perl script campusTestPlot.pl runs campusUnderAttack.dml, extracts data on host infections from the debug output, and plots the results using gnuplot. The scenario is exactly the same network as the campus2.dml network in the littleComboDemo SSFNet example, except that all hosts (clients and servers) are vulnerable to the worm. How the DML is configured is described in more detail later on. Suffice it to say here that the changes made to campus2.dml to achieve this are minimal:
  1. The WormProtocolSession has been added to the hosts' protocol stacks.
  2. The parameters of the macroscopic model are configured at the top-most level Net in the worm_model attribute.

The example (approximately) models the spread of the Code Red v2 worm and infections occurring on the campus network.

The resulting graph should look like Figure 3.



Figure 3: Example graph: CampusUnderAttack scenario. Shows global infection spread (number of hosts infected in the whole Internet), and time-points for local infections in the studied campus network

Single Vulnerable Host (Deterministic Model)

For a minimal example in the same vein, see singleHost.dml which models a single home user during an ongoing worm attack (parameters for Code Red v2). In this example the network model does nothing except start up the epidemic and wait for something to happen. If the epidemic runs long enough, the user's host will eventually become infected.

The relevant DML sections are shown below in Figure 4. The macroscopic level is configured using the global worm_model attribute. Inside it we set the parameters for the epidemic model in the Epidemic attribute, and switch on debug output. The single host modeled on the network level runs the WormProtocolSession to signal that it is vulnerable to the worm.

Net [
  frequency 1000000000
  AS_status boundary
  ospf_area 0

  worm_model [
    Epidemic [
      s_0     359999   # number of susceptible hosts (initially) =N-1
      i_0          1   # number of infected hosts (initially)
      beta  1.235e-9   # infection parameter =(1.6/3600)/N
    ]

    debug true
  ]

...

  clientGraph [graph [
    ProtocolSession [
      name WormProtocolSession
      use SSF.App.Worm.WormProtocolSession
      debug true
    ]
    ProtocolSession [name ip use SSF.OS.IP]
  ]
]

Figure 4: DML for minimal example with a single vulnerable host at the network level (microscopic level). This host is one of 359,999 vulnerable hosts in the whole Internet, and the scenario starts with a single infected host somewhere in the Internet. By default a deterministic epidemic model is used (system of differential equations) and a homogeneous population.
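
As a rough check on the parameters in Figure 4, the following Java sketch works through the arithmetic behind beta and the resulting epidemic curve. It assumes the classic simple epidemic (SI) equation dI/dt = beta * S * I with S + I = N, which has a closed-form logistic solution; whether the package's deterministic model uses exactly this form is an assumption here, and the class and method names are hypothetical.

```java
// Hypothetical sketch; assumes the simple epidemic  dI/dt = beta * S * I,  S + I = N.
final class SimpleEpidemicSketch {
    // beta = (infection rate per infected host per hour / 3600) / N,
    // matching the DML comment "=(1.6/3600)/N".
    static double beta(double ratePerHour, double n) {
        return (ratePerHour / 3600.0) / n;
    }

    // Logistic solution I(t), with t in seconds and i0 initially infected hosts.
    static double infectedAt(double tSeconds, double beta, double n, double i0) {
        double c = (n - i0) / i0;
        return n / (1.0 + c * Math.exp(-beta * n * tSeconds));
    }

    // Time (seconds) at which half of the population is infected.
    static double halfTime(double beta, double n, double i0) {
        return Math.log((n - i0) / i0) / (beta * n);
    }

    public static void main(String[] args) {
        double n = 360000.0; // s_0 + i_0 from the DML clip
        double i0 = 1.0;
        double b = beta(1.6, n);
        System.out.printf("beta = %.4e%n", b);
        System.out.printf("half of the hosts infected after %.2f hours%n",
                halfTime(b, n, i0) / 3600.0);
    }
}
```

With N = 360000 the computed beta rounds to the value 1.235e-9 used in the DML, and under these assumptions half of the population is infected after roughly eight hours of simulated time.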

The infection will show up in the debug output (as shown in this output clip), but otherwise will have no effect on the model, since no action is coded into the host.

Stochastic Epidemic Model

This same scenario can also be modeled using a stochastic epidemic model. In DML (singleHostStoch.dml) this is done by replacing the default implementation of the epidemic model with another class, as shown below.
  worm_model [
    Epidemic [
      use SSF.App.Worm.StochasticWormEpidemic
      s_0     359999   # number of susceptible hosts (initially) =N-1
      i_0          1   # number of infected hosts (initially)
      beta  1.235e-9   # infection parameter =(1.6/3600)/N
    ]

We can compare the evolution of the epidemic in the two models (deterministic and stochastic) as in the semi-log plot in Figure 5.



Figure 5: Example comparison between epidemic evolutions for a deterministic and a stochastic model. (Semi-log plot.) As the population of infected grows, the stochastic model settles down to the mean growth rate predicted by the deterministic model.
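
To give a feel for why a stochastic run is noisy early on and then tracks the deterministic curve, here is a minimal Java sketch of one possible stochastic formulation: a continuous-time Markov chain in which the waiting time to the next infection is exponentially distributed with rate beta * S * I. This is an assumption chosen for illustration; the algorithm actually used by StochasticWormEpidemic is not described here, and the class and method names are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch; NOT the algorithm used by StochasticWormEpidemic.
final class StochasticSiSketch {
    // Simulates infections one at a time until 'target' hosts are infected
    // and returns the elapsed simulated time in seconds.
    static double timeToReach(long s0, long i0, double beta, long target, long seed) {
        if (i0 <= 0) {
            throw new IllegalArgumentException("need at least one infected host");
        }
        Random rng = new Random(seed);
        long s = s0;
        long i = i0;
        double t = 0.0;
        while (i < target && s > 0) {
            double rate = beta * s * i; // total infection rate right now
            // Exponential waiting time with the given rate (inverse transform).
            t += -Math.log(1.0 - rng.nextDouble()) / rate;
            s--;
            i++;
        }
        return t;
    }

    public static void main(String[] args) {
        // Same parameters as the DML clip; three seeds give three sample paths.
        for (long seed = 1; seed <= 3; seed++) {
            double t = timeToReach(359999L, 1L, 1.235e-9, 180000L, seed);
            System.out.printf("seed %d: half infected after %.2f hours%n", seed, t / 3600.0);
        }
    }
}
```

In this formulation most of the run-to-run variation comes from the first few infections, when I is small and the waiting times are long; once I is large the individual waiting times average out.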

Note that the epidemic grows until all vulnerable hosts are infected, unlike the campus example, where the epidemic decreased towards the end. The difference is that the campus example also modeled removals of infected hosts, as observed during the Code Red v2 event. The DML clip below illustrates how this is done in campusUnderAttack.dml by substituting the implementation class for the 'gamma function' used to compute removals.

  worm_model [
    Epidemic [
      s_0     359999   # number of susceptible hosts (initially) =N-1
      i_0          1   # number of infected hosts (initially)
      beta  1.235e-9   # infection parameter =(1.6/3600)/N

      gamma_function SSF.App.Worm.CodeRedGammaFunction
    ]

    debug true
  ]
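
The effect of a removal term can be sketched in Java as follows. The sketch assumes the equations dS/dt = -beta * S * I and dI/dt = beta * S * I - gamma(t) * I, integrated with a plain forward-Euler step, where gamma(t) is a per-host removal rate supplied by the caller. Both the equations and the constant removal rate used in main are illustrative assumptions; the shape of CodeRedGammaFunction is not reproduced here.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch; the removal rate used below is made up for illustration.
final class RemovalSketch {
    // Forward-Euler integration of
    //   dS/dt = -beta*S*I,   dI/dt = beta*S*I - gamma(t)*I,   dR/dt = gamma(t)*I.
    // Returns {S, I, R} at time tEnd (seconds).
    static double[] run(double s0, double i0, double beta,
                        DoubleUnaryOperator gamma, double dt, double tEnd) {
        double s = s0;
        double i = i0;
        double r = 0.0;
        int steps = (int) Math.round(tEnd / dt);
        for (int k = 0; k < steps; k++) {
            double t = k * dt;
            // Clamp so that no compartment can go negative.
            double newInfections = Math.min(s, beta * s * i * dt);
            double removed = Math.min(i + newInfections, gamma.applyAsDouble(t) * i * dt);
            s -= newInfections;
            i += newInfections - removed;
            r += removed;
        }
        return new double[] {s, i, r};
    }

    public static void main(String[] args) {
        double beta = 1.235e-9;
        double[] noRemoval = run(359999, 1, beta, t -> 0.0, 1.0, 48 * 3600.0);
        // Assumed constant removal rate of 1e-4 per infected host per second.
        double[] withRemoval = run(359999, 1, beta, t -> 1.0e-4, 1.0, 48 * 3600.0);
        System.out.printf("no removals:   I = %.0f, R = %.0f%n", noRemoval[1], noRemoval[2]);
        System.out.printf("with removals: I = %.0f, R = %.0f%n", withRemoval[1], withRemoval[2]);
    }
}
```

Passing t -> 0.0 corresponds to the no-removal case (compare ZeroGammaFunction), in which the infected population simply grows until the susceptibles are exhausted.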

A Simple Stratified Model

One can study worm propagation in different parts of the network (the Internet) using a stratified model. The package supports stratification by AS, i.e. each AS forms a subpopulation. The twoASsingleHosts.dml example depicts a simple scenario with two ASes modeled at the network level (microscopic level). AS one contains a single vulnerable host (at the network level). The DML clip below illustrates how a stratified model is switched on through the stratified_on attribute and how an inter-domain topology graph is loaded using the as_graph attribute. When using a stratified model it is also important to set the initial values for the subpopulations in the epidemic model, i.e. the initial fraction of susceptibles and infected assigned to the different strata (ASes). The class to use for initialization is set through the initializer attribute. This example uses the SSF.App.Worm.CodeRedWormEpidemicInitializer to initialize based on an empirical distribution derived from Code Red II data. Lastly, the AS_num attribute of the AS boundary Nets must be set to choose which AS to map to at the macroscopic level.

  worm_model [
    stratified_on  true                # stratified model (by AS)
    as_graph       ex_as_topology.adj  # AS level topology file (adjacencies)

    Epidemic [
      s_0     359999   # number of susceptible hosts (initially) =N-1
      i_0          1   # number of infected hosts (initially)
      beta  1.235e-9   # infection parameter =(1.6/3600)/N

      initializer SSF.App.Worm.CodeRedWormEpidemicInitializer
    ]

    debug true
  ]

  Net [ id 0
    AS_status boundary
    AS_num 557 # ASN 557, just some AS with outdegree one in the adjacency file
    ospf_area 0
...

Worm Scan Traffic

The package also includes a very simple model of worm scan traffic as it passes through egress border routers. It models the mean rate of scan packets as a piecewise constant flow through the router. The mean scan rate is proportional to the number of scanning hosts inside the AS, i.e. the number of infected hosts in the AS. The (twoAStraffic.dml) example uses the same two AS scenario, but using initializer code that assigns 70% of all susceptibles to one of the ASes, and the remaining 30% to the other AS. Thus, all susceptibles are assigned to the two ASes that are modeled at the network level. (It also uses the CRv2 based removal function.) The DML clip below illustrates how the traffic model is invoked using the Traffic attribute.

  worm_model [
    stratified_on  true                # stratified model (by AS)
    as_graph       ex_as_topology.adj  # AS level topology file (adjacencies)

    Epidemic [
      s_0     359999   # number of susceptible hosts (initially) =N-1
      i_0          1   # number of infected hosts (initially)
      beta  1.235e-9   # infection parameter =(1.6/3600)/N

      initializer SSF.App.Worm.test.TestWormEpidemicInitializer
      gamma_function SSF.App.Worm.CodeRedGammaFunction # removal process
    ]
    Traffic [
      use SSF.App.Worm.MeanRateWormTraffic
    ]

    debug true
  ]

...

Figure 6 shows the resulting scan traffic rates at the two gateway routers together with the global epidemic progression. Since the egress scan rate is proportional to the number of infected hosts inside the AS, the difference in rates will correspond to the fraction of susceptibles assigned to each AS.



Figure 6: Example of scan traffic through gateways at two different ASes.
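
The proportionality behind Figure 6 can be illustrated with a few lines of Java. The sketch assumes that the mean egress rate is simply (infected hosts in the AS) times (a per-host scan rate), recomputed at each update and held constant in between. The per-host rate of 10 scans per second and the infected counts are made-up illustration values, and the names are hypothetical.

```java
// Hypothetical sketch of a mean-rate scan traffic calculation.
final class MeanRateSketch {
    // Mean egress scan rate (packets/s) of one AS.
    // Assumption: every infected host scans at the same constant rate.
    static double egressRate(long infectedInAs, double scansPerHostPerSecond) {
        return infectedInAs * scansPerHostPerSecond;
    }

    // Piecewise constant rate series: one rate per update interval.
    static double[] rateSeries(long[] infectedPerInterval, double scansPerHostPerSecond) {
        double[] rates = new double[infectedPerInterval.length];
        for (int k = 0; k < rates.length; k++) {
            rates[k] = egressRate(infectedPerInterval[k], scansPerHostPerSecond);
        }
        return rates;
    }

    public static void main(String[] args) {
        double perHost = 10.0;      // assumed scans per second per infected host
        long infectedTotal = 1000;  // assumed number of infected hosts at some instant
        long inFirstAs = infectedTotal * 70 / 100;  // 70% of the population
        long inSecondAs = infectedTotal - inFirstAs; // remaining 30%
        System.out.printf("AS 1: %.0f scans/s, AS 2: %.0f scans/s%n",
                egressRate(inFirstAs, perHost), egressRate(inSecondAs, perHost));
    }
}
```

If the infected hosts are split 70/30 between the two ASes, the two gateway rates differ by the same 7:3 ratio at every instant, which is the behavior described for Figure 6.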

Revision History

v 0.5.1: Bugfixes:
  • Infection parameters for stratified model.
  • Triggering of infections at network level.
  • Calculation of infections in deterministic stratified model.
  • Subpopulation bounds in stratified model.
  • Makefile, IPSpaceDistr and CR2SuscFracDistr targets missing

v 0.5: Complete redesign. More flexible (DML configurable), and extensible.
Added:
  • Stratified population
  • Flexible initializers - Code Red II initializer example
  • Flexible removal fctn - Code Red v2 removals
  • Simple scan traffic model - mean rate example
  • Regression tests and more examples
  • Stochastic model
  • RNG seeding based on global SSFNet seed

v 0.4: First external release.

Design Notes

The package design makes heavy use of the strategy pattern [Gamma et al., 2000] to allow substitution of algorithms, implementations and initialization code, and hence also to make the models extensible. Thus, the core class MacroscopicModel defers implementations of submodels to the abstract base classes WormEpidemic and WormTraffic. Subclasses of these provide specific models/implementations, such as DeterministicWormEpidemic and MeanRateWormTraffic respectively. Moreover, the WormEpidemic model defers initialization of the model state to the WormEpidemicInitializer class, another abstract base class. Thus, it is possible to substitute other initialization code to change the distributions generating the initial state of the model. Similarly, GammaFunction is an abstract base class for functions used to compute removals in the deterministic epidemic model.

The choice of implementations can thus be made through DML configuration, avoiding recompilation, similarly to other aspects of SSFNet.
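
One common way to realize this kind of configuration-driven strategy selection in Java is to instantiate the implementation class by name through reflection. The sketch below shows the idea with a hypothetical Gamma interface and ZeroGamma class; it uses only standard java.lang reflection calls and does not reproduce how the package itself reads DML or creates its objects.

```java
// Hypothetical sketch of strategy selection by class name; not the package's code.
final class StrategyLoaderSketch {
    // Hypothetical strategy interface (stand-in for an abstract base class).
    interface Gamma {
        double removalRate(double tSeconds);
    }

    // Hypothetical strategy: no removals at all.
    static final class ZeroGamma implements Gamma {
        @Override
        public double removalRate(double tSeconds) {
            return 0.0;
        }
    }

    // Instantiates the strategy named in a configuration value.
    // The class needs a no-argument constructor and must implement Gamma.
    static Gamma load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            if (!Gamma.class.isAssignableFrom(cls)) {
                throw new IllegalArgumentException(className + " is not a Gamma");
            }
            return Gamma.class.cast(cls.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load strategy " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real setup the name would come from a configuration file.
        String configured = ZeroGamma.class.getName();
        Gamma gamma = load(configured);
        System.out.println(configured + " -> removal rate " + gamma.removalRate(0.0));
    }
}
```

Because the class name is just a string, swapping in a different strategy only requires changing the configuration value, not recompiling the code that uses it.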

Regression Tests

The tests supplied in the test subdirectory do the following:

single_host
  Models a single vulnerable host at the microscopic level and a homogeneous population of 359,999 susceptible hosts at the macroscopic level. Uses the deterministic epidemic model to model a worm infection that starts from a single host and spreads throughout the population. At some point the host at the microscopic level is infected.
  Tests the spread dynamics of the deterministic model in a homogeneous population.

two_as_traffic
  Similar to the single_host test, but uses the stochastic model for the epidemic.
  Tests the spread dynamics of the stochastic model in a homogeneous population.

two_as_single_host
  Models a stratified population (broken down per AS) with all susceptibles allocated to two specific ASes (that each contain a single vulnerable host modeled at the microscopic level). Otherwise similar to the single_host test.
  Tests the spread dynamics of a stratified population.

two_as_traffic
  Simple model of egress scan traffic from two ASes. Otherwise similar to two_as_single_host.
  Tests that the traffic observed from the 'mean rate egress scan traffic' model is as expected.

Related Documentation

References For more information, please see:

Author

SSF.App.Worm has been written and is maintained by Michael Liljenstam, ISTS, Dartmouth College <mili@ists.dartmouth.edu>.