Scalable Simulation Framework
SSF.OS.TCP
Implementation and Validation Tests

Contents:

SSF implementation of TCP standards

This page contains a detailed survey of the RFC requirements, and states how they are implemented in SSF TCP.
 To select a TCP variant from a DML network configuration, see SSF TCP variants: generic Tahoe, generic Reno, delayed acks. See the SSF.OS.TCP design notes for an overview of the source code organization. There is also a simple TCP tutorial, written as a student project.

SSF TCP validation tests

A suite of SSF TCP validation tests. Each test page contains the test description and an analysis of plots of tcpdump data and TCP state variables for compliance with the RFCs. The majority of the tests are adapted from the suite developed by Sally Floyd and her collaborators for the ns-2 simulator.

SSF.OS.TCP latest source code release

About this document

SSF implementation of TCP standards

Contents:

0. TCP header
1. TCP parameters and their initial values
2. TCP clocks and timers
3. RTT measurement and RTO calculation
4. Send window management & ACK processing
5. Receive window management & ACK generation
6. Packet loss identification and retransmission management
7. Opening a TCP connection (session)
8. Closing a TCP connection
9. TCP Connection States

General:

SSF TCP is intended for modeling bulk data transfers, such as Web traffic. It currently provides neither special processing for very small segments nor PUSH and URGENT processing.

The design of SSF TCP conforms to the "plug-and-play" architecture provided by the protocol design framework SSF.OS. SSF TCP is fairly modular, and per-host configurable from a DML network configuration database. It should be relatively easy to add new processing modules or change the existing ones for modeling additional TCP features or variants.

References:

S94, W. Stevens, TCP/IP Illustrated, vol. 1: The Protocols, Addison-Wesley, 1994.
WS95, G. Wright and W. Stevens, TCP/IP Illustrated, vol. 2: The Implementation, Addison-Wesley, 1995.
RFC 793, J. Postel, Transmission Control Protocol, September 1981.
RFC 1122, R. Braden, Requirements for Internet Hosts -- Communication Layers, October 1989.
RFC 2581, M. Allman, V. Paxson and W. Stevens, TCP Congestion Control, April 1999.

0. TCP header

RFC requirements implemented in SSF TCP
SOURCE_port, DEST_port, SEQno, ACKno, AdvertisedWnd
flags: SYN, ACK, FIN
nominal header length counted as 20 bytes
RFC requirements NOT implemented in SSF TCP
flags: URG, PSH, RST
TCP checksum, Urgent pointer, options

1. TCP parameters and their initial values
RFC requirements / common choices SSF implementation
ISN (Initial Send Sequence Number): DML-configurable with default value 0:
ISS  0
The default initial send sequence number increment ISS_INCR = 280000.
The maximum value of sequence number is 2**63-1, with NO wrap-around.

SMSS (Sender Maximum Segment Size): size of the largest segment the sender can transmit.

RMSS (Receiver Maximum Segment Size): size of the largest segment the receiver is willing to accept. Specified in the MSS option during connection startup.

SMSS and RMSS do not include TCP/IP headers and options. If MSS option not used, SMSS = 536 bytes in RFC 1122; 1024 bytes in common implementations.

SSF TCP does not implement the MSS option, and SMSS = RMSS = MSS. MSS is DML-configurable with default value of 1024 bytes.
MSS  1024
Send buffer size (in bytes): DML-configurable in units of MSS, with default value of 16 (16*MSS bytes):
SendBufferSize  16
The send buffer is implemented as a linked list of pseudo-data segments.
Receive buffer size (in bytes): DML-configurable in units of MSS, with default value of 16 (16*MSS bytes):
RcvWndSize  16
The Receive Buffer is implemented as a circular array of pseudo-data segments.
Transmission control sender session state variables, in bytes (RFC 2581):
  • rwnd latest advertised receiver window.
  • cwnd state variable (congestion window). Initial value = IW.
  • ssthresh state variable (slow start threshold). Determines whether sending state is Slow Start or Congestion Avoidance. Initial value may be arbitrarily high.
  • flight_size amount of data sent but not ACKed. Used to update ssthresh.
Certain variables are DML-configurable as follows:
  • rwnd (in SSF, AdvertisedWnd).
  • cwnd (in SSF, conWnd); initial value = IW; maximum value DML-configurable, default value 64*MSS:
    MaxConWnd 64
    
  • ssthresh (in SSF, conThresh); initial value = 64*MSS, default 65,536 bytes.
  • flight_size is computed as min(cwnd, rwnd) following WS95.
IW (sender's initial congestion window after the 3-way handshake). RFC 2581: less than or equal to 2*SMSS, and no more than two segments. In SSF TCP, IW = MSS.

LW (Loss Window = congestion window size after retransmission timeout) is equal to SMSS (RFC2581).

RW (Restart Window, congestion window size after TCP connection restart from an idle period) is equal to IW (RFC2581).

LW = RW = IW = MSS.
Maximum retransmission timeout shift: the maximum number of retransmission attempts before a TCP connection gives up, configurable with a default value of 12 (per WS95). In SSF TCP it is implemented as the maximum number of retransmission attempts, DML-configurable with default value 12:
MaxRexmitTimes  12
TTL (Time to Live) used by the IP layer when sending a TCP packet: 60 in RFC 793, changed to configurable in RFC 1122. In SSF.OS.IP the default TTL is set to 20; it will be made DML-configurable in future releases.

2. TCP clocks and timers

The default values of all timers are from reference [WS95].

RFC requirements / common choices SSF implementation

Most TCP implementations use two clocks (tick counters) driven by the operating system; they are used to advance a number of TCP timers. Typically the slow clock advances in steps of 500 ms, and the fast clock advances in steps of 200 ms.

There is one instance of each of the 7 timers listed below per TCP session (connection endpoint).

Both slow and fast clocks are DML-configurable with default step values in seconds
TCP_SLOW_INTERVAL  0.500 
TCP_FAST_INTERVAL  0.200 
Each value can be set with accuracy of 0.001 s (1 ms). Class tcpSessionMaster maintains private clocks in each host instance; their initial phases are chosen randomly, independently of other hosts. All tcpSessions (connection endpoints) in a host share these clocks, but of course each session's timers are independent.
Connection establishment timer: starts when a SYN packet is sent. If a response is not received within 75 seconds, the connection is aborted. In SSF TCP, a SYN packet is treated the same as a data packet and uses the retransmission timer. That may be changed in a future release.
Retransmission timer: The value of this timer (retransmission timeout, RTO) is calculated dynamically, based on round-trip time (RTT) measurements. The SSF implementation follows [WS95]:
  • When a data segment is sent, start the retransmission timer with the current RTO value, unless the timer is already running.
  • When an ACK is received: if the ACK is for the last segment sent (no data in flight), cancel the timer; otherwise restart the timer with the current RTO value.
  • When a data segment is retransmitted, start the retransmission timer with twice the latest RTO value, or with maxRTO, whichever is smaller.

The initial value of RTO is set to 3 seconds. RTO is bounded between 1 and 64 seconds (WS95).
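The timer rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SSF source: the class name `RexmitTimer`, its method names, and the use of plain seconds instead of slow-clock ticks are all hypothetical.

```python
MAX_RTO = 64.0  # seconds, upper RTO bound quoted from WS95


class RexmitTimer:
    """Hypothetical retransmission timer; names are not taken from SSF."""

    def __init__(self, rto=3.0):
        self.rto = rto         # current RTO, initially 3 seconds
        self.remaining = None  # None means the timer is not running

    def on_send_new(self):
        # Start with the current RTO unless the timer is already running.
        if self.remaining is None:
            self.remaining = self.rto

    def on_ack(self, data_in_flight):
        # No data in flight: cancel. Otherwise restart with the current RTO.
        self.remaining = self.rto if data_in_flight else None

    def on_retransmit(self):
        # Exponential backoff: twice the latest RTO, capped at MAX_RTO.
        self.rto = min(2 * self.rto, MAX_RTO)
        self.remaining = self.rto
```

In this sketch the backoff is applied to `rto` itself, so a later restart on an ACK reuses the backed-off value until a new RTT measurement recomputes it.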

Delayed ACK timer is set when TCP receives data that must be acknowledged, but need not be acknowledged immediately. Instead, TCP receiver may wait up to 200 ms (TCP_FAST_INTERVAL) before sending an ACK (according to WS95; RFC 2581 allows 500 ms delay). If during this 200-ms period, TCP has data to send on this connection, the pending acknowledgment is sent along with the data (piggybacking). In SSF TCP, delayed-ack processing is optional, and can be selected from DML via
delayed_ack true
The delay value can be selected from DML via TCP_FAST_INTERVAL.
Persist timer: set when a TCP session advertises a receive window of size 0, preventing the other end from sending data. In SSF TCP, the receiver is assumed to consume all received data immediately, and the persist timer is not implemented.
Keepalive timer: fires if connection is idle for 2 hours. In SSF TCP, the keepalive timer is not implemented.
FIN_WAIT_2 timer: is set to prevent a connection from staying in the FIN_WAIT_2 state forever. This timer is set to 10 minutes when the session enters the FIN_WAIT_2 state. When the FIN_WAIT_2 timer fires it is reset to 75 seconds. When it fires again, the connection is dropped. In SSF TCP, FIN_WAIT_2 timer is set to maximum idle time. The parameter MaxIdleTime is DML-configurable with default value of 600 seconds.
MaxIdleTime 600
TIME_WAIT timer (also called the 2MSL timer): set on entry to the TIME_WAIT state. When TCP performs an active close and sends the final ACK, the connection must stay in the TIME_WAIT state for up to 2*MSL. In SSF TCP, 2MSL is DML-configurable through the MSL attribute, with a default value of 120 seconds (RFC 793):
MSL 60

3. RTT estimation and RTO calculation
RFC requirements

RFC 793: Measure the elapsed time between sending a data octet with a particular sequence number and receiving an ACK that covers that sequence number (segments sent do not have to match segments ACKed).

RFCs do not specify when to perform RTT measurements, nor on which segments (except for exclusion of retransmissions, cf. Karn's algorithm), but do specify how to compute RTO given an RTT measurement.

Computation of RTO (RFC 1122): A host TCP MUST implement Karn's algorithm and Jacobson's algorithm. The following values SHOULD be used to initialize the estimation parameters for a new connection:

  • RTT = 0 seconds.
  • RTO = 3 seconds.
  • The smoothed variance is to be initialized to the value that will result in this RTO.

The lower RTO bound SHOULD be measured in fractions of a second, the upper bound should be 2*MSL, i.e., 240 seconds.

SSF TCP implementation

The RTT measurement algorithm and RTO calculation algorithm are adapted from S94 (pp 301-303) and WS95 (pp 836-847).

Since the description of the RTT measurement algorithm is confusing, here is a concise summary:

  • Measure RTT for one segment at a time. Save SN of the timed segment.
  • On sending data:
    • When sending a new segment:
      • If RTT measurement is already running, do not re-start measurement.
      • If RTT measurement is not running, start measurement for this segment.
    • When retransmitting a segment:
      • If RTT measurement is already running, cancel the measurement.
      • If RTT measurement is not running, do not start measurement.
  • On receiving an ACK:
    • If RTT measurement is running for a given SN "sn", end the measurement upon receiving the first ACK that covers this SN "sn".
    • Count the number of "slow ticks" between measurement start and end. Measured RTT is defined as this count plus one.
  • As soon as the measurement is completed, recompute RTO.
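The measurement rules summarized above can be sketched as a small state holder. The class `RttSampler` and its method names are hypothetical, and the test "ACK number strictly greater than the timed SN" is an assumption about what "covers this SN" means when the ACK number is the next expected SN.

```python
class RttSampler:
    """Hypothetical one-segment-at-a-time RTT sampler counting slow ticks."""

    def __init__(self):
        self.timed_sn = None  # SN of the segment being timed, or None
        self.ticks = 0        # slow ticks elapsed since the measurement began

    def on_send(self, sn, is_retransmission):
        if is_retransmission:
            # Cancel a running measurement and never start one
            # on a retransmitted segment (Karn's algorithm).
            self.timed_sn = None
        elif self.timed_sn is None:
            # Start timing only if no measurement is already running.
            self.timed_sn = sn
            self.ticks = 0

    def on_slow_tick(self):
        if self.timed_sn is not None:
            self.ticks += 1

    def on_ack(self, ack_no):
        """Return the measured RTT in slow ticks, or None if no sample ends."""
        if self.timed_sn is not None and ack_no > self.timed_sn:
            self.timed_sn = None
            return self.ticks + 1  # measured RTT is the tick count plus one
        return None
```

A caller would feed every non-None return value straight into the RTO computation.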
Given a valid RTT measurement, the RTO computation with Jacobson's algorithm is implemented as in WS95. The initial values are:
  • initial RTT: rtt = 0 seconds.
  • initial smoothed RTT: srtt = 0 seconds.
  • initial rttVar = 1.5 seconds, initial RTO = 3 seconds.
RTO is bounded between 1 and 64 seconds (per WS95).
  • TCP_RTT_SHIFT = 3, used as 1/8 factor when calculating srtt.
  • TCP_RTTVAR_SHIFT = 2, used as 1/4 factor when calculating rttVar.
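A floating-point sketch of the RTO update may help make the 1/8 and 1/4 gains concrete. It is not the fixed-point tick arithmetic of WS95 or of SSF TCP; the class name `RtoEstimator` is hypothetical, and seeding the estimators from the first sample (srtt = rtt, rttVar = rtt/2) is an assumption based on the BSD code described in WS95.

```python
MIN_RTO = 1.0   # seconds, lower RTO bound
MAX_RTO = 64.0  # seconds, upper RTO bound


class RtoEstimator:
    """Hypothetical floating-point version of the Jacobson RTO update."""

    def __init__(self):
        self.srtt = 0.0    # smoothed RTT; 0 means no sample yet
        self.rttvar = 1.5  # smoothed mean deviation
        self.rto = 3.0     # initial RTO

    def update(self, rtt):
        if self.srtt == 0.0:
            # First valid sample (assumed seeding, see lead-in).
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            err = rtt - self.srtt
            self.srtt += err / 8                         # TCP_RTT_SHIFT = 3
            self.rttvar += (abs(err) - self.rttvar) / 4  # TCP_RTTVAR_SHIFT = 2
        # RTO = srtt + 4 * rttvar, clamped to the configured bounds.
        self.rto = min(max(self.srtt + 4 * self.rttvar, MIN_RTO), MAX_RTO)
        return self.rto
```

With two consecutive 1-second samples, the deviation term shrinks on the second update, so the RTO moves toward the measured RTT while staying above the 1-second floor.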

4. Send window management & ACK processing

General rule: when an ACK is received, first update all affected state variables, then send the full usable window of segments.

Send Sequence Space (in bytes):

1 - SNs sent and acknowledged: < snd_una
2 - SNs of sent but unacknowledged data: [snd_una, snd_max - 1]
3 - SNs allowed for new data transmission : [snd_max, snd_una + snd_wnd - 1]
4 - future SNs, not yet allowed for transmission
      1        2              3          4

    ------][---------][--------------][--------- increasing SN
            |          |               |
            |          |               |
         snd_una    snd_max     snd_una+snd_wnd

Terminology:

SN sequence number
snd_una oldest unacknowledged sequence number
snd_max highest sequence number sent. If snd_max = snd_una, all SNs sent were ACKed, and the sender is idle.
snd_nxt next sequence number to be sent (equal to snd_max or snd_una)
snd_wnd min(cwnd, rwnd)
seg_ack acknowledgment from the remote TCP session (next in-sequence SN rcv_nxt expected by the remote receiver)
seg_seq first sequence number of a segment
seg_len the number of data bytes in the segment

seg_seq + seg_len - 1 = last sequence number of a segment

RFC requirements/common choices implemented in SSF TCP
Adopted from WS95.

Acceptable ACK numbers always satisfy:

snd_una =< seg_ack =< snd_max
(original definitions in RFC 793 are obsoleted). For a retransmitted segment:
snd_nxt < snd_max
because when sending new data, snd_nxt = snd_max; but for a retransmission snd_nxt = snd_una. It's possible to have seg_ack > snd_nxt because of packet reordering.

In SSF TCP, the usable window definition below holds both for new data transmission and retransmission. The usable window size (current amount of data that may be sent) is:

D = snd_una + snd_wnd - snd_nxt
The rule for updating snd_wnd depends on the current sending state, such as Slow Start, Congestion Avoidance, etc., and on the TCP variant.
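As a small sketch of the usable window formula, the function names and the default MSS argument below are assumptions for illustration; the arithmetic follows the definition of D given above.

```python
def usable_window(snd_una, snd_nxt, cwnd, rwnd):
    """Bytes that may be sent now: D = snd_una + snd_wnd - snd_nxt."""
    snd_wnd = min(cwnd, rwnd)
    return snd_una + snd_wnd - snd_nxt


def segments_to_send(snd_una, snd_nxt, cwnd, rwnd, mss=1024):
    """Number of full-sized segments that fit in the usable window."""
    d = usable_window(snd_una, snd_nxt, cwnd, rwnd)
    return max(d, 0) // mss  # a zero or negative window sends nothing
```

For example, with snd_una = 1000, snd_nxt = 3048, cwnd = 4096 and rwnd = 16384, the usable window is 2048 bytes, which is room for two 1024-byte segments.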

TCP session sending states

In this section we consider only the connection ESTABLISHED session state. SSF TCP follows requirements from RFC 2581 and WS95; and in cases of differences between them, the implemented choice is noted.

It is convenient to restate the four classical congestion control algorithms and the corresponding ACK processing rules in terms of TCP session sending states.

In a TCP session, the sending states are:

  1. Slow Start
  2. Congestion Avoidance
  3. Fast Retransmission (Tahoe only)
  4. Fast Retransmission/Recovery (Reno only)
  5. idle

A sending state is defined in terms of the current values of ssthresh and duplicate ACK counter. A transition from one sending state to another may take place on either of the following events:

  • Arrival of a valid ACK
  • TCP session is idle and socket session pushes a message to send
  • retransmission timer fires

Initial sending state: when a TCP session enters the ESTABLISHED state after completing the 3-way handshake, its sending state is Slow Start with the initial value of ssthresh and cwnd = IW.

For each sending state, if the retransmission timer fires, reset ssthresh and cwnd:

ssthresh = max (flight_size/2, 2*SMSS)
    cwnd = LW = SMSS
then execute the RTT and related updates, transition to Slow Start, and send the segment with SN snd_una. Following WS95, SSF TCP uses
flight_size = snd_wnd = min(cwnd, rwnd)

Note: In SSF TCP the value of duplicate ACK threshold is set to 3 (RFC 2581). "Get 3 dup ACKs" means receive 4 consecutive, identical ACKs without any other intervening packets in-between.

Identification of a duplicate ACK:

A received TCP packet is a dup ACK if all of the following apply:

  • ACKno = snd_una
  • seg_len = 0
  • AdvertisedWnd is the same as last received (not an update)
  • There is un-acked data in flight (retransmit timer is running)

If a received TCP packet does not satisfy all of the above tests, reset the dup ACK counter to zero.
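The four tests and the counter reset can be written as a predicate plus a counter update. The `AckInfo` record and the function names are hypothetical; only the conditions themselves come from the list above.

```python
from dataclasses import dataclass


@dataclass
class AckInfo:
    ack_no: int          # ACKno field of the received packet
    seg_len: int         # number of data bytes carried by the packet
    advertised_wnd: int  # AdvertisedWnd field of the received packet


def is_dup_ack(pkt, snd_una, last_advertised_wnd, rexmit_timer_running):
    """True only if all four duplicate-ACK conditions hold."""
    return (pkt.ack_no == snd_una
            and pkt.seg_len == 0
            and pkt.advertised_wnd == last_advertised_wnd
            and rexmit_timer_running)


def update_dup_count(count, pkt, snd_una, last_advertised_wnd,
                     rexmit_timer_running):
    """Increment the counter on a dup ACK, otherwise reset it to zero."""
    if is_dup_ack(pkt, snd_una, last_advertised_wnd, rexmit_timer_running):
        return count + 1
    return 0
```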

Slow Start (all variants):

receive a new data ACK
  • if cwnd =< ssthresh
    1. cwnd += SMSS
    2. send the updated usable window of data.
  • else if cwnd > ssthresh, transition to Congestion Avoidance ACK processing
receive a duplicate ACK
  • if the number of consecutive duplicate ACKs is less than 3, do nothing
  • else if the number of consecutive duplicate ACKs is equal to 3, transition to FastRexmit in Fast Retransmission (Tahoe) or Fast Retransmission/Recovery (Reno) state.

Congestion Avoidance (all variants):

receive a new data ACK
  1. cwnd += SMSS*SMSS/cwnd
  2. send the updated usable window of data.
receive a duplicate ACK
  • if the number of consecutive duplicate ACKs is less than 3, do nothing
  • else if the number of consecutive duplicate ACKs is equal to 3, transition to FastRexmit in Fast Retransmission (Tahoe) or Fast Retransmission/Recovery (Reno) state
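The window growth on a new data ACK in the two states can be sketched in one function. The function name is hypothetical, and the integer division and the cap at 64*MSS (the MaxConWnd default) are assumptions of this sketch.

```python
def on_new_data_ack(cwnd, ssthresh, smss=1024, max_cwnd=64 * 1024):
    """Return the new cwnd (bytes) after one new data ACK."""
    if cwnd <= ssthresh:
        cwnd += smss                 # Slow Start: one SMSS per ACK
    else:
        cwnd += smss * smss // cwnd  # Congestion Avoidance: about SMSS per RTT
    return min(cwnd, max_cwnd)       # assumed cap, cf. MaxConWnd
```

In Slow Start each ACK adds a full segment, so the window roughly doubles per round trip; in Congestion Avoidance a full window of ACKs is needed to add about one segment.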

Note: In SSF TCP Reno the extra additive term floor(MSS/8) when increasing the congestion window:

cwnd += SMSS*SMSS/cwnd + MSS/8

may be used or be commented out. Its use in the BSD Reno implementation is considered a bug, and is ruled out in RFC 2581.

Fast Retransmit (generic Tahoe):

FastRexmit
  1. Reset ssthresh and cwnd:
    ssthresh = max (flight_size/2, 2*SMSS)
        cwnd = SMSS
     
  2. retransmit one segment with SN given by 3 dup ACKs
Receive a duplicate ACK: do nothing.
Receive a new data ACK: transition to Slow Start new data ACK processing.
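A minimal sketch of the Tahoe reset on the third duplicate ACK follows. The function name and the integer halving are assumptions; flight_size is taken as min(cwnd, rwnd), which is the SSF TCP choice stated earlier.

```python
def tahoe_fast_rexmit(cwnd, rwnd, smss=1024):
    """Return (ssthresh, cwnd) after the third duplicate ACK in generic Tahoe."""
    flight_size = min(cwnd, rwnd)               # SSF TCP choice, following WS95
    ssthresh = max(flight_size // 2, 2 * smss)  # never below two segments
    return ssthresh, smss                       # cwnd collapses to one segment
```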

Note: To execute Fast Retransmit, snd_nxt = snd_una. When exiting Fast Retransmit and entering Slow Start, there are two possible choices for the value of snd_nxt (not specified in RFCs):

  1. Set snd_nxt = snd_una + seg_len, then the fast retransmission trace is similar to a timeout-caused retransmission trace.
  2. Return to the value of snd_nxt immediately prior to Fast Retransmit.

SSF TCP Tahoe implements choice 1 for agreement with the ns-2 implementation.

Fast Retransmit/Fast Recovery (generic Reno):

FastRexmit
  1. Reset ssthresh:
    ssthresh = max (flight_size/2, 2*SMSS)
  2. retransmit one segment with SN given by 3 dup ACKs
  3. Set cwnd = ssthresh + 3*SMSS
Receive a duplicate ACK
  • cwnd += SMSS
  • send next segment if allowed by new min(cwnd, rwnd)
Receive a new data ACK
  1. Set cwnd = ssthresh
  2. Transition to Congestion Avoidance for new ACK processing
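The Reno rules above can be sketched as a small state holder. The class `RenoSender`, the `in_recovery` flag, and the returned action strings are hypothetical; the sketch tracks only the window variables and does not send packets.

```python
class RenoSender:
    """Hypothetical tracker of cwnd/ssthresh through Fast Retransmit/Recovery."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, cwnd, ssthresh, rwnd, smss=1024):
        self.cwnd, self.ssthresh, self.rwnd, self.smss = cwnd, ssthresh, rwnd, smss
        self.dup_acks = 0
        self.in_recovery = False

    def on_dup_ack(self):
        self.dup_acks += 1
        if self.in_recovery:
            self.cwnd += self.smss  # inflate the window by one segment
            return "send_if_window_allows"
        if self.dup_acks == self.DUP_ACK_THRESHOLD:
            flight_size = min(self.cwnd, self.rwnd)
            self.ssthresh = max(flight_size // 2, 2 * self.smss)
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_recovery = True
            return "retransmit"
        return "nothing"

    def on_new_data_ack(self):
        self.dup_acks = 0
        if self.in_recovery:
            self.cwnd = self.ssthresh  # deflate, continue in Congestion Avoidance
            self.in_recovery = False
```

As in the WS95 code, this sketch does not check that the new data ACK acknowledges the retransmitted segment.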

Note 1: RFC 2581 and S94, p. 312 state that a new data ACK in the above table "should be the ACK of the retransmission [FastRexmit], ... Additionally, this ACK should acknowledge all the intermediate segments sent between the lost packet and the receipt of the first duplicate ACK." However, such a test is not made in the WS95 source code, and we don't implement it either.

Note 2: To execute Fast Retransmit, snd_nxt = snd_una. When transitioning from the Fast Retransmit to the Fast Recovery phase, there are two possible choices for the next value of snd_nxt (not specified in RFCs):

  1. Set snd_nxt = snd_una + seg_len.
  2. Return to the value of snd_nxt immediately prior to Fast Retransmit.

SSF TCP Reno implements choice 2 for agreement with the BSD implementation in WS95.

Note 3: A new data packet can be sent only when the usable window size satisfies D > 0 (in SSF TCP only full-sized data segments are sent, thus D >= MSS), implying the condition:

 min(cwnd, rwnd) - (snd_nxt - snd_una) >= MSS
 

After Fast Retransmission snd_nxt returns to the immediately preceding value, usually snd_nxt = snd_max. Therefore, due to the reduction of cwnd after Fast Retransmission, the usable window size D may become zero or negative, preventing packet transmission during Fast Recovery until enough dup ACKs are received to open the window.
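A worked example of Note 3, under assumed numbers: the function below counts how many additional duplicate ACKs are needed after Fast Retransmit before one full-sized segment fits in the usable window. The function name and the chosen values are illustrative only.

```python
MSS = 1024


def dup_acks_needed(cwnd, rwnd, in_flight, mss=MSS):
    """Extra dup ACKs after Fast Retransmit until D >= MSS, or None if never."""
    ssthresh = max(min(cwnd, rwnd) // 2, 2 * mss)
    cwnd = ssthresh + 3 * mss           # window right after Fast Retransmit
    extra = 0
    # in_flight = snd_nxt - snd_una, which stays fixed while dup ACKs arrive
    while min(cwnd, rwnd) - in_flight < mss:
        if cwnd >= rwnd:
            return None                 # limited by rwnd; inflation cannot help
        cwnd += mss                     # each further dup ACK inflates cwnd
        extra += 1
    return extra
```

With cwnd = 16*MSS, a large rwnd, and a full window in flight, the window after Fast Retransmit is 8*MSS + 3*MSS = 11*MSS, so D = -5*MSS, and six further duplicate ACKs are needed before cwnd reaches 17*MSS and one segment may be sent.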

Note 4: A modification of the Fast Retransmit/Fast Recovery is defined in RFC 2582 and called NewReno. It deals much better with multiple losses per window. NewReno is currently not implemented in SSF TCP.

RFC requirements NOT implemented in SSF TCP
In SSF TCP only full-sized packets can be sent. If the usable window D < MSS, no packet is sent. (The packet will be sent when more data is available.)

5. Receive window management & ACK generation
Receive Sequence Space (in bytes, WS95 p. 809):
1 - sequence numbers received and acknowledged
2 - sequence numbers allowed for new reception
3 - future sequence numbers not yet allowed
      1                 2                3

    ------][-----------------------][--------- increasing SN
            |                        |
            |                        |
         rcv_nxt               rcv_adv = rcv_nxt+rcv_wnd

Terminology:

SN sequence number
rcv_nxt sequence number expected on next incoming segment
rcv_nxt + rcv_wnd -1 highest sequence number expected on an incoming segment
seg_seq first sequence number in incoming segment
seg_seq + seg_len - 1 last sequence number in incoming segment

A segment is judged to occupy a portion of valid receive sequence space if either of the following two conditions is true:

   rcv_nxt =< seg_seq < rcv_nxt + rcv_wnd

   rcv_nxt =< seg_seq + seg_len - 1 < rcv_nxt + rcv_wnd
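The two conditions translate directly into a predicate. The function name is hypothetical; the comparisons are those given above.

```python
def occupies_valid_space(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """True if the first or the last byte of the segment lies in the window."""
    first_ok = rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd
    last_ok = rcv_nxt <= seg_seq + seg_len - 1 < rcv_nxt + rcv_wnd
    return first_ok or last_ok
```

A segment that starts before rcv_nxt but overlaps the window passes through the second condition, while one that lies entirely before rcv_nxt or entirely beyond the right edge fails both.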
RFC requirements implemented in SSF TCP

General rule: when a segment is received, first update all affected state variables, then may generate an ACK.

ACK generation requirements (RFC 1122, 2581):

  1. Cumulative ACKs: Receiver always sends acknowledgment for the highest sequence number received in-sequence. In the ACK packet the acknowledgment number is set to rcv_nxt (the next expected sequence number in-order), and the sequence number is set to snd_nxt. The advertised window sent to the other side is rcv_wnd.
  2. When data is received with a sequence number outside of valid receive sequence space, the data will be discarded, and an ACK will be sent immediately.
  3. ACK should be generated for at least every second SMSS-sized segment.
  4. ACK must be generated no later than 500 ms after arrival of the first un-acked segment.
  5. Receiver must not generate more than one ACK per incoming segment, except if it updates rcv_wnd
  6. When data received is valid but out of order, the segment will be buffered and reordered. A duplicate ACK should be sent immediately.
  7. When a received segment fills a gap in sequence (all or a part of gap), ACK should be generated immediately.
  8. rcv_adv should be non-decreasing (RFC 793).

SSF TCP implementation: When the delayed-ack option is not set, SSF TCP generates an ACK for every segment received, immediately after segment reception.

Delayed-ack option

In SSF TCP the delayed-ack option is DML-configurable, and can be selected in tcpinit with:

delayed_ack true
  • When delayed-ack option is true, the ACK for in-sequence data can be delayed at most 200 ms (fast clock granularity). During this period, more than one arriving segment may be summarily acknowledged in a single ACK, and if sender has data to send, the pending acknowledgment may be sent along with the data (piggybacking). (In RFC1122, ACK can be delayed at most 500 ms).

SSF TCP implementation: Delayed ACKs are timed by a fast clock with DML-configurable period TCP_FAST_INTERVAL (default 200 ms). The following is implemented:

  • When data is received out of order, a duplicate ACK is generated immediately.
  • When 2 full-sized packets are received, an ACK is generated immediately.
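The delayed-ack decisions above can be sketched as follows. The class and method names are hypothetical, and the sketch makes simplifying assumptions: every in-order segment is full-sized, and out-of-order segments are not buffered.

```python
class DelayedAckReceiver:
    """Hypothetical receiver that decides when an ACK must go out."""

    def __init__(self, rcv_nxt=0):
        self.rcv_nxt = rcv_nxt
        self.unacked_segments = 0  # in-order segments not yet acknowledged

    def on_segment(self, seg_seq, seg_len):
        """Return True if an ACK must be sent immediately."""
        if seg_seq != self.rcv_nxt:
            return True                # out of order: immediate duplicate ACK
        self.rcv_nxt += seg_len
        self.unacked_segments += 1
        if self.unacked_segments >= 2:
            self.unacked_segments = 0  # second full-sized segment: ACK now
            return True
        return False                   # wait for the fast clock tick

    def on_fast_tick(self):
        """Return True if the fast clock tick flushes a pending ACK."""
        pending = self.unacked_segments > 0
        self.unacked_segments = 0
        return pending
```

Because the pending ACK is flushed on the next fast clock tick rather than after a fixed delay, the actual delay in this sketch is at most one TCP_FAST_INTERVAL.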
RFC requirements NOT implemented in SSF TCP
Avoidance of Silly Window Syndrome (SWS): avoid advancing the right receive window edge rcv_nxt + rcv_wnd in small increments.

In SSF TCP the SWS algorithm is not implemented. SSF TCP sends only full-sized packets (rcv_nxt increases by MSS), and all received data are immediately consumed by the receiver (rcv_wnd = receiver buffer size).

6. Packet loss identification and retransmission management
RFC requirements implemented in SSF TCP

Possible packet loss is identified at a sender either by a segment retransmission timer timeout, or by receipt of 3 consecutive duplicate acknowledgments.

This section repeats some of the rules presented in other sections.

Retransmission timer timeout:

The following sequence of steps is taken when a retransmission timeout occurs. It includes Karn's algorithm.

  1. The oldest unacknowledged data segment will be retransmitted: snd_nxt is set to snd_una
  2. Increment by one the retransmission shift count; if the value exceeds 12 (maximum retransmission shift), drop the connection. In SSF TCP, each segment (SendItem) in the SendQueue maintains a variable counting the number of times this segment was retransmitted.
  3. Set RTO to 2 times its previous value, but no more than maxRTO.
  4. If a segment has been already retransmitted more than 3 times (one-fourth of the max number of retransmission attempts), smoothed RTT (srtt) is assumed worthless, and to force the use of measured RTT at a next opportunity, set srtt to zero while cleverly increasing rttVar so that RTO will have the same value as in 3. above in case a retransmission timeout occurs again (WS95 p. 843).
  5. If RTT measurement is running for some previously sent segment, cancel it (set RTT measurement counter to 0).
  6. The counter of duplicate ACKs is set to 0.
  7. Reset ssthresh and cwnd:
    ssthresh = max (flight_size/2, 2*MSS)
        cwnd = LW = MSS
    
    where, following WS95, SSF TCP uses
    flight_size = min(cwnd, rwnd)
  8. Retransmit the oldest unacknowledged data (snd_nxt = snd_una). Do not start the RTT measurement.
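The timeout steps can be collected in one handler, sketched below under stated assumptions: the `SenderState` record and its field names are hypothetical, a single per-connection shift counter stands in for the per-segment counters of SSF TCP, and step 4 (discarding srtt after repeated retransmissions) is omitted.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    snd_una: int
    snd_nxt: int
    cwnd: int
    rwnd: int
    ssthresh: int = 64 * 1024
    rto: float = 3.0
    rexmit_shift: int = 0
    dup_acks: int = 0
    rtt_timing: bool = False
    mss: int = 1024


def on_rexmit_timeout(s, max_rexmit_times=12, max_rto=64.0):
    """Apply the timeout steps; return False if the connection is dropped."""
    s.rexmit_shift += 1                            # step 2
    if s.rexmit_shift > max_rexmit_times:
        return False
    s.rto = min(2 * s.rto, max_rto)                # step 3
    s.rtt_timing = False                           # step 5 (Karn's algorithm)
    s.dup_acks = 0                                 # step 6
    flight_size = min(s.cwnd, s.rwnd)              # step 7, WS95 choice
    s.ssthresh = max(flight_size // 2, 2 * s.mss)
    s.cwnd = s.mss
    s.snd_nxt = s.snd_una                          # steps 1 and 8
    return True
```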

Duplicate ACKs

In SSF TCP the value of duplicate ACK threshold is set to 3 (RFC 2581).

Processing of duplicate ACKs depends on the sending state; see section 4, "Send window management & ACK processing".

Identification of a duplicate ACK:

A received TCP packet is a dup ACK if all of the following apply:

  • ACKno = snd_una
  • seg_len = 0
  • AdvertisedWnd is the same as last received (not an update)
  • There is un-acked data in flight (retransmit timer is running)

If a received TCP packet does not satisfy all of the above tests, reset the dup ACK counter to zero.

7. Opening a TCP connection (session)
RFC requirements implemented in SSF TCP
Sequence number synchronization (3-way handshake):
  1. Server side passively opens a tcpSession listening on the given port when the simulator starts (init()).
  2. When client side actively opens a connection it sends a SYN segment.
  3. The server side tcpSessionMaster will first check whether a passive open tcpSession listening on the required port number exists, or else the packet is dropped. If it exists, it creates a new tcpSession in SYN_RECEIVED state to communicate with the client, and the server side tcpSession will send SYN+ACK as reply.
  4. The client side tcpSession will send an ACK after receiving SYN+ACK and change the state from SYN_SENT to ESTABLISHED.
  5. After the server side tcpSession receives the ACK, it changes its state to ESTABLISHED. The 3-way handshake is completed.
Simultaneous open:
  1. Both sides send a SYN and open the sender part of connection.
  2. Both sides receive a SYN at state SYN_SENT.
  3. Both sides send the SYN+ACK and change connection state to SYN_RECEIVED.
  4. Both sides receive the SYN+ACK and change state to ESTABLISHED.
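Both opening sequences can be summarized as a per-session transition table. The event names and the helper functions are hypothetical, and the table simplifies one detail: in SSF TCP the listening session stays in LISTEN and a new tcpSession is created in SYN_RECEIVED, whereas here a single session is followed through the states.

```python
# (state, event) -> (next state, segment sent in response or None)
TRANSITIONS = {
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "recv_SYN"): ("SYN_RECEIVED", "SYN+ACK"),
    ("SYN_SENT", "recv_SYN+ACK"): ("ESTABLISHED", "ACK"),
    ("SYN_RECEIVED", "recv_ACK"): ("ESTABLISHED", None),
    # Simultaneous open: a SYN arrives while this side is in SYN_SENT.
    ("SYN_SENT", "recv_SYN"): ("SYN_RECEIVED", "SYN+ACK"),
    ("SYN_RECEIVED", "recv_SYN+ACK"): ("ESTABLISHED", None),
}


def step(state, event):
    """Return (next_state, segment_to_send) for one event."""
    return TRANSITIONS[(state, event)]


def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state, _ = step(state, event)
    return state
```

For instance, `run("CLOSED", ["active_open", "recv_SYN", "recv_SYN+ACK"])` follows one side of a simultaneous open.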

8. Closing a connection
RFC requirements implemented in SSF TCP
In SSF TCP the closing process is compliant with the TCP state transition diagram in reference [WS95], which is slightly different from RFC793.

9. TCP Connection States
RFC requirements implemented in SSF TCP

SSF TCP implements all states and transitions. It has complete opening and closing phases including half-closing.

LISTEN waiting for a connection request from any remote TCP and port.
SYN-SENT waiting for a matching connection request after having sent a connection request.
SYN-RECEIVED waiting for a confirming connection request acknowledgment after having both received and sent a connection request.
ESTABLISHED an open connection, data received can be delivered to the user. The normal state for the data transfer phase of the connection.
FIN-WAIT-1 waiting for a connection termination request from the remote TCP, or an acknowledgment of the connection termination request previously sent.
FIN-WAIT-2 waiting for a connection termination request from the remote TCP.
CLOSE-WAIT waiting for a connection termination request from the local user.
CLOSING waiting for a connection termination request acknowledgment from the remote TCP.
LAST-ACK waiting for an acknowledgment of the connection termination request previously sent to the remote TCP (which includes an acknowledgment of its connection termination request).
TIME-WAIT waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request.
CLOSED no connection state.

SSF Implementation of TCP Variants

Distinct TCP variants can be selected from DML by setting the appropriate option attributes.

SSF.OS.TCP currently supports two TCP option attributes, delayed_ack and fast_recovery.

To select TCP Tahoe from DML, in tcpinit use:

fast_recovery false

To select TCP Reno from DML, in tcpinit use:

fast_recovery true

Both in Tahoe and Reno the delayed ACK option can be selected in tcpinit with

delayed_ack true

A note about the meaning of "TCP variants"

SSF TCP uses the names "Tahoe" and "Reno" in the sense of generic behavior, not implying that the behavior is identical to the original BSD TCP implementations known under these names.

Tahoe includes Slow Start, Congestion Avoidance, Fast Retransmission, but not Fast Recovery.

Reno extends Tahoe by the addition of Fast Recovery. The extra additive term floor(MSS/8) when increasing the congestion window during congestion avoidance may be used or be commented out. Its use in TCP implementations is considered a bug, and is ruled out in RFC 2581.


About this document

The SSF TCP pages and content were created by Hongbo Liu and Andy Ogielski with partial support from AT&T Labs-Research.
Last updated June 6, 2000.
Entire Contents Copyright 1998, 1999, 2000 SSF Research Network. All rights reserved.