(rev. 01/16/2008) 
 
Notes On Chapter Twenty-Five
-- TCP: Reliable Transport Service
 
-  25.1 Introduction  
     
     -  This chapter reviews the services provided by TCP and shows
	  how TCP achieves reliable data delivery by using (unreliable) IP.
	  
     
 
 -  25.2 The Need For Reliable Transport 
     
     -  For the "convenience" of application code, 
	  TCP/IP software must provide reliable network transport. 
	  
      -   If not, then each application program would be
	  responsible  for doing things like sending retransmissions
	  and checking for duplicate packets.
     
 
 
 -  25.3 The Transmission Control Protocol  
     
     -   TCP is the transport-level protocol
	  that provides reliability.
     
 
 
 -  25.4 The Service TCP Provides To Applications  
      
     -   TCP provides a service analogous to telephone
	  service:
          -  Connection Orientation
          -  Point-To-Point Communication (a TCP connection has
	       exactly two endpoints)
          -  Complete Reliability ("guaranteed" in-order delivery)
          -  Full Duplex Communication (supports simultaneous
	       communication in both directions, with input and
	       output buffering at both ends)
          -  Stream Interface (applications can send any number of bytes
	       across the connection at any time.  There is no need to
	       send packet-sized pieces, and no guarantee that data will be
	       delivered in the same-sized pieces that were sent.)
          -  Reliable Connection Startup (free from replay
	       problems)
          -  Graceful Connection Shutdown (a host can shut down
	       a connection at any time, and TCP guarantees to
	       deliver the data that is still "in the pipeline.")
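One practical consequence of the stream interface is that a receiver that needs a fixed number of bytes has to keep reading until it has them all. The following is a minimal Python sketch of that idea, not a definitive implementation. The helper name recv_exact is made up for illustration, and socket.socketpair() is used only as a local stand-in for a real TCP connection (it returns two connected stream sockets on the same machine).

```python
import socket

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket (hypothetical helper)."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer bytes than requested
        if not chunk:                  # empty result: the peer closed the stream
            raise ConnectionError("stream closed before n bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Assumption: socketpair() stands in for an established TCP connection.
a, b = socket.socketpair()
a.sendall(b"hel")              # the sender writes two pieces ...
a.sendall(b"lo, world")
message = recv_exact(b, 12)    # ... the receiver asks for 12 bytes in total
a.close()
b.close()
```

The receiver never learns where the sender's two writes began or ended; it only sees a sequence of bytes.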
          
 
      
 -  25.5 End-To-End Service And Datagrams  
     
     -  TCP allows a process (a running program) on one host to
	  establish a virtual connection
	  with a process on another host.
     -  TCP uses IP software as a virtual network.
     -  IP in turn uses the physical network for transport.
     -  TCP operates at a level at which it has (and needs) no
	  knowledge of the underlying physical network.
     
 
 
 -  25.6 Achieving Reliability  
     
     -   The main reliability problems that TCP
	  has to deal with are:
          -  packet loss,
          -  packet duplication,
          -  packet delay,
          -  replay,
          -  out-of-order packets, and
          -  host crash/reboot.
	  
 
      
 
 -  25.7 Packet Loss And Retransmission  
      
     -  The main tools for achieving reliable transport
	  are retransmission and acknowledgment.
     -  Problem: How long should TCP wait for an acknowledgment before
          retransmitting?
      -  If TCP waits too long,  it wastes time
	  and data throughput suffers.  
      -   If TCP does not wait long enough,  then
	  it wastes bandwidth sending unnecessary duplicate packets.  This too
	  tends to reduce throughput since there is a limited amount of
	  bandwidth being shared by all hosts communicating over a link.
	  
      -   The time-out period has to be a function of the
	  path delay between the sender and the receiver.
	  Roughly speaking, TCP should wait as long as it normally takes
	  for an ACK to come back, but not much longer.
      -   Path delay is not constant.  It varies
	  according to the medium, the distance, and the amount of congestion.
	  
      -  Delay due to congestion is very  volatile  -- it can increase
	  or decrease by a factor of 10 in a few thousandths of a second.
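The interplay of acknowledgment and retransmission can be sketched with a toy "stop-and-wait" sender that transmits one packet at a time and resends it whenever the timer runs out. This is only an illustration under simplifying assumptions: the function name send_reliably and the scripted list of losses are made up, there is no real clock, and real TCP keeps many packets in flight at once.

```python
def send_reliably(packets, drops, max_tries=5):
    """Stop-and-wait sketch: resend each packet until it is acknowledged.

    drops is a scripted sequence of booleans (True means this transmission
    or its ACK is lost), standing in for an unreliable network.
    """
    drops = iter(drops)
    log = []                            # (packet, attempt) for every transmission
    for pkt in packets:
        for attempt in range(1, max_tries + 1):
            log.append((pkt, attempt))  # transmit, then start the timer
            lost = next(drops, False)   # script exhausted: treat as delivered
            if not lost:
                break                   # the ACK arrived before the timer expired
            # otherwise the timer expires and the loop retransmits
        else:
            raise TimeoutError(f"gave up on {pkt!r}")
    return log

log = send_reliably(["p0", "p1", "p2"], drops=[False, True, True, False, False])
```

Here the second packet is transmitted three times before it gets through. When it was the ACK that was lost rather than the packet, the receiver sees a duplicate, which is one reason duplicate detection appears in the list of problems in 25.6.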
     
 
 
 -  25.8 Adaptive Retransmission  
     
     -   TCP monitors path delay and varies the timeout
	  duration accordingly.   
      -  The current timeout is a linear combination of the estimated
	  average round-trip delay and the estimated variance of that delay.
      -  This timeout "statistic" is believed to do a good job of making TCP
	  appropriately responsive to changing delays.
      -  The general topic of Internet congestion control is an area of
          active research.
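The "linear combination" mentioned above can be sketched as follows. This is a hedged illustration, not the code of any particular TCP implementation. The constants (1/8, 1/4, and 4) are the ones given in RFC 6298, which these notes do not cite. The class and attribute names are made up. RFC 6298 tracks a smoothed mean deviation rather than a true variance, and the sketch leaves out that RFC's minimum timeout and clock-granularity terms.

```python
class RtoEstimator:
    """Smoothed round-trip time and deviation tracker (illustrative sketch)."""

    ALPHA = 1 / 8   # gain for the smoothed round-trip time
    BETA = 1 / 4    # gain for the deviation estimate
    K = 4           # weight of the deviation term in the timeout

    def __init__(self):
        self.srtt = None     # smoothed round-trip time, in seconds
        self.rttvar = None   # smoothed deviation of the round-trip time

    def update(self, sample):
        """Fold in one round-trip measurement and return the new timeout."""
        if self.srtt is None:            # first measurement
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # the deviation is updated first, using the old smoothed value
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + self.K * self.rttvar

est = RtoEstimator()
timeouts = [est.update(r) for r in (0.100, 0.100, 0.300)]
```

With these three samples the timeout goes from 0.3 s to 0.25 s while the delay is steady, then jumps to about 0.44 s when one slow sample arrives. The deviation term is what lets the timeout react quickly to volatile delay.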
     
 
 -  25.9 Comparison Of Retransmission Times  
     
     -  General Time-out Goal:  Wait only long enough to
	  become 'pretty sure' that the packet has been lost.
      
 
 -  25.10 Buffers, Flow Control, And Windows  
     
     -  The receiver advertises how much space remains in its input buffer.
	  That amount is "the window."
     -  The receiver sends a window advertisement along with
	  every ACK it sends.
     -  If the sender is too fast, the receiver will soon advertise a
	  zero window.  The sender then stops sending until it gets a
	  positive window advertisement.
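The window mechanism can be sketched as a small simulation. Everything here is an assumption made for illustration: the Receiver class, the 3000-byte buffer, the 1000-byte chunks, and the idea that the application drains 2000 bytes whenever the sender stalls. Real TCP counts the window in bytes as well, but the timing of reads and advertisements is far less tidy.

```python
class Receiver:
    """Receiver with a fixed-size input buffer (illustrative sketch)."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffered = 0              # bytes waiting for the application

    def window(self):
        return self.buffer_size - self.buffered

    def accept(self, nbytes):
        """Store arriving bytes; return the window advertised with the ACK."""
        if nbytes > self.window():
            raise ValueError("sender overran the advertised window")
        self.buffered += nbytes
        return self.window()

    def app_read(self, nbytes):
        """The application consumes data, which frees buffer space."""
        self.buffered -= min(nbytes, self.buffered)
        return self.window()

rx = Receiver(buffer_size=3000)
window = rx.window()
adverts = []                         # every advertisement the sender sees
to_send = 5000                       # bytes the sender wants to transfer
while to_send > 0:
    if window == 0:                  # zero window: the sender must pause
        window = rx.app_read(2000)   # the application drains some data
        adverts.append(window)
        continue
    chunk = min(1000, window, to_send)
    window = rx.accept(chunk)        # each ACK carries a fresh advertisement
    adverts.append(window)
    to_send -= chunk
```

The recorded advertisements shrink to zero, recover once the application reads, and shrink again, which is the stop-and-resume behavior described above.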
     
 
 -  25.11 Three-Way Handshake 
 
     
     -   To establish a TCP connection, hosts
	  exchange a total of three datagrams in
	  accordance with the rules of the TCP protocol.
     -   This interchange is called a three-way
	  handshake:
          -  The host that wants to establish the connection begins by
	       sending a packet ("Can we talk?"),
          -  then the intended recipient sends a packet back (either
	       "Sure, go ahead" or "No, we can't"),
          -  and then the first host replies with a third packet ("OK").
          
 
 
      -  TCP makes sure that these packets arrive reliably: a handshake
	  packet that goes unacknowledged is retransmitted.
          
      -  When hosts terminate a connection, they also use a three-way
	  handshake:
          -  "I'm going to hang up now."
          -  "OK, goodbye."
          -  "Goodbye."
          
 
  
      -  RFC 793 explains that there is a unique
	  initial sequence number and that
	  every octet is numbered up from that base.
	  Here is some language from RFC 793 that
	  helps explain how sequence numbers are chosen and used:
          "The sequence number of the first octet of data in a segment is
	  transmitted with that segment and is called the segment sequence
	  number.  Segments also carry an acknowledgment number which is the
	  sequence number of the next expected data octet of transmissions in
	  the reverse direction. ... " 
          The sequence number field of a segment contains "... [t]he sequence
	  number of the first data octet in this segment (except when SYN is
	  present). If SYN is present the sequence number is the initial
	  sequence number (ISN) and the first data octet is ISN+1. ..."
	  
          "... To avoid confusion we must prevent segments from one
	  incarnation of a connection from being used while the same sequence
	  numbers may still be present in the network from an earlier
	  incarnation. We want to assure this, even if a TCP crashes and loses
	  all knowledge of the sequence numbers it has been using. When new
	  connections are created, an initial sequence number (ISN) generator
	  is employed which selects a new 32-bit ISN. The generator is bound
	  to a (possibly fictitious) 32-bit clock whose low order bit is
	  incremented roughly every 4 microseconds. Thus, the ISN cycles
	  approximately every 4.55 hours. Since we assume that segments will
	  stay in the network no more than the Maximum Segment Lifetime (MSL)
	  and that the MSL is less than 4.55 hours we can reasonably assume
	  that ISN's will be unique."  
          See also CERT
	  Advisory CA-2001-09, "Statistical Weaknesses in TCP/IP Initial
	  Sequence Numbers," for more information on how TCP software
	  generates initial sequence numbers, and the associated security
	  problems.
           
      -  Language in the RFC explains that the handshake
	  that establishes a TCP connection goes like this:
	  1)  A --> B  SYN my sequence number is X
	  2)  A <-- B  ACK your sequence number is X
	  3)  A <-- B  SYN my sequence number is Y
	  4)  A --> B  ACK your sequence number is Y
	  
 
      -  Since steps 2 and 3 above can be combined, we get a three-way
          handshake that assigns both a "send" sequence number and a
	  "receive" sequence number.
     
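The sequence-number bookkeeping of the opening handshake can be sketched as below. This only builds the three segments as Python dictionaries; the function name and dictionary keys are made up for illustration. One detail worth noticing: because the SYN itself occupies one sequence number (the first data octet is ISN+1, as the RFC text quoted above says), the acknowledgment number sent in reply to sequence number X is X+1. Sequence numbers are 32 bits wide, so the arithmetic wraps around.

```python
def three_way_handshake(isn_a, isn_b):
    """Return the three segments of a connection open as plain dicts.

    isn_a and isn_b are the initial sequence numbers (X and Y above).
    """
    mod = 2 ** 32                                  # 32-bit sequence space
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"}, "seq": isn_b,
               "ack": (isn_a + 1) % mod}           # B's ACK and SYN combined
    ack = {"from": "A", "flags": {"ACK"}, "seq": (isn_a + 1) % mod,
           "ack": (isn_b + 1) % mod}
    return [syn, syn_ack, ack]

# Y is chosen at the top of the range to show the wrap-around.
segments = three_way_handshake(isn_a=100, isn_b=2 ** 32 - 1)
```

After the exchange, A expects B's first data octet at sequence number 0 (the value after 2^32 - 1), and B expects A's first data octet at 101.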
 
 
 -  25.12 Congestion Control  
     
     -  Problem:  Congestion causes delay, which
	  triggers retransmission, which can lead to  worse 
	  congestion!  
      -  TCP avoids excessive retransmission.  
      -   When a retransmission timer expires, TCP sends just one
	  packet.  If it receives an ACK in time, it sends two packets.  If
	  these are ACK'd then TCP sends four more packets.
      -  TCP ramps up exponentially as indicated above until it sends a group
	  of W/2 packets, where W is the current window size.  It then cuts
	  the rate of increase and slowly tries to ramp up to sending
	  W packets at a time.
      -  This quick back-off and (relatively) slow
	  start-up tends to alleviate congestion without wasting
	  much bandwidth during the ramp-up period.
	  
      -  RFC 5033 summarizes current thinking on Internet congestion
	  control and best practices for handling congestion.  See also
	  RFC 2914.
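The ramp-up described in this section can be sketched as a list of group sizes, one per round trip. This follows the simplified scheme in these notes, not the full algorithm of any real TCP. The function name and the variable names cwnd and threshold are chosen for illustration, every group is assumed to be acknowledged in time, and the "slow" phase is assumed to grow by one packet per round (the notes say only that the increase is slow).

```python
def ramp_up(window, rounds):
    """Packets sent per round after a timeout, per the scheme in the notes.

    window is W, the current window size in packets.
    """
    threshold = max(window // 2, 1)            # switch-over point, W/2
    sizes = []
    cwnd = 1                                   # restart with a single packet
    for _ in range(rounds):
        sizes.append(cwnd)
        if cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)    # exponential phase: double
        elif cwnd < window:
            cwnd += 1                          # slower phase: one more per round
    return sizes

schedule = ramp_up(window=16, rounds=10)
```

With W = 16 the sender reaches 8 packets per round after three doublings and then creeps upward one packet at a time.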
     
 
 
 -  25.13 TCP Segment Format  
     
     -  There's just one kind of TCP 'packet', regardless of whether it is
	  used to open a connection, close one, carry data, or carry an
	  acknowledgment.  
      -  TCP packets are called segments (not to be confused with
	   fragments).  
      -  The layout of the TCP segment is shown at right.
      -   All in one segment, TCP can send:
	  -  outgoing data,
	  -  an ACK for incoming data, and
	  -  a window advertisement for the size of the input buffer.
	  
 
  
      -  This can make it a little tricky to understand
	  the layout of the segment, because some fields relate to the
	  outgoing data stream, while others relate to the data stream coming
	  in from the opposite direction.
      -  If X is sending a segment to Y, X uses the ACKNOWLEDGEMENT NUMBER
	  and WINDOW to let Y know how much data X has received from Y, and
	  how much buffer space X has left to store data coming in from Y.
	  
      -  The ACKNOWLEDGEMENT NUMBER is the sequence number of the next
	  in-order (incoming) octet X expects from Y.
      -  The SEQUENCE NUMBER field of the segment contains the sequence
	  number of the first octet of (outgoing) data in the segment.
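To make the field layout concrete, here is a hedged Python sketch that packs and unpacks the fixed 20-byte part of a TCP header with the standard struct module. The field order and widths follow RFC 793. The function names, dictionary keys, and port numbers are made up for illustration, options are not handled, and the checksum is simply left as zero, so this is not a segment a real TCP would accept.

```python
import struct

# Fixed 20-byte TCP header in network byte order: source port, destination
# port, sequence number, acknowledgment number, header length plus flags,
# window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a header with no options; the checksum is left as zero."""
    offset_words = TCP_HEADER.size // 4        # header length in 32-bit words
    bits = 0
    for name in flags:
        bits |= FLAG_BITS[name]
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           (offset_words << 12) | bits, window, 0, 0)

def parse_header(raw):
    """Unpack the fixed header fields into a dict."""
    (src, dst, seq, ack, off_flags,
     window, _checksum, _urgent) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    flags = {name for name, bit in FLAG_BITS.items() if off_flags & bit}
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "header_len": (off_flags >> 12) * 4, "flags": flags,
            "window": window}

# X sends Y a segment: X's own data starts at octet 1001, X expects
# octet 5001 next from Y, and X has 4096 bytes of buffer space left.
raw = build_header(49152, 80, seq=1001, ack=5001,
                   flags={"ACK", "PSH"}, window=4096)
fields = parse_header(raw)
```

The seq field describes the outgoing stream (X to Y), while ack and window both describe the incoming stream (Y to X), which is the mix of directions that makes the layout tricky.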
	  
	  
      
  
 -  25.14 Summary