(rev. 01/16/2008)
Notes On Chapter Twenty-Five
-- TCP: Reliable Transport Service
- 25.1 Introduction
- This chapter reviews services provided by TCP and shows
how TCP gets reliable data delivery by using (unreliable) IP.
- 25.2 The Need For Reliable Transport
- For the "convenience" of application code,
TCP/IP software must provide reliable network transport.
- If not, then each application program would be
responsible for doing things like sending retransmissions
and checking for duplicate packets.
- 25.3 The Transmission Control Protocol
- TCP is the transport level protocol
for providing reliability.
- 25.4 The Service TCP Provides To Applications
- TCP provides a service analogous to telephone
service:
- Connection Orientation
- Point-To-Point Communication (A TCP connection has
exactly two endpoints)
- Complete Reliability ("guaranteed" in-order delivery)
- Full Duplex Communication (supports simultaneous
communication in both directions, with input and
output buffering at both ends.)
- Stream Interface (applications can send any number of bytes
across the connection at any time. No worries about having to
send packet-sized pieces, and no guarantee that data will be
delivered in the same sized pieces that were sent.)
- Reliable Connection Startup (free from replay
problems)
- Graceful Connection Shutdown (a host can shut down
a connection any time and TCP guarantees to
deliver the data that is still "in the pipeline.")
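- The stream interface and graceful shutdown can be tried out on a single machine. The following is a minimal sketch, not part of the chapter: it assumes a loopback TCP connection on 127.0.0.1 is available, and the function name send_in_pieces is made up for illustration. The pieces are kept small so that everything fits in the socket buffers and one thread can play both ends.

```python
import socket


def send_in_pieces(pieces):
    """Send each byte string over a loopback TCP connection.

    Returns the list of chunks exactly as recv() handed them back.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))      # port 0: the OS picks a free port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sender:
            sender.connect(listener.getsockname())
            receiver, _ = listener.accept()
            with receiver:
                for piece in pieces:
                    sender.sendall(piece)    # piece sizes are chosen by the application
                # Done sending; data already queued is still delivered.
                sender.shutdown(socket.SHUT_WR)
                chunks = []
                while True:
                    chunk = receiver.recv(4096)
                    if not chunk:            # empty result: the sender has shut down
                        break
                    chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    received = send_in_pieces([b"abc", b"defgh", b"i"])
    print(b"".join(received))                # the bytes, in order
    print([len(c) for c in received])        # chunk sizes need not match 3, 5, 1
```

The bytes always arrive complete and in order, but the sizes of the received chunks are up to TCP and may differ from the sizes that were sent.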
- 25.5 End-To-End Service And Datagrams
- TCP allows a process (a running program) on one host to
establish a virtual connection
with a process on another host.
- TCP uses IP software as a virtual network.
- IP in turn uses the physical network for transport.
- TCP operates at a level at which it has (and needs) no
knowledge of the underlying physical network.
- 25.6 Achieving Reliability
- The main reliability problems that TCP
has to deal with are:
- packet loss,
- packet duplication,
- packet delay,
- replay,
- out-of-order packets, and
- host crash/reboot.
- 25.7 Packet Loss And Retransmission
- The main tools for achieving reliable transport
are retransmission and
acknowledgment.
- Problem: How long should TCP wait for an acknowledgement before
retransmitting?
- If TCP waits too long, it wastes time
and data throughput suffers.
- If TCP does not wait long enough, then
it wastes bandwidth sending unnecessary duplicate packets. This too
tends to reduce throughput since there is a limited amount of
bandwidth being shared by all hosts communicating over a link.
- The time-out period has to be a function of the
path delay between the sender and the receiver.
Roughly speaking, TCP should wait as long as it takes for an ACK,
but not much longer.
- Path delay is not constant. It varies
according to the medium, the distance, and the amount of congestion.
- Delay due to congestion is very volatile -- it can increase
or decrease by a factor of 10 in a few thousandths of a second.
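- How retransmission and acknowledgment work together can be sketched with a toy simulation. This is a simplified stop-and-wait model (one packet outstanding at a time), which is an assumption made for brevity and not how TCP actually paces its traffic. The function deliver and its scripted loss sets are hypothetical names. A lost data packet or a lost ACK both look the same to the sender: the timer expires and it retransmits. The receiver uses sequence numbers to discard the resulting duplicates.

```python
def deliver(messages, data_lost, ack_lost):
    """Transfer messages over a channel with scripted losses.

    data_lost / ack_lost: sets of transmission attempt numbers (from 0)
    whose data packet / acknowledgment is dropped by the network.
    Returns (what the receiving application saw, total transmissions).
    """
    delivered = []   # data handed to the receiving application
    expected = 0     # receiver side: next sequence number it will accept
    attempt = 0      # counts every transmission, including retransmissions
    for seq, payload in enumerate(messages):
        acked = False
        while not acked:
            this_attempt = attempt
            attempt += 1
            if this_attempt in data_lost:
                continue                 # no ACK comes back: time out, retransmit
            if seq == expected:
                delivered.append(payload)
                expected += 1
            # Otherwise it is a duplicate of data already delivered: discard it.
            # The receiver acknowledges in both cases.
            if this_attempt in ack_lost:
                continue                 # ACK lost: sender times out, retransmits
            acked = True
    return delivered, attempt


if __name__ == "__main__":
    print(deliver(["a", "b", "c"], data_lost={1}, ack_lost={3}))
```

With one lost data packet and one lost ACK, three messages need five transmissions, and the application still sees each message exactly once.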
- 25.8 Adaptive Retransmission
- TCP monitors path delay and varies the timeout
duration accordingly.
- The current timeout is a linear combination of the current estimate of the
average round-trip delay and the variance of that delay.
- This timeout "statistic" is believed to do a good job of making TCP
appropriately responsive to changing delays.
- The general topic of Internet congestion control is an area of
active research.
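- The adaptive timeout described in this section can be sketched as follows. The notes do not give constants or an update rule, so this sketch borrows the smoothing scheme standardized in RFC 6298 (gains of 1/8 and 1/4, timeout = smoothed RTT + 4 x variation). It tracks the mean deviation of the delay, not a true statistical variance, and it leaves out the minimum-timeout and clock-granularity rules of that RFC. The class name RtoEstimator is made up for illustration.

```python
class RtoEstimator:
    """Running estimate of the retransmission timeout, in seconds."""

    ALPHA = 1 / 8   # gain for the smoothed round-trip time
    BETA = 1 / 4    # gain for the round-trip time variation
    K = 4           # weight of the variation term in the timeout

    def __init__(self):
        self.srtt = None     # smoothed round-trip time
        self.rttvar = None   # smoothed deviation of the round-trip time

    def update(self, rtt):
        """Fold in one round-trip measurement and return the new timeout."""
        if self.srtt is None:
            # First measurement: start from the sample itself.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Update the variation first, using the old smoothed value.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.srtt + self.K * self.rttvar


if __name__ == "__main__":
    estimator = RtoEstimator()
    for sample in (0.100, 0.100, 0.500):
        print(round(estimator.update(sample), 4))
```

A single slow sample (0.5 s after two samples of 0.1 s) raises the timeout sharply, mostly through the variation term, which is what keeps TCP from retransmitting too early when delay becomes erratic.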
- 25.9 Comparison Of Retransmission Times
- General Time-out Goal: Wait only long enough to
become 'pretty sure' that the packet has been lost.
- 25.10 Buffers, Flow Control, And Windows
- Receiver advertises how much space remains in its in-buffer. The
amount is "the window."
- Receiver advertises size of remaining input buffer space along with
every ACK it sends.
- If the sender is too fast, the receiver will soon advertise a
zero window. Sender will then stop sending until it gets a
positive window advertisement.
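- The window mechanism above can be sketched with a small model. The names Receiver and send_burst and the byte counts are hypothetical, and the model ignores timing and loss. Every call to receive() returns the window that would be advertised with the ACK, and the sender stops as soon as that window reaches zero.

```python
class Receiver:
    """Receiver with a fixed-size input buffer, measured in bytes."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffered = 0            # bytes waiting for the application

    def window(self):
        """Remaining buffer space: the value advertised to the sender."""
        return self.buffer_size - self.buffered

    def receive(self, nbytes):
        """Store incoming data and return the window sent with the ACK."""
        if nbytes > self.window():
            raise ValueError("sender exceeded the advertised window")
        self.buffered += nbytes
        return self.window()

    def application_reads(self, nbytes):
        """The application consumes data, which reopens the window."""
        self.buffered -= min(nbytes, self.buffered)
        return self.window()


def send_burst(receiver, total, segment_size):
    """Send until the data runs out or the window closes; return bytes sent."""
    sent = 0
    window = receiver.window()
    while sent < total and window > 0:
        n = min(segment_size, window, total - sent)
        window = receiver.receive(n)     # each ACK carries a fresh advertisement
        sent += n
    return sent


if __name__ == "__main__":
    r = Receiver(buffer_size=2500)
    print(send_burst(r, total=6000, segment_size=1000), r.window())
    print(r.application_reads(2000))
    print(send_burst(r, total=3500, segment_size=1000), r.window())
```

The fast sender is throttled to the pace at which the receiving application empties the buffer, without any data being dropped.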
- 25.11 Three-Way Handshake
- To establish a TCP connection hosts
exchange a total of three datagrams in
accordance with the rules of the TCP protocol.
- This interchange is called a three-way
handshake:
- The host that wants to establish the connection begins by
sending a packet ("Can we talk?"),
- then the intended recipient sends a packet back (either "Sure, go ahead" or "No, we can't"),
- and then the first host replies with a third packet ("OK").
- TCP retransmits these packets when necessary, so that the handshake itself completes reliably.
- When hosts terminate a connection they also use a three-way
handshake:
- "I'm going to hang up now."
- "OK, goodbye."
- "Goodbye."
- RFC 793 explains that there is a unique
initial sequence number and that
every octet is numbered up from that base.
Here is some language from RFC 793 that
helps explain how sequence numbers are chosen and used:
"The sequence number of the first octet of data in a segment is
transmitted with that segment and is called the segment sequence
number. Segments also carry an acknowledgment number which is the
sequence number of the next expected data octet of transmissions in
the reverse direction. ... "
The sequence number field of a segment contains "... [t]he sequence
number of the first data octet in this segment (except when SYN is
present). If SYN is present the sequence number is the initial
sequence number (ISN) and the first data octet is ISN+1. ..."
"... To avoid confusion we must prevent segments from one
incarnation of a connection from being used while the same sequence
numbers may still be present in the network from an earlier
incarnation. We want to assure this, even if a TCP crashes and loses
all knowledge of the sequence numbers it has been using. When new
connections are created, an initial sequence number (ISN) generator
is employed which selects a new 32-bit ISN. The generator is bound
to a (possibly fictitious) 32-bit clock whose low order bit is
incremented roughly every 4 microseconds. Thus, the ISN cycles
approximately every 4.55 hours. Since we assume that segments will
stay in the network no more than the Maximum Segment Lifetime (MSL)
and that the MSL is less than 4.55 hours we can reasonably assume
that ISN's will be unique."
See also CERT Advisory CA-2001-09, "Statistical Weaknesses in TCP/IP
Initial Sequence Numbers," for more information on how TCP software
generates initial sequence numbers, and the associated security
problems.
- Language in the RFC explains that the handshake
that establishes a TCP connection goes like this:
- A --> B SYN my sequence number is X
- A <-- B ACK your sequence number is X
- A <-- B SYN my sequence number is Y
- A --> B ACK your sequence number is Y
- Since steps 2 and 3 above can be combined, we get a three-way
handshake that assigns both a "send" sequence number and a
"receive" sequence number.
- 25.12 Congestion Control
- Problem: Congestion causes delay, which
triggers retransmission, which can lead to worse
congestion!
- TCP avoids excessive retransmission.
- When a packet timer expires, TCP sends one
packet. If it receives an ACK in time, it sends two packets. If
these are ACK'd then TCP sends four more packets.
- TCP ramps up exponentially as indicated above until it sends a group
of W/2 packets, where W is the current window size. It then cuts the rate of increase and slowly
tries to ramp up to sending W packets at a time.
- This quick back off and (relatively) slow
startup tends to alleviate congestion while still avoiding
wasting excessive bandwidth during the ramp-up period.
- RFC 5033
summarizes current thinking on Internet congestion control and best
practices for handling congestion. See also
RFC 2914.
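- The ramp-up described in this section can be sketched as a list of group sizes. This follows the description in these notes, not a real TCP implementation: it assumes every group is acknowledged in time, counts in whole packets, and takes "slowly" to mean one additional packet per round. The function name ramp_up is made up for illustration.

```python
def ramp_up(window, rounds):
    """Group sizes (in packets) sent in successive rounds after a timeout.

    window: the current window size W, in packets.
    """
    threshold = max(window // 2, 1)   # W/2: where the exponential phase ends
    sizes = []
    size = 1                          # after a timeout, start with one packet
    for _ in range(rounds):
        sizes.append(size)
        if size < threshold:
            size = min(size * 2, threshold)   # exponential phase: double
        elif size < window:
            size += 1                         # slow phase: one more packet
        # Once size reaches W it stays there.
    return sizes


if __name__ == "__main__":
    print(ramp_up(window=16, rounds=10))
```

For W = 16 the group sizes are 1, 2, 4, 8 and then 9, 10, 11, and so on: the sender reaches half the window in a few rounds and approaches the full window cautiously.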
- 25.13 TCP Segment Format
- There's just one kind of TCP 'packet', regardless of whether it is
used to open a connection, close one, carry data, or carry an
acknowledgment.
- TCP packets are called segments (not to be confused with
fragments).
- The layout of the TCP segment is shown at right.
- All in one segment, TCP can send:
- outgoing data,
- an ACK for incoming data, and
- a window advertisement for the size of the input buffer.
- This can make it a little tricky to understand
the layout of the segment, because some fields relate to the
outgoing data stream, while others relate to the data stream coming
in from the opposite direction.
- If X is sending a segment to Y, X uses the ACKNOWLEDGEMENT NUMBER
and WINDOW to let Y know how much data X has received from Y, and
how much buffer space X has left to store data coming in from Y.
- The ACKNOWLEDGEMENT NUMBER is the sequence number of the next
in-order (incoming) octet X expects from Y.
- The SEQUENCE NUMBER field of the segment contains the sequence
number of the first octet of (outgoing) data in the segment.
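- The way one segment carries fields for both directions can be sketched by packing and unpacking the fixed 20-byte header. The field order and flag bits below follow RFC 793 and are not taken from these notes. The function names are hypothetical, options are omitted, and the checksum is left at zero instead of being computed.

```python
import struct

# Source port, destination port, sequence number, acknowledgment number,
# data offset + flags, window, checksum, urgent pointer (network byte order).
HEADER = struct.Struct("!HHIIHHHH")

FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def pack_header(src_port, dst_port, seq, ack, flags, window):
    """Build a 20-byte TCP header (no options, checksum not computed)."""
    data_offset = 5                               # header length in 32-bit words
    offset_and_flags = (data_offset << 12) | flags
    return HEADER.pack(src_port, dst_port, seq, ack,
                       offset_and_flags, window, 0, 0)


def unpack_header(raw):
    """Split the first 20 bytes of a segment into named fields."""
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = HEADER.unpack(raw[:HEADER.size])
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq,          # first octet of the data going out
        "ack": ack,          # next octet expected from the other side
        "data_offset": offset_and_flags >> 12,
        "flags": offset_and_flags & 0x3F,
        "window": window,    # buffer space left for incoming data
        "checksum": checksum, "urgent": urgent,
    }


if __name__ == "__main__":
    raw = pack_header(1234, 80, seq=1000, ack=5001, flags=ACK, window=4096)
    print(len(raw), unpack_header(raw))
```

Here X would be telling Y three things at once: its outgoing data starts at octet 1000, it has received everything from Y up to octet 5000, and it has room for 4096 more bytes.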
- 25.14 Summary