chapter 23 -- support protocols and technologies

(rev. May 01, 2015)

Notes On Chapter Twenty-Three -- Support Protocols and Technologies

23.0 Study Guide
- Understand what is meant by the term "address resolution".
- Understand why and when a host on the Internet needs to perform address resolution.
- Understand the ARP protocol. Be able to explain the steps that unfold when a host uses ARP; for example, be able to explain the steps in section 23.4 below.
- Know the rules in section 23.7 that hosts use regarding the caching of ARP information. As an example of the kind of question you should be able to answer, when a host receives an ARP broadcast, how does it decide whether or not to cache the address-binding of the sender?
- Know that ARP messages are encapsulated directly inside hardware frames.
- Know what ICMP stands for and what it is. What are some of the more well-known ICMP message types?
- Know that ICMP messages are encapsulated inside IP datagrams, which in turn are encapsulated inside hardware frames.
- Know what DHCP stands for and what it is used for.
- Know what NAT and NAPT stand for and what they do.
- Understand the difference between a NAT router and an Internet router.
- Understand that hosts behind a NAT router have private IP addresses (non-routable IP addresses).
23.1 Introduction
- Address Binding
- Error Reporting
- Bootstrapping
- Address Translation
23.2 Address Resolution
- In order to send an IP datagram over a real physical network, it's necessary to encapsulate it in a physical frame, and it is necessary to find out the physical (MAC) destination address and place it in the header of the physical frame.
- Often when host X wants to send an IP datagram to host Y on the same network, X knows the IP address of Y, but not the MAC address of Y.
- Discovery of the needed MAC address is called address resolution.
- This problem comes up each time that an Internet datagram is forwarded to the next hop toward its ultimate destination.
- The problem comes up any time a host X wants to send an IP datagram to a host Y on the same network. It's the same problem, whether or not X is a router, and whether or not Y is a router.
- However, a host on the Internet never needs to resolve MAC addresses of hosts that are not on the same network.
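The rule above (resolve only addresses on the same network) can be sketched as a small decision function. This is a minimal illustration, not a real IP stack: the function name `arp_target`, the /24 prefix, the router address 130.17.70.1 and the remote address 192.0.2.55 are all assumptions made for the example.

```python
import ipaddress

def arp_target(src_interface: str, dest_ip: str, default_router: str) -> str:
    """Return the IP address whose MAC address the sender must resolve.

    src_interface is the sender's address plus prefix length,
    e.g. "130.17.70.10/24" (a hypothetical configuration).
    """
    network = ipaddress.ip_interface(src_interface).network
    if ipaddress.ip_address(dest_ip) in network:
        # Destination is on the same network: resolve its MAC directly.
        return dest_ip
    # Destination is elsewhere: resolve the MAC of the next-hop router instead.
    return default_router

same_net = arp_target("130.17.70.10/24", "130.17.70.83", "130.17.70.1")
other_net = arp_target("130.17.70.10/24", "192.0.2.55", "130.17.70.1")
print(same_net, other_net)
```

In both cases the address being resolved belongs to a machine on the sender's own network, which is why MAC addresses of distant hosts are never needed.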
23.3 An Example of IPv4 Addresses
- As diagram 23.2 below illustrates, when X sends an IP datagram to R₁, the datagram must travel inside a network frame that has (MAC) source address 3A-12-C9 and (MAC) destination address 59-61-33.
- Similarly an IP datagram forwarded from R₁ to R₂ must travel inside a network frame with (MAC) source address 97-27-D3 and (MAC) destination address 8E-1A-7F.
23.4 The IPv4 Address Resolution Protocol (ARP)
- Under IPv4, the Address Resolution Protocol (ARP) is the most frequently used means of translating IP addresses into MAC addresses.
- The idea of ARP:
  - Host X knows the IP address of host Y.
  - Let's say the IP address of host Y is 130.17.70.83.
  - Host X needs to know the MAC address of host Y.
  - Host X broadcasts a message on the network to which it is directly connected saying: "Please help me find the MAC address corresponding to 130.17.70.83"
  - The broadcast is forwarded through hubs, bridges and switches.
  - Host Y receives a copy of the broadcast and replies directly to X: "Oh, I am 130.17.70.83. My MAC address is 0:3:ba:16:bc:c0."
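The exchange described above can be imitated with a toy simulation. It is only a sketch: the `Host` class and `arp_resolve` function are hypothetical names, the "network" is a Python list, and every address except Y's (taken from the example above) is made up.

```python
class Host:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac

    def on_arp_query(self, target_ip):
        """Answer only when the query names this host's own IP address."""
        if target_ip == self.ip:
            return self.mac
        return None

def arp_resolve(sender, target_ip, lan):
    """Deliver the broadcast to every other host; return the MAC from the reply, or None."""
    for host in lan:
        if host is not sender:
            reply = host.on_arp_query(target_ip)
            if reply is not None:
                return reply
    return None

x = Host("130.17.70.10", "0:3:ba:16:aa:01")   # X's addresses are made up
y = Host("130.17.70.83", "0:3:ba:16:bc:c0")   # Y's addresses come from the notes
z = Host("130.17.70.20", "0:3:ba:16:aa:02")   # a bystander, also made up
lan = [x, y, z]
print(arp_resolve(x, "130.17.70.83", lan))
```

Every host sees the query, but only the host that owns the requested IP address answers; a query for an address nobody owns simply goes unanswered.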
23.5 ARP Message Format
- ARP was designed so it could be used to perform many different kinds of address-translations.
- However, the reality is that ARP is used almost exclusively to obtain Ethernet addresses that correspond to IP addresses.
- The Fields of an ARP Message:
  - HARDWARE ADDRESS TYPE: type of hardware address used, e.g. 1 for Ethernet
  - PROTOCOL ADDRESS TYPE: type of protocol address used - e.g. 0x0800 for IPv4
  - HADDR LEN: tells how many octets in a hardware address
  - PADDR LEN: tells how many octets in a protocol address
  - OPERATION: tells whether this packet is a query or a response
  - SENDER HADDR: sender's hardware address
  - SENDER PADDR: sender's protocol address
  - TARGET HADDR: target's hardware address
  - TARGET PADDR: target's protocol address
- Be careful to understand the use of the TARGET fields.
- In an ARP query, when X is asking for the MAC address of Y, Y is the target.
- In the ARP response that Y sends to tell X Y's MAC address, X is the target.
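To make the field list concrete, here is a sketch that packs the fields, in the order listed above, into the 28-octet message used for Ethernet and IPv4. The helper names `mac_to_bytes` and `build_arp` are hypothetical, X's addresses are made up, and the OPERATION codes (1 for a query, 2 for a response) come from the ARP specification rather than from these notes.

```python
import socket
import struct

# Network byte order; field sizes assume Ethernet (6-octet) and IPv4 (4-octet) addresses.
ARP_FORMAT = "!HHBBH6s4s6s4s"

def mac_to_bytes(mac: str) -> bytes:
    """Convert a colon-separated MAC string such as 0:3:ba:16:bc:c0 to 6 octets."""
    return bytes(int(part, 16) for part in mac.split(":"))

def build_arp(operation, sender_mac, sender_ip, target_mac, target_ip) -> bytes:
    return struct.pack(
        ARP_FORMAT,
        1,                              # HARDWARE ADDRESS TYPE: Ethernet
        0x0800,                         # PROTOCOL ADDRESS TYPE: IPv4
        6,                              # HADDR LEN
        4,                              # PADDR LEN
        operation,                      # OPERATION: 1 = query, 2 = response
        mac_to_bytes(sender_mac),       # SENDER HADDR
        socket.inet_aton(sender_ip),    # SENDER PADDR
        mac_to_bytes(target_mac),       # TARGET HADDR
        socket.inet_aton(target_ip),    # TARGET PADDR
    )

X_MAC, X_IP = "0:3:ba:16:aa:01", "130.17.70.10"   # made-up addresses for X
Y_MAC, Y_IP = "0:3:ba:16:bc:c0", "130.17.70.83"   # Y's addresses from section 23.4

# Query: Y is the target, and its hardware address is not yet known (all zeros here).
query = build_arp(1, X_MAC, X_IP, "0:0:0:0:0:0", Y_IP)
# Response: the roles swap, so X is now the target.
response = build_arp(2, Y_MAC, Y_IP, X_MAC, X_IP)
print(len(query), len(response))
```

Note how the sender and target fields swap between the query and the response, matching the description of the TARGET fields above.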
23.6 ARP Encapsulation
- An ARP query or reply travels on a physical network encapsulated in a physical frame.
- Typically the frame has a type field that the sender uses to mark what kind of payload the frame carries.
- When an ARP query or reply travels in an Ethernet frame, the Ethernet header type field is set by the sender to 0x0806 to denote an ARP message.
- The receiver has to look at the OPERATION field in the ARP message to figure out whether it is a query or response.
- Note that this is a case where both the frame header and its payload (the ARP message) contain fields for hardware addresses.
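The two-level check described above (frame type first, then OPERATION) can be sketched as follows. This is a simplified illustration: it models only the 14-octet Ethernet header (destination, source, type) and ignores the preamble and CRC, the function names are hypothetical, and the ARP payload is a hand-built placeholder whose address fields are left as zeros.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806

def ethernet_frame(dest_mac: bytes, src_mac: bytes, ethertype: int, payload: bytes) -> bytes:
    """Prepend a 14-octet Ethernet header to the payload."""
    return struct.pack("!6s6sH", dest_mac, src_mac, ethertype) + payload

def payload_kind(frame: bytes) -> str:
    """Read the type field (octets 12-13 of the header) to classify the payload."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return {ETHERTYPE_ARP: "ARP", ETHERTYPE_IPV4: "IPv4"}.get(ethertype, "other")

def arp_operation(frame: bytes) -> int:
    """Read OPERATION, which sits 6 octets into the ARP message (after the header)."""
    (operation,) = struct.unpack("!H", frame[14 + 6:14 + 8])
    return operation

# Placeholder ARP query: fixed fields filled in, 20 octets of addresses zeroed.
arp_payload = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1) + bytes(20)
broadcast = b"\xff" * 6
src = bytes.fromhex("0003ba16bcc0")
frame = ethernet_frame(broadcast, src, ETHERTYPE_ARP, arp_payload)
print(payload_kind(frame), arp_operation(frame))
```

The frame header says only "this is ARP"; whether it is a query or a response is visible only after looking inside the payload.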
23.7 ARP Caching and Message Processing
- ARP includes measures to help optimize performance and resource utilization.
- In particular, these measures help ensure that not too much network bandwidth is used up by ARP requests or responses.
- Computers keep caches of address bindings - not indefinitely but until they expire or are replaced.
- If a computer has a valid binding in its ARP cache, it won't broadcast an ARP request for that binding.
- When host X broadcasts an ARP request for the MAC address of host Y:
  - Y puts X's protocol (e.g. IP) address and MAC address in its ARP cache.
  - Any other host on the local network, if it already has an ARP cache entry for X, updates the entry with the MAC address for X contained in the ARP request.
- The effect of the caching described above is that hosts that may soon need to send a packet to X get the required address bindings.
- On the other hand, hosts that aren't likely to need the bindings don't cache them. They have limited space in their ARP caches. Choosing not to cache some things helps prevent new but useless information from crowding out old but valuable information in ARP caches.
- Of course, if Y is the target of an ARP request, it sends an ARP response to X containing Y's MAC address.
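The caching rules in this section can be written out as one function that each host runs when it receives an ARP request. This is a sketch under simplifying assumptions: the cache is a plain dictionary from IP address to MAC address, entry expiry is left out, and the function name and all addresses other than Y's IP are made up.

```python
def process_arp_request(my_ip, cache, sender_ip, sender_mac, target_ip):
    """Apply the caching rules to one received ARP request.

    Returns True when this host is the target and should send a response.
    """
    if sender_ip in cache:
        # Any host that already has an entry for the sender refreshes it.
        cache[sender_ip] = sender_mac
    if target_ip == my_ip:
        # The target caches the sender's binding even if it had no entry before.
        cache[sender_ip] = sender_mac
        return True
    # A bystander with no existing entry leaves its cache untouched.
    return False

X_IP, X_MAC = "130.17.70.10", "0:3:ba:16:aa:01"    # sender X (made-up addresses)
y_cache = {}                                        # Y: the target, empty cache
z_cache = {X_IP: "0:0:0:0:0:1"}                     # Z: holds a stale entry for X
w_cache = {}                                        # W: has never talked to X

y_replies = process_arp_request("130.17.70.83", y_cache, X_IP, X_MAC, "130.17.70.83")
z_replies = process_arp_request("130.17.70.20", z_cache, X_IP, X_MAC, "130.17.70.83")
w_replies = process_arp_request("130.17.70.30", w_cache, X_IP, X_MAC, "130.17.70.83")
print(y_replies, z_replies, w_replies)
```

After the broadcast, Y and Z hold X's current binding while W's cache is still empty, which is exactly the selective behavior the rules are meant to produce.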
23.8 The Conceptual Address Boundary
- ARP is a conceptual boundary in the five-layer TCP/IP reference model.
- We can think of ARP as if it were at the top of the Link Layer (the network interface layer).
- Above the level of ARP, protocol software uses IP addresses.
- Below the level of ARP, protocol software and network hardware use MAC addresses.
23.9 Internet Control Message Protocol (ICMP)
- ICMP is used to report errors or information back to the sender of a datagram.
- There are different versions of ICMP for IPv4 and IPv6.
- IP and ICMP rely on each other - IP is used to transmit ICMP messages, and IP depends on ICMP for reporting IP errors.
- ICMP does not include "Checksum Error" messages because in that case the source address cannot be trusted. Packets with checksum errors are simply discarded by the receiver.
- ICMP is also used to obtain information - e.g. echo requests.
- ICMP Destination Unreachable messages inform a sender that no route could be found to the intended recipient.
23.10 ICMP Message Format and Encapsulation
- ICMP messages are sent inside IP datagrams - as payloads of IP datagrams.
- ICMPv4 messages are always encapsulated inside IPv4 datagrams, and ICMPv6 messages are always encapsulated inside IPv6 datagrams.
- The TCP/IP protocols do NOT call for sending an error message regarding a problem with the sending of an ICMP error message. This ensures that there will not be network congestion caused by error messages about error messages.
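As an illustration of what travels in the payload of the IP datagram, the sketch below builds an ICMPv4 echo request. The notes do not give the field layout, so the details here are taken from the ICMPv4 specification: an 8-octet header of type (8 for echo request), code, checksum, identifier and sequence number, with the standard Internet checksum computed over the whole ICMP message. The function names are hypothetical.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def icmp_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    # First pass: header with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    # Second pass: same header with the real checksum filled in.
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

message = icmp_echo_request(identifier=1, sequence=1, payload=b"ping")
print(len(message), internet_checksum(message))
```

A receiver can verify the message by running the same checksum over all of it: a correct message yields zero. The IP layer would then carry `message` as the payload of a datagram, which in turn rides inside a hardware frame.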
23.11 IPv6 Address Binding With Neighbor Discovery
- Instead of ARP, IPv6 uses IPv6 Neighbor Discovery (IPv6-ND).
- IPv6-ND uses ICMPv6 messages.
- IPv6 does not support broadcast as such, but IPv6-ND utilizes a multicast address to which all nodes on the network must listen.
- A host uses IPv6-ND messages to query all hosts on the network for address resolution information.
- Each host maintains a table that is analogous to an ARP cache.
23.12 Protocol Software, Parameters and Configuration
- Basically, human managers put files of information on disk that allow booting routers to learn the IP addresses of their interfaces and to initialize their forwarding tables.
- Typically the TCP/IP protocol software that runs on a host is designed to work when installed on any host on most any network.
- This has advantages, but it means that certain "blanks must be filled in" (parameters must be set) when hosts boot.
- The parameters have to do with attributes of the host and the network, for example host IP address, network mask, and address of a local DNS server.
23.13 Dynamic Host Configuration Protocol (DHCP)
- A booting computer X can use Reverse Address Resolution Protocol (RARP). X broadcasts a RARP request and obtains its IP address from a server.
- Similarly X can broadcast ICMP Address Mask Request and Router Discovery messages.
- Bootstrap Protocol (BOOTP) was invented to give booting computers a mechanism to broadcast one request and receive an IP address, address mask and default router IP address. BOOTP messages travel in UDP datagrams over IP, with all 1's as the destination IP address and all 0's as the source IP address. The responding server unicasts back to the requester, using as destination address the MAC source address from the request.
- BOOTP was designed to provide service to hosts that were permanently installed on the local network. Network administrators configured the BOOTP server with a table that determined which IP address to assign to which host. If a host wasn't on the list, there was no provision for assigning it an IP address.
- Dynamic Host Configuration Protocol (DHCP) took the BOOTP idea a step further. It allows a new unknown computer to join a network and be assigned an IP address from a pool of addresses maintained for that purpose.
- DHCP works like BOOTP. It's basically an extension of BOOTP.
- DHCP can provide a permanent address in the manner of BOOTP, or an on-demand address from a pool.
- On-demand DHCP addresses are actually just leased for a limited time, and hosts have to get an extension if they want to keep them longer.
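The idea of leasing on-demand addresses from a pool can be sketched with a small class. This is a toy model, not the DHCP protocol itself: the class name `LeasePool`, the 10.0.0.x addresses, the 60-second lease and the use of plain numbers for time are all assumptions made for the example.

```python
class LeasePool:
    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)          # addresses not currently leased
        self.lease_seconds = lease_seconds
        self.leases = {}                     # client MAC -> (ip, expiry time)

    def request(self, mac, now):
        """Grant or extend a lease for the client; return its IP, or None if the pool is empty."""
        self._expire(now)
        if mac in self.leases:
            ip, _ = self.leases[mac]         # renewal: keep the same address
        elif self.free:
            ip = self.free.pop(0)            # new client: take an address from the pool
        else:
            return None                      # pool exhausted
        self.leases[mac] = (ip, now + self.lease_seconds)
        return ip

    def _expire(self, now):
        """Return addresses whose leases have run out to the free list."""
        for mac, (ip, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[mac]
                self.free.append(ip)

pool = LeasePool(["10.0.0.2", "10.0.0.3"], lease_seconds=60)
first = pool.request("host-a", now=0)        # host-a leases 10.0.0.2 until t=60
second = pool.request("host-b", now=10)      # host-b leases 10.0.0.3 until t=70
third = pool.request("host-c", now=20)       # no addresses left
renewed = pool.request("host-a", now=30)     # host-a extends its lease to t=90
late = pool.request("host-c", now=75)        # host-b's lease has expired by now
print(first, second, third, renewed, late)
```

Because host-b never asked for an extension, its address goes back to the pool and can be handed to host-c, while host-a keeps its address by renewing in time.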
23.14 DHCP Protocol Operation and Optimizations
- Recovery from loss or duplication - if no response, host retransmits DHCP request. If there is a duplicate response from a server, the host ignores it.
- Caching of a server address - host caches server's address after using a DHCP Discover message to find a DHCP server. This helps make lease renewal efficient.
- Avoidance of synchronized flooding - (avoids, for example, what might occur if all hosts reboot after a power failure) - Hosts must delay a random time before transmitting a DHCP request, or retransmitting.
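The random delay used to avoid synchronized flooding might look like the sketch below. The notes only say that the delay is random; the doubling window and the specific values (4 seconds at first, capped at 64 seconds) are assumptions chosen for illustration, and `retransmit_delays` is a hypothetical name.

```python
import random

def retransmit_delays(attempts, base=4.0, cap=64.0, rng=random):
    """Return a randomized wait (in seconds) to apply before each transmission attempt.

    The window doubles after every attempt, up to `cap`, so hosts that all
    rebooted at the same moment spread their requests out over time.
    """
    delays = []
    for attempt in range(attempts):
        window = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(0, window))
    return delays

# Two hosts with different random seeds end up with different schedules.
host_a = retransmit_delays(5, rng=random.Random(1))
host_b = retransmit_delays(5, rng=random.Random(2))
print(host_a != host_b)
```

Passing a seeded `random.Random` instance makes the schedule repeatable for testing; a real host would rely on an unseeded source so that no two hosts share a schedule.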
23.15 DHCP Message Format
- Various fields exist for client request and server response.
- In addition to information types mentioned previously, a host can use DHCP to request the location of a boot file, which it can then download with, say, TFTP.
23.16 Indirect DHCP Server Access Through a Relay
- It's typical now to have setups where DHCP relay agents forward DHCP requests and replies across subnet routers to a centralized DHCP server.
23.17 IPv6 Autoconfiguration
- A host does not need a protocol such as DHCP in order to obtain a unique IPv6 address.
- A booting host can do a multicast to all nodes on the network to obtain the network prefix. (If this fails, it can use a special value reserved for 'local communication'.)
- For the host part of its IPv6 address, a host can just use its (unique) 48-bit MAC address to generate a 64-bit IPv6 host suffix, according to a standard algorithm. The uniqueness of the MAC address and the nature of the algorithm ensure that the host suffix is unique.
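The notes do not name the standard algorithm; the sketch below assumes it is the modified EUI-64 scheme from the IPv6 addressing architecture, which inserts the octets FF-FE in the middle of the MAC address and inverts the universal/local bit of the first octet. The function names are hypothetical, and fe80::/64 (the link-local prefix) is used here as the value reserved for local communication.

```python
import ipaddress

def eui64_suffix(mac: str) -> bytes:
    """Expand a 48-bit MAC address into a 64-bit host suffix (modified EUI-64)."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    octets[0] ^= 0x02                                   # invert the universal/local bit
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])

def autoconfigured_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a 64-bit network prefix with the suffix derived from the MAC address."""
    network = ipaddress.IPv6Network(prefix)
    suffix = int.from_bytes(eui64_suffix(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | suffix)

mac = "0:3:ba:16:bc:c0"                                 # MAC address from section 23.4
print(eui64_suffix(mac).hex())                          # 0203bafffe16bcc0
print(autoconfigured_address("fe80::/64", mac))         # fe80::203:baff:fe16:bcc0
```

Two hosts with different MAC addresses always produce different suffixes, because the transformation is reversible: removing FF-FE and inverting the bit again recovers the original MAC address.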
23.18 Network Address Translation (NAT)
- NAT is a technology that allows all computers in a network to share a single IPv4 address.
- It was developed to deal with the scarcity of IPv4 addresses.
- On the local network, hosts use separate, unique IP addresses and operate just as any host on the Internet would.
- However, to hosts on the external Internet, all hosts on the local NAT network appear to be just one single host.
- Typically the device providing the NAT service is a wireless access point or home network 'router'.
23.19 NAT Operation and IPv4 Private Addresses
- All packet traffic between the external Internet and the local network passes through the NAT device.
- The NAT device has a "real" globally-valid IP address.
- The hosts on the local network have IP addresses uniquely assigned from a special family of private addresses (aka non-routable addresses).
- When a host on the local network sends an IP datagram to the external Internet, the NAT router modifies the datagram, substituting its own globally-valid IP address for the non-routable source address.
- When a host on the external Internet receives a datagram from a host on the local network, it appears to have come from the NAT device. So naturally, if the external host sends a reply, it sends it to the NAT device.
- When the NAT device receives the reply from the external host, it replaces the destination address in the datagram with the original non-routable address of the local host that initiated the interaction.
- This doesn't work at all unless the NAT keeps track of which hosts on the local network have sent packets to which hosts on the external Internet. Why? Every incoming datagram has the same destination address. However if NAT keeps track of who has been sending to whom, NAT may be able to route datagrams to the local hosts based on the SOURCE addresses in the incoming datagrams.
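The bookkeeping described above can be sketched as a table keyed by external host address. This is a toy model of address-only NAT, not a real implementation: the class name `BasicNat` is hypothetical, datagrams are reduced to (source, destination) pairs, and the public addresses are taken from ranges reserved for documentation.

```python
import ipaddress

class BasicNat:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}          # external host IP -> local host IP that contacted it

    def outbound(self, src, dst):
        """Rewrite the source address of a datagram leaving the local network."""
        if not ipaddress.ip_address(src).is_private:
            raise ValueError("local hosts are expected to use private addresses")
        self.table[dst] = src                # remember who is talking to whom
        return (self.public_ip, dst)

    def inbound(self, src, dst):
        """Rewrite the destination of a reply, using its SOURCE address as the lookup key."""
        local = self.table.get(src)
        if dst != self.public_ip or local is None:
            return None                      # no record of this conversation: drop it
        return (src, local)

nat = BasicNat("203.0.113.5")
sent = nat.outbound("192.168.1.10", "198.51.100.7")
reply = nat.inbound("198.51.100.7", "203.0.113.5")
stranger = nat.inbound("198.51.100.99", "203.0.113.5")
print(sent, reply, stranger)
```

Because the table holds only one local host per external address, a second local host contacting 198.51.100.7 would overwrite the first entry. That limitation is what the next section addresses.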
23.20 Transport-Layer NAT (NAPT)
- The NAT "hacks" referred to in the previous section don't work if two hosts J and K on the local network try to communicate at the same time with the same host X on the external Internet. When the NAT router gets a datagram from X, there's nothing to indicate whether the datagram should be addressed to J or K.
- Most people who think they have NAT actually have NAPT - Network Address and Port Translation.
- When a host J on the local network sends an IP datagram to host X on the external Internet, the NAPT router modifies the datagram, not only substituting its own globally-valid IP address for the non-routable source address of J, but also changing the source PORT number to some unique value that the NAPT router can remember is associated with host J.
- If a different host K on the local network sends datagrams to X, NAPT will substitute a DIFFERENT PORT NUMBER.
- Later when a datagram arrives from X, NAPT can figure out from the combination of source IP address and destination port number whether the datagram should be forwarded to J or K. The port numbers will be different, even though the source IP address will be the same.
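The port-rewriting scheme can be sketched by extending the translation table to include port numbers. As before, this is a toy model: the class name `Napt`, the starting port 50000 and all addresses are assumptions for the example, and real devices also track the transport protocol and expire idle entries.

```python
import itertools

class Napt:
    def __init__(self, public_ip, first_port=50000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)   # source of unused public port numbers
        self.out_map = {}   # (local ip, local port, remote ip, remote port) -> public port
        self.in_map = {}    # (remote ip, remote port, public port) -> (local ip, local port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Replace the local source address AND source port with public ones."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.out_map:
            public_port = next(self._ports)
            self.out_map[key] = public_port
            self.in_map[(dst_ip, dst_port, public_port)] = (src_ip, src_port)
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """Find the local host from the source address plus the destination port."""
        return self.in_map.get((src_ip, src_port, dst_port))

napt = Napt("203.0.113.5")
X = "198.51.100.7"
# J and K both talk to X on port 80, and even picked the same local source port.
from_j = napt.outbound("192.168.1.10", 40000, X, 80)
from_k = napt.outbound("192.168.1.11", 40000, X, 80)
print(from_j, from_k)
print(napt.inbound(X, 80, from_j[1]), napt.inbound(X, 80, from_k[1]))
```

Replies from X carry the same source address and source port in both conversations; only the destination port differs, and that is enough for the table lookup to tell J and K apart.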
23.21 NAT and Servers
- Since NAT and NAPT rely on building a translation table based on outgoing traffic, they don't readily support a local network that has multiple servers. For example, if a client tries to connect on port 80 from the outside, how can NAPT know which local webserver should get the datagrams?
- There's a variant of NAT, called "Twice NAT" that is somewhat helpful.
- If the client in the external network contacts the DNS server at the local site to translate the domain name of the server, the DNS server will interact with the NAT or NAPT system, which will create a table entry that allows the client to reach the desired server.
- However THAT hack doesn't work if the client uses the IP address of the server directly or if it uses a proxy DNS server.
23.22 NAT Software and Systems for Use at Home
- NAT/NAPT is used in residences and small businesses as a way of running a network while sharing a single IP address.
- It may be cheaper to purchase and operate a "NAT router" than to purchase additional IP addresses from an ISP.