Transmission Control Protocol

From Citizendium, the Citizens' Compendium

The fundamental Internet end-to-end protocol for implementing reliable delivery that accounts for packet loss is the Transmission Control Protocol (TCP).[1]

When the Internet was first designed, one of its basic axioms was the end-to-end assumption. Under this assumption, the transfer of information between Internet endpoints is the responsibility of the endpoints. As opposed to other network architectures such as X.25, the Internet proper (i.e., IP) has limited error protection. There are different kinds of errors that can take place in transmission, and TCP will protect against some of them. For some requirements (e.g., Trivial File Transfer Protocol), some of the errors that TCP could correct are instead corrected at the application layer, because, for that specific use, the overhead and local resource demand of TCP would be intolerable.

Not all Internet applications need guaranteed delivery; some can be somewhat "lossy." Video on demand over the Internet, for example, can afford to let packets be lost en route to gain a speed advantage. When stronger error control is needed, such control is the responsibility of protocols running above IP in the protocol stack.

Returning to the example of video on demand, video applications cannot tolerate packets arriving out of order, which is a different type of error than individual bits being incorrect. TCP can promise that bytes of data will be delivered in the order in which they were transmitted, or the connection will be dropped. TCP also guarantees that, as long as the connection stays up, bytes will be free of bit errors.

At a general level, assume that TCP guarantees a stream of error-free bytes. If TCP is unable to correct errors by repeated retransmission, it shuts down the connection. TCP does not guarantee the rate, or the variability of rate, of delivery; see differentiated services.

There is a cost to the error-free guarantee. TCP retransmits PDUs containing bit errors until either they are received correctly, or some programmed limits are exceeded and the connection is shut down. The delays introduced by retransmission can make end-to-end delay variable and unpredictable. For an application such as Voice over Internet Protocol (VoIP), highly variable delay makes the application unusable. Since VoIP can tolerate some loss of data better than it can tolerate variable delay, VoIP protocol stacks use User Datagram Protocol (UDP) rather than TCP. VoIP does have some other mechanisms, at higher protocol levels, to deal with certain errors.

TCP assumes that any loss is due to congestion, so it strives not to transmit more data than the network and destination can accept. This is done with flow control mechanisms, which also control retransmission of data lost in transmission.

Segment fields

TCP's protocol data unit (PDU) is called a segment, which runs from the first byte of the header to the last byte of the data in the payload. A segment may be split up into smaller packets, but the IP fragmentation mechanism will guarantee that if all the packets making up the segment are delivered, the receiving IP code will put them into the correct order before notifying the receiving TCP that the entire segment has been delivered.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  
|          Source Port          |       Destination Port        | 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
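
As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's struct module. This is only a minimal sketch: the names TcpHeader, FLAG_BITS, and parse_tcp_header are invented for this example, options and payload are ignored, and the checksum is not verified. The port, sequence, and window values in the example segment are arbitrary.

```python
import struct
from typing import Dict, NamedTuple

# Bit masks for the six flag bits shown in the diagram (hypothetical names).
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int          # header length in 32-bit words
    flags: Dict[str, bool]
    window: int
    checksum: int
    urgent: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed part of a TCP header (options are not parsed)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # "!" selects network (big-endian) byte order; H = 16 bits, I = 32 bits.
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_and_flags >> 12   # top 4 bits of the 16-bit word
    flags = {name: bool(offset_and_flags & bit)
             for name, bit in FLAG_BITS.items()}
    return TcpHeader(src, dst, seq, ack, data_offset,
                     flags, window, checksum, urgent)


# Example: a SYN segment from an arbitrary source port to port 80.
example = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0,
                      (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(example)
print(header.dst_port, header.data_offset, header.flags["SYN"])
```

A data offset of 5 means five 32-bit words, i.e., a 20-byte header with no options.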

Common IETF protocols, in their specifications, have an assigned port number. For example, the Hypertext Transfer Protocol used for web servers has been assigned port 80, so when going to the address (hypothetically, 198.0.2.1) to which www.citizendium.org maps, the destination port field will contain the value of decimal 80. The source port is typically a temporary (ephemeral) port chosen on the client side for each connection, so that the web browser's different HTTP over TCP sessions can be told apart.

The sequence and acknowledgement number fields are 32 bits each, and the window size field is 16 bits.

Connection establishment

TCP's basic mechanism for establishing a connection (i.e., the OPEN process) is called a three-way handshake. This explanation will start with a minimal view of events; various additional performance-related things occur in practice.

To request a connection, the computer desiring to connect sends a segment with the SYN flag set to 1 (i.e., binary TRUE). If the computer receiving the request agrees that it wants to connect, it sends a segment of its own, with both the SYN and ACK flags set to 1. A basic implementation will reserve resources for that connection at this point, a reasonable thing to do that has been exploited in some attacks on TCP.

If the original computer agrees with the proposed parameters of the connection, some of which might have been proposed by the other end, it sends the third part of the handshake: a segment with the ACK flag set to 1 and the SYN flag cleared. The two computers can then exchange data.
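
The exchange can be sketched as a small simulation. In RFC 793, the first segment carries SYN, the second carries SYN and ACK, and the third carries ACK only; each SYN consumes one sequence number, so the acknowledgement number is the peer's initial sequence number plus one. The Segment class and three_way_handshake function are hypothetical names used only for illustration, and the initial sequence numbers are arbitrary.

```python
from dataclasses import dataclass
from typing import List

SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit values


@dataclass
class Segment:
    syn: bool = False
    ack: bool = False
    seq: int = 0
    ack_num: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments of a minimal connection establishment."""
    # 1. The client requests a connection.
    syn = Segment(syn=True, seq=client_isn)
    # 2. The server agrees and acknowledges the client's SYN.
    syn_ack = Segment(syn=True, ack=True, seq=server_isn,
                      ack_num=(syn.seq + 1) % SEQ_MODULUS)
    # 3. The client acknowledges the server's SYN.
    final_ack = Segment(ack=True, seq=(client_isn + 1) % SEQ_MODULUS,
                        ack_num=(syn_ack.seq + 1) % SEQ_MODULUS)
    return [syn, syn_ack, final_ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```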

There are a number of variants on the connection establishment mechanism. Some may propose nonstandard initial values for sequence number and window size, both of which are used for error and flow control. There is also a variant called a passive OPEN, in which a computer preannounces its agreement to accept incoming connection requests.
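
In the widely used Berkeley sockets interface, a passive OPEN corresponds to listen() and an active OPEN to connect(). The sketch below, which assumes a working loopback interface, opens one connection on 127.0.0.1 and reports the port numbers involved; loopback_open_demo is a made-up name for this example. Binding to port 0 asks the operating system to pick any free port.

```python
import socket
from typing import Tuple


def loopback_open_demo() -> Tuple[int, int]:
    """Return (server_port, client_source_port) for one loopback connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Passive OPEN: announce willingness to accept connections.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server_port = server.getsockname()[1]

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # Active OPEN: connect() performs the three-way handshake.
            client.connect(("127.0.0.1", server_port))
            conn, peer = server.accept()
            with conn:
                client_port = client.getsockname()[1]
                # The server sees the client's source port as the peer port.
                assert peer[1] == client_port
    return server_port, client_port


if __name__ == "__main__":
    print(loopback_open_demo())
```

The client never names its own port; the source port in its segments is whatever the local TCP implementation assigned.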

Sequence numbering

Many of TCP's functions depend on the sequence numbering mechanism; many of the limitations of the original protocols come from the sequence numbering mechanism. Some of the enhancements center around an effective extension of this field, and there are some attacks on TCP where the miscreant successfully predicts the next sequence number.

Both ends define the initial sequence number for segments they will send, so there are independent sequence number spaces in each direction of transmission. The sequence numbers reflect the number of bytes transmitted, not the number of segments sent. When the maximum sequence number possible in the sequence number field is reached, different things may happen depending on implementation detail. The numbers may "wrap": if the maximum value were 99, the current send sequence number were 99, and five bytes were sent, the new send sequence number would be 99 plus 5, modulo 100, so the new number would be 4. (The actual field is 32 bits, so real sequence numbers wrap modulo 2^32.) Alternatively, the computer might stop sending until other mechanisms "catch up".
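
The wrap-around arithmetic can be written out directly. This is a small sketch, with next_seq and seq_precedes as illustrative names; the second function shows one common way to compare sequence numbers in a wrapping space and is an addition for illustration, not something described above.

```python
SEQ_MODULUS = 2 ** 32  # size of the real 32-bit sequence number space


def next_seq(seq: int, nbytes: int, modulus: int = SEQ_MODULUS) -> int:
    """Advance a sequence number by nbytes, wrapping at the modulus."""
    return (seq + nbytes) % modulus


def seq_precedes(a: int, b: int, modulus: int = SEQ_MODULUS) -> bool:
    """True if a comes before b, treating the number space as circular."""
    return a != b and (b - a) % modulus < modulus // 2


# The toy example from the text: maximum value 99, so the modulus is 100.
print(next_seq(99, 5, modulus=100))
# The same wrap with the real 32-bit field.
print(next_seq(2 ** 32 - 1, 5))
```

With a circular comparison, a sequence number just before the wrap point is still treated as earlier than a small number just after it.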

Windowing

There are both implicit and explicit flow control mechanisms. In the conventional slow start mode, TCP starts with a window of one segment. As long as the transmission stays up, and delay does not exceed certain parameters, TCP keeps doubling the window size until it either reaches the 64K limit imposed by the 16-bit window size field in TCP (without high performance enhancements), or a transmission is not acknowledged.

If there is no acknowledgement, TCP assumes that the loss is due to congestion, although TCP really does not know if the problem is congestion or a transmission error. In either case, TCP sets the window back to one segment and starts increasing the window again until it hits a limit. Individual TCP implementations may, for local reasons, limit the maximum window size, but this is not part of the standard.
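
The behaviour described above can be traced with a deliberately simplified model: the window is counted in segments, doubles once per round trip, is capped at a maximum, and falls back to one segment after a loss. Real implementations add further mechanisms that this sketch omits. The function name slow_start_trace and the cap of 44 segments (roughly a 64K window divided by an assumed 1460-byte segment) are assumptions made for the example.

```python
from typing import FrozenSet, List


def slow_start_trace(rounds: int, max_window: int,
                     loss_rounds: FrozenSet[int] = frozenset()) -> List[int]:
    """Return the window (in segments) used in each round trip."""
    window = 1
    trace = []
    for rnd in range(rounds):
        trace.append(window)
        if rnd in loss_rounds:
            # Unacknowledged transmission: assume congestion and start over.
            window = 1
        else:
            # Everything was acknowledged: double, up to the limit.
            window = min(window * 2, max_window)
    return trace


print(slow_start_trace(8, max_window=44))
print(slow_start_trace(8, max_window=44, loss_rounds=frozenset({4})))
```

The first trace climbs to the cap and stays there; the second shows the reset to one segment after a loss in the fifth round trip.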

TCP acceleration

There are methods of WAN acceleration or TCP acceleration that may, in properly selected circumstances, improve performance, by initially using a large window size. For some environments, this may be effective, but it may break other ones; see TCP acceleration.

A good application for a large starting window is on a router-to-router link, when it is known that the first protocol that comes up will be Border Gateway Protocol, and the other router is transferring a full routing table. Until the routing table is transferred and the internal forwarding plane table converges, no other traffic will be competing with BGP. After routing converges, then normal dynamic window adjustment makes sense for regular flow; the implementation might restrict the large initial window to BGP at startup.[2]

TCP over paths with specific performance characteristics

TCP was intended to be independent of the underlying transmission system, as is the Internet Protocol (IP).

Demonstrated independence of transmission medium

Indeed, IP's independence of the underlying medium has been demonstrated in some extreme cases.[3][4][5]

When consistency is more important than maximum throughput

In certain applications, such as voice or video over the Internet, a consistent delay value is more important to the user experience than occasionally bursting for maximum throughput.[6] This can be done with TCP Friendly Rate Control (TFRC); the TFRC document simply specifies a congestion control mechanism. TFRC is not a new protocol, but a congestion control technique that would be appropriate for systems using mechanisms such as the Real Time Transport Protocol,[7] for applications that manage congestion at the application level, or for the endpoint-wide congestion control features of endpoints with a common congestion control policy.

High performance extensions

Nevertheless, experience demonstrated that the original TCP design limited transfer rates over high-speed, long-delay paths, such as relays through geosynchronous communications satellites. Networks containing such paths have been called "Long Fat Networks" (LFNs, pronounced "elephants").[8]

The concern is with the product of bandwidth and latency (i.e., delay in RFC1323). If its value becomes too large, the original sizes of the TCP window size and sequence numbers are too small; transmission has to stop until enough data is acknowledged to reuse those fields. Extensions in RFC1323 allow the window size to scale, and add timestamps to disambiguate sequence numbers.
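
A short calculation shows why the 16-bit window becomes the bottleneck. The figures below (a 10 Mbit/s path with a 0.5 second round-trip time) are assumed purely for illustration, and the function names are invented for this sketch. The only protocol facts relied on are the 65,535-byte maximum of the unscaled window and the RFC1323 window scale option, which shifts the window value left by up to 14 bits.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Best-case rate when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


def window_scale_shift_needed(bdp_bytes: float, base_window: int = 65535) -> int:
    """Smallest shift count whose scaled window covers the product."""
    shift = 0
    while (base_window << shift) < bdp_bytes and shift < 14:
        shift += 1
    return shift


bdp = bandwidth_delay_product_bytes(10e6, 0.5)
print(bdp)                              # bytes needed in flight
print(max_throughput_bps(65535, 0.5))   # ceiling with an unscaled window
print(window_scale_shift_needed(bdp))   # scale factor exponent required
```

Under these assumptions the unscaled window limits the connection to about a tenth of the path's capacity, however fast the link is.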

Another problem is that traditional TCP, when retransmitting, will send all packets that had not been acknowledged when the error was detected. With high data rates, large packets, and low error rates, this can reduce throughput while the packets are retransmitted, then analyzed and acknowledged by the receiver. By using a feature called selective acknowledgement, only the packets actually in error need be retransmitted.[9]
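
The difference can be illustrated by counting retransmissions in a toy model, where segments are identified by small integers rather than byte ranges. Without selective acknowledgement, everything from the first lost segment onward remains unacknowledged and is sent again; with it, only the missing segments are. The function names here are hypothetical and the model ignores timers and windows.

```python
from typing import Iterable, List, Set


def retransmit_without_sack(sent: Iterable[int], lost: Set[int]) -> List[int]:
    """Resend every segment from the first loss onward."""
    if not lost:
        return []
    first_lost = min(lost)
    return [seg for seg in sent if seg >= first_lost]


def retransmit_with_sack(sent: Iterable[int], lost: Set[int]) -> List[int]:
    """Resend only the segments the receiver reported as missing."""
    return [seg for seg in sent if seg in lost]


sent_segments = list(range(1, 11))   # ten segments in flight
lost_segments = {3, 7}               # two of them did not arrive intact
print(retransmit_without_sack(sent_segments, lost_segments))
print(retransmit_with_sack(sent_segments, lost_segments))
```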

Header compression

When there is very little change between TCP segments carrying a particular application protocol, such as telnet sending one character at a time and thus always incrementing counters by one, the header can be compressed. Header compression requires more processing, but can save significant bandwidth on slow links. Telnet gives the most dramatic results, but header compression can also be helpful for FTP and other protocols with an inherent order in their payloads. [10]
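
The underlying idea, sending only what changed since the previous header, can be shown with a toy delta encoder. This is not the encoding defined in the cited RFC; it is a simplified sketch with invented function names and made-up field values, and it assumes both headers contain the same fields.

```python
from typing import Dict


def compress_header(prev: Dict[str, int], cur: Dict[str, int]) -> Dict[str, int]:
    """Keep only the fields that changed, stored as differences."""
    return {name: cur[name] - prev[name]
            for name in cur if cur[name] != prev[name]}


def decompress_header(prev: Dict[str, int], delta: Dict[str, int]) -> Dict[str, int]:
    """Rebuild the full header from the previous one plus the differences."""
    return {name: prev[name] + delta.get(name, 0) for name in prev}


previous = {"src_port": 49152, "dst_port": 23, "seq": 1000,
            "ack": 500, "window": 4096}
# A telnet-like segment carrying one more character: only seq moves.
current = dict(previous, seq=1001)

delta = compress_header(previous, current)
print(delta)
print(decompress_header(previous, delta) == current)
```

Both ends must remember the previous header for each connection, which is part of the extra processing mentioned above.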

References

  1. Postel, J. (September 1981), Transmission Control Protocol, Internet Engineering Task Force, RFC0793
  2. Allman, M.; S. Floyd & C. Partridge (October 2002), Increasing TCP's Initial Window, Internet Engineering Task Force, RFC3390
  3. Waitzman, D. (April 1 1990), Standard for the transmission of IP datagrams on avian carriers, Internet Engineering Task Force, RFC1149
  4. Waitzman, D. (April 1 1999), IP over Avian Carriers with Quality of Service, Internet Engineering Task Force, RFC2549
  5. Bergen Linux Users Group (April 28 2001, 12:00), The highly unofficial CPIP WG
  6. Handley, M. et al. (January 2003), TCP Friendly Rate Control (TFRC): Protocol Specification, Internet Engineering Task Force, RFC3448
  7. Schulzrinne, H.; S. Casner & R. Frederick et al. (July 2003), RTP: A Transport Protocol for Real-Time Applications, Internet Engineering Task Force, RFC3550
  8. Jacobson, V.; R. Braden & D. Borman (May 1992), TCP Extensions for High Performance, Internet Engineering Task Force, RFC1323
  9. Mathis, M.; J. Mahdavi & S. Floyd et al. (October 1996), TCP Selective Acknowledgment Options, Internet Engineering Task Force, RFC2018
  10. Jacobson, V. (February 1990), Compressing TCP/IP Headers for Low-Speed Serial Links, Internet Engineering Task Force, RFC1144