GENWiki

Premier IT Outsourcing and Support Services within the UK

User Tools

Site Tools


rfc:rfc2923

Network Working Group K. Lahey Request for Comments: 2923 dotRocket, Inc. Category: Informational September 2000

                TCP Problems with Path MTU Discovery

Status of this Memo

 This memo provides information for the Internet community.  It does
 not specify an Internet standard of any kind.  Distribution of this
 memo is unlimited.

Copyright Notice

 Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

 This memo catalogs several known Transmission Control Protocol (TCP)
 implementation problems dealing with Path Maximum Transmission Unit
 Discovery (PMTUD), including the long-standing black hole problem,
 stretch acknowlegements (ACKs) due to confusion between Maximum
 Segment Size (MSS) and segment size, and MSS advertisement based on
 PMTU.

1. Introduction

 This memo catalogs several known TCP implementation problems dealing
 with Path MTU Discovery [RFC1191], including the long-standing black
 hole problem, stretch ACKs due to confusion between MSS and segment
 size, and MSS advertisement based on PMTU.  The goal in doing so is
 to improve conditions in the existing Internet by enhancing the
 quality of current TCP/IP implementations.
 While Path MTU Discovery (PMTUD) can be used with any upper-layer
 protocol, it is most commonly used by TCP;  this document does not
 attempt to treat problems encountered by other upper-layer protocols.
 Path MTU Discovery for IPv6 [RFC1981] treats only IPv6-dependent
 issues, but not the TCP issues brought up in this document.
 Each problem is defined as follows:
 Name of Problem
    The name associated with the problem.  In this memo, the name is
    given as a subsection heading.

Lahey Informational [Page 1] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 Classification
    One or more problem categories for which the problem is
    classified:  "congestion control", "performance", "reliability",
    "non-interoperation -- connectivity failure".
 Description
    A definition of the problem, succinct but including necessary
    background material.
 Significance
    A brief summary of the sorts of environments for which the problem
    is significant.
 Implications
    Why the problem is viewed as a problem.
 Relevant RFCs
    The RFCs defining the TCP specification with which the problem
    conflicts.  These RFCs often qualify behavior using terms such as
    MUST, SHOULD, MAY, and others written capitalized.  See RFC 2119
    for the exact interpretation of these terms.
 Trace file demonstrating the problem
    One or more ASCII trace files demonstrating the problem, if
    applicable.
 Trace file demonstrating correct behavior
    One or more examples of how correct behavior appears in a trace,
    if applicable.
 References
    References that further discuss the problem.
 How to detect
    How to test an implementation to see if it exhibits the problem.
    This discussion may include difficulties and subtleties associated
    with causing the problem to manifest itself, and with interpreting
    traces to detect the presence of the problem (if applicable).
 How to fix
    For known causes of the problem, how to correct the
    implementation.

Lahey Informational [Page 2] RFC 2923 TCP Problems with Path MTU Discovery September 2000

2. Known implementation problems

2.1.

 Name of Problem
    Black Hole Detection
 Classification
    Non-interoperation -- connectivity failure
 Description
    A host performs Path MTU Discovery by sending out as large a
    packet as possible, with the Don't Fragment (DF) bit set in the IP
    header.  If the packet is too large for a router to forward on to
    a particular link, the router must send an ICMP Destination
    Unreachable -- Fragmentation Needed message to the source address.
    The host then adjusts the packet size based on the ICMP message.
    As was pointed out in [RFC1435], routers don't always do this
    correctly -- many routers fail to send the ICMP messages, for a
    variety of reasons ranging from kernel bugs to configuration
    problems.  Firewalls are often misconfigured to suppress all ICMP
    messages.  IPsec [RFC2401] and IP-in-IP [RFC2003] tunnels
    shouldn't cause these sorts of problems, if the implementations
    follow the advice in the appropriate documents.
    PMTUD, as documented in [RFC1191], fails when the appropriate ICMP
    messages are not received by the originating host.  The upper-
    layer protocol continues to try to send large packets and, without
    the ICMP messages, never discovers that it needs to reduce the
    size of those packets.  Its packets are disappearing into a PMTUD
    black hole.
 Significance
    When PMTUD fails due to the lack of ICMP messages, TCP will also
    completely fail under some conditions.
 Implications
    This failure is especially difficult to debug, as pings and some
    interactive TCP connections to the destination host work.  Bulk
    transfers fail with the first large packet and the connection
    eventually times out.
    These situations can almost always be blamed on a misconfiguration
    within the network, which should be corrected.  However it seems
    inappropriate for some TCP implementations to suffer

Lahey Informational [Page 3] RFC 2923 TCP Problems with Path MTU Discovery September 2000

    interoperability failures over paths which do not affect other TCP
    implementations (i.e. those without PMTUD).  This creates a market
    disincentive for deploying TCP implementation with PMTUD enabled.
 Relevant RFCs
    RFC 1191 describes Path MTU Discovery.  RFC 1435 provides an early
    description of these sorts of problems.
 Trace file demonstrating the problem
    Made using tcpdump [Jacobson89] recording at an intermediate host.
    20:12:11.951321 A > B: S 1748427200:1748427200(0)
         win 49152 <mss 1460>
    20:12:11.951829 B > A: S 1001927984:1001927984(0)
         ack 1748427201 win 16384 <mss 65240>
    20:12:11.955230 A > B: . ack 1 win 49152 (DF)
    20:12:11.959099 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:12:13.139074 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:12:16.188685 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:12:22.290483 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:12:34.491856 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:12:58.896405 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:13:47.703184 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:14:52.780640 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:15:57.856037 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:17:02.932431 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:18:08.009337 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:19:13.090521 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:20:18.168066 A > B: . 1:1461(1460) ack 1 win 49152 (DF)
    20:21:23.242761 A > B: R 1461:1461(0) ack 1 win 49152 (DF)
    The short SYN packet has no trouble traversing the network, due to
    its small size.  Similarly, ICMP echo packets used to diagnose
    connectivity problems will succeed.
    Large data packets fail to traverse the network.  Eventually the
    connection times out.  This can be especially confusing when the
    application starts out with a very small write, which succeeds,
    following up with many large writes, which then fail.
 Trace file demonstrating correct behavior
    Made using tcpdump recording at an intermediate host.
    16:48:42.659115 A > B: S 271394446:271394446(0)
         win 8192 <mss 1460> (DF)
    16:48:42.672279 B > A: S 2837734676:2837734676(0)
         ack 271394447 win 16384 <mss 65240>

Lahey Informational [Page 4] RFC 2923 TCP Problems with Path MTU Discovery September 2000

    16:48:42.676890 A > B: . ack 1 win 8760 (DF)
    16:48:42.870574 A > B: . 1:1461(1460) ack 1 win 8760 (DF)
    16:48:42.871799 A > B: . 1461:2921(1460) ack 1 win 8760 (DF)
    16:48:45.786814 A > B: . 1:1461(1460) ack 1 win 8760 (DF)
    16:48:51.794676 A > B: . 1:1461(1460) ack 1 win 8760 (DF)
    16:49:03.808912 A > B: . 1:537(536) ack 1 win 8760
    16:49:04.016476 B > A: . ack 537 win 16384
    16:49:04.021245 A > B: . 537:1073(536) ack 1 win 8760
    16:49:04.021697 A > B: . 1073:1609(536) ack 1 win 8760
    16:49:04.120694 B > A: . ack 1609 win 16384
    16:49:04.126142 A > B: . 1609:2145(536) ack 1 win 8760
    In this case, the sender sees four packets fail to traverse the
    network (using a two-packet initial send window) and turns off
    PMTUD.  All subsequent packets have the DF flag turned off, and
    the size set to the default value of 536 [RFC1122].
 References
    This problem has been discussed extensively on the tcp-impl
    mailing list;  the name "black hole" has been in use for many
    years.
 How to detect
    This shows up as a TCP connection which hangs (fails to make
    progress) until closed by timeout (this often manifests itself as
    a connection that connects and starts to transfer, then eventually
    terminates after 15 minutes with zero bytes transfered).  This is
    particularly annoying with an application like ftp, which will
    work perfectly while it uses small packets for control
    information, and then fail on bulk transfers.
    A series of ICMP echo packets will show that the two end hosts are
    still capable of passing packets,  a series of MTU-sized ICMP echo
    packets will show some fragmentation, and a series of MTU-sized
    ICMP echo packets with DF set will fail.  This can be confusing
    for network engineers trying to diagnose the problem.
    There are several traceroute implementations that do PMTUD, and
    can demonstrate the problem.
 How to fix
    TCP should notice that the connection is timing out.  After
    several timeouts, TCP should attempt to send smaller packets,
    perhaps turning off the DF flag for each packet.  If this
    succeeds, it should continue to turn off PMTUD for the connection
    for some reasonable period of time, after which it should probe
    again to try to determine if the path has changed.

Lahey Informational [Page 5] RFC 2923 TCP Problems with Path MTU Discovery September 2000

    Note that, under IPv6, there is no DF bit -- it is implicitly on
    at all times.  Fragmentation is not allowed in routers, only at
    the originating host.  Fortunately, the minimum supported MTU for
    IPv6 is 1280 octets, which is significantly larger than the 68
    octet minimum in IPv4.  This should make it more reasonable for
    IPv6 TCP implementations to fall back to 1280 octet packets, when
    IPv4 implementations will probably have to turn off DF to respond
    to black hole detection.
    Ideally, the ICMP black holes should be fixed when they are found.
    If hosts start to implement black hole detection, it may be that
    these problems will go unnoticed and unfixed.  This is especially
    unfortunate, since detection can take several seconds each time,
    and these delays could result in a significant, hidden degradation
    of performance.  Hosts that implement black hole detection should
    probably log detected black holes, so that they can be fixed.

2.2.

 Name of Problem
    Stretch ACK due to PMTUD
 Classification
    Congestion Control / Performance
 Description
    When a naively implemented TCP stack communicates with a PMTUD
    equipped stack, it will try to generate an ACK for every second
    full-sized segment.  If it determines the full-sized segment based
    on the advertised MSS, this can degrade badly in the face of
    PMTUD.
    The PMTU can wind up being a small fraction of the advertised MSS;
    in this case, an ACK would be generated only very infrequently.
 Significance
    Stretch ACKs have a variety of unfortunate effects, more fully
    outlined in [RFC2525].  Most of these have to do with encouraging
    a more bursty connection, due to the infrequent arrival of ACKs.
    They can also impede congestion window growth.
 Implications
    The complete implications of stretch ACKs are outlined in
    [RFC2525].

Lahey Informational [Page 6] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 Relevant RFCs
    RFC 1122 outlines the requirements for frequency of ACK
    generation.  [RFC2581] expands on this and clarifies that delayed
    ACK is a SHOULD, not a MUST.
 Trace file demonstrating it
    Made using tcpdump recording at an intermediate host.  The
    timestamp options from all but the first two packets have been
    removed for clarity.
 18:16:52.976657 A > B: S 3183102292:3183102292(0) win 16384
      <mss 4312,nop,wscale 0,nop,nop,timestamp 12128 0> (DF)
 18:16:52.979580 B > A: S 2022212745:2022212745(0) ack 3183102293 win
      49152 <mss 4312,nop,wscale 1,nop,nop,timestamp 1592957 12128> (DF)
 18:16:52.979738 A > B: . ack 1 win 17248  (DF)
 18:16:52.982473 A > B: . 1:4301(4300) ack 1 win 17248  (DF)
 18:16:52.982557 C > A: icmp: B unreachable -
      need to frag (mtu 1500)! (DF)
 18:16:52.985839 B > A: . ack 1 win 32768  (DF)
 18:16:54.129928 A > B: . 1:1449(1448) ack 1 win 17248  (DF)
      .
      .
      .
 18:16:58.507078 A > B: . 1463941:1465389(1448) ack 1 win 17248  (DF)
 18:16:58.507200 A > B: . 1465389:1466837(1448) ack 1 win 17248  (DF)
 18:16:58.507326 A > B: . 1466837:1468285(1448) ack 1 win 17248  (DF)
 18:16:58.507439 A > B: . 1468285:1469733(1448) ack 1 win 17248  (DF)
 18:16:58.524763 B > A: . ack 1452357 win 32768  (DF)
 18:16:58.524986 B > A: . ack 1461045 win 32768  (DF)
 18:16:58.525138 A > B: . 1469733:1471181(1448) ack 1 win 17248  (DF)
 18:16:58.525268 A > B: . 1471181:1472629(1448) ack 1 win 17248  (DF)
 18:16:58.525393 A > B: . 1472629:1474077(1448) ack 1 win 17248  (DF)
 18:16:58.525516 A > B: . 1474077:1475525(1448) ack 1 win 17248  (DF)
 18:16:58.525642 A > B: . 1475525:1476973(1448) ack 1 win 17248  (DF)
 18:16:58.525766 A > B: . 1476973:1478421(1448) ack 1 win 17248  (DF)
 18:16:58.526063 A > B: . 1478421:1479869(1448) ack 1 win 17248  (DF)
 18:16:58.526187 A > B: . 1479869:1481317(1448) ack 1 win 17248  (DF)
 18:16:58.526310 A > B: . 1481317:1482765(1448) ack 1 win 17248  (DF)
 18:16:58.526432 A > B: . 1482765:1484213(1448) ack 1 win 17248  (DF)
 18:16:58.526561 A > B: . 1484213:1485661(1448) ack 1 win 17248  (DF)
 18:16:58.526671 A > B: . 1485661:1487109(1448) ack 1 win 17248  (DF)
 18:16:58.537944 B > A: . ack 1478421 win 32768  (DF)
 18:16:58.538328 A > B: . 1487109:1488557(1448) ack 1 win 17248  (DF)

Lahey Informational [Page 7] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 Note that the interval between ACKs is significantly larger than two
 times the segment size;  it works out to be almost exactly two times
 the advertised MSS.  This transfer was long enough that it could be
 verified that the stretch ACK was not the result of lost ACK packets.
 Trace file demonstrating correct behavior
 Made using tcpdump recording at an intermediate host.  The timestamp
 options from all but the first two packets have been removed for
 clarity.
 18:13:32.287965 A > B: S 2972697496:2972697496(0)
      win 16384 <mss 4312,nop,wscale 0,nop,nop,timestamp 11326 0> (DF)
 18:13:32.290785 B > A: S 245639054:245639054(0)
      ack 2972697497 win 34496 <mss 4312> (DF)
 18:13:32.290941 A > B: . ack 1 win 17248 (DF)
 18:13:32.293774 A > B: . 1:4313(4312) ack 1 win 17248 (DF)
 18:13:32.293856 C > A: icmp: B unreachable -
      need to frag (mtu 1500)! (DF)
 18:13:33.637338 A > B: . 1:1461(1460) ack 1 win 17248 (DF)
      .
      .
      .
 18:13:35.561691 A > B: . 1514021:1515481(1460) ack 1 win 17248 (DF)
 18:13:35.561814 A > B: . 1515481:1516941(1460) ack 1 win 17248 (DF)
 18:13:35.561938 A > B: . 1516941:1518401(1460) ack 1 win 17248 (DF)
 18:13:35.562059 A > B: . 1518401:1519861(1460) ack 1 win 17248 (DF)
 18:13:35.562174 A > B: . 1519861:1521321(1460) ack 1 win 17248 (DF)
 18:13:35.564008 B > A: . ack 1481901 win 64680 (DF)
 18:13:35.564383 A > B: . 1521321:1522781(1460) ack 1 win 17248 (DF)
 18:13:35.564499 A > B: . 1522781:1524241(1460) ack 1 win 17248 (DF)
 18:13:35.615576 B > A: . ack 1484821 win 64680 (DF)
 18:13:35.615646 B > A: . ack 1487741 win 64680 (DF)
 18:13:35.615716 B > A: . ack 1490661 win 64680 (DF)
 18:13:35.615784 B > A: . ack 1493581 win 64680 (DF)
 18:13:35.615856 B > A: . ack 1496501 win 64680 (DF)
 18:13:35.615952 A > B: . 1524241:1525701(1460) ack 1 win 17248 (DF)
 18:13:35.615966 B > A: . ack 1499421 win 64680 (DF)
 18:13:35.616088 A > B: . 1525701:1527161(1460) ack 1 win 17248 (DF)
 18:13:35.616105 B > A: . ack 1502341 win 64680 (DF)
 18:13:35.616211 A > B: . 1527161:1528621(1460) ack 1 win 17248 (DF)
 18:13:35.616228 B > A: . ack 1505261 win 64680 (DF)
 18:13:35.616327 A > B: . 1528621:1530081(1460) ack 1 win 17248 (DF)
 18:13:35.616349 B > A: . ack 1508181 win 64680 (DF)
 18:13:35.616448 A > B: . 1530081:1531541(1460) ack 1 win 17248 (DF)
 18:13:35.616565 A > B: . 1531541:1533001(1460) ack 1 win 17248 (DF)
 18:13:35.616891 A > B: . 1533001:1534461(1460) ack 1 win 17248 (DF)

Lahey Informational [Page 8] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 In this trace, an ACK is generated for every two segments that
 arrive.  (The segment size is slightly larger in this trace, even
 though the source hosts are the same, because of the lack of
 timestamp options in this trace.)
 How to detect
 This condition can be observed in a packet trace when the advertised
 MSS is significantly larger than the actual PMTU of a connection.
 How to fix Several solutions for this problem have been proposed:
 A simple solution is to ACK every other packet, regardless of size.
 This has the drawback of generating large numbers of ACKs in the face
 of lots of very small packets;  this shows up with applications like
 the X Window System.
 A slightly more complex solution would monitor the size of incoming
 segments and try to determine what segment size the sender is using.
 This requires slightly more state in the receiver, but has the
 advantage of making receiver silly window syndrome avoidance
 computations more accurate [RFC813].

2.3.

 Name of Problem
 Determining MSS from PMTU
 Classification
 Performance
 Description
 The MSS advertised at the start of a connection should be based on
 the MTU of the interfaces on the system.  (For efficiency and other
 reasons this may not be the largest MSS possible.)  Some systems use
 PMTUD determined values to determine the MSS to advertise.
 This results in an advertised MSS that is smaller than the largest
 MTU the system can receive.
 Significance
 The advertised MSS is an indication to the remote system about the
 largest TCP segment that can be received [RFC879].  If this value is
 too small, the remote system will be forced to use a smaller segment
 size when sending, purely because the local system found a particular
 PMTU earlier.

Lahey Informational [Page 9] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 Given the asymmetric nature of many routes on the Internet
 [Paxson97], it seems entirely possible that the return PMTU is
 different from the sending PMTU.  Limiting the segment size in this
 way can reduce performance and frustrate the PMTUD algorithm.
 Even if the route was symmetric, setting this artificially lowered
 limit on segment size will make it impossible to probe later to
 determine if the PMTU has changed.
 Implications
 The whole point of PMTUD is to send as large a segment as possible.
 If long-running connections cannot successfully probe for larger
 PMTU, then potential performance gains will be impossible to realize.
 This destroys the whole point of PMTUD.
 Relevant RFCs RFC 1191.  [RFC879] provides a complete discussion of
 MSS calculations and appropriate values.  Note that this practice
 does not violate any of the specifications in these RFCs.
 Trace file demonstrating it
 This trace was made using tcpdump running on an intermediate host.
 Host A initiates two separate consecutive connections, A1 and A2, to
 host B.  Router C is the location of the MTU bottleneck.  As usual,
 TCP options are removed from all non-SYN packets.
 22:33:32.305912 A1 > B: S 1523306220:1523306220(0)
      win 8760 <mss 1460> (DF)
 22:33:32.306518 B > A1: S 729966260:729966260(0)
      ack 1523306221 win 16384 <mss 65240>
 22:33:32.310307 A1 > B: . ack 1 win 8760 (DF)
 22:33:32.323496 A1 > B: P 1:1461(1460) ack 1 win 8760 (DF)
 22:33:32.323569 C > A1: icmp: 129.99.238.5 unreachable -
      need to frag (mtu 1024) (DF) (ttl 255, id 20666)
 22:33:32.783694 A1 > B: . 1:985(984) ack 1 win 8856 (DF)
 22:33:32.840817 B > A1: . ack 985 win 16384
 22:33:32.845651 A1 > B: . 1461:2445(984) ack 1 win 8856 (DF)
 22:33:32.846094 B > A1: . ack 985 win 16384
 22:33:33.724392 A1 > B: . 985:1969(984) ack 1 win 8856 (DF)
 22:33:33.724893 B > A1: . ack 2445 win 14924
 22:33:33.728591 A1 > B: . 2445:2921(476) ack 1 win 8856 (DF)
 22:33:33.729161 A1 > B: . ack 1 win 8856 (DF)
 22:33:33.840758 B > A1: . ack 2921 win 16384
 [...]
 22:33:34.238659 A1 > B: F 7301:8193(892) ack 1 win 8856 (DF)
 22:33:34.239036 B > A1: . ack 8194 win 15492
 22:33:34.239303 B > A1: F 1:1(0) ack 8194 win 16384

Lahey Informational [Page 10] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 22:33:34.242971 A1 > B: . ack 2 win 8856 (DF)
 22:33:34.454218 A2 > B: S 1523591299:1523591299(0)
      win 8856 <mss 984> (DF)
 22:33:34.454617 B > A2: S 732408874:732408874(0)
      ack 1523591300 win 16384 <mss 65240>
 22:33:34.457516 A2 > B: . ack 1 win 8856 (DF)
 22:33:34.470683 A2 > B: P 1:985(984) ack 1 win 8856 (DF)
 22:33:34.471144 B > A2: . ack 985 win 16384
 22:33:34.476554 A2 > B: . 985:1969(984) ack 1 win 8856 (DF)
 22:33:34.477580 A2 > B: P 1969:2953(984) ack 1 win 8856 (DF)
 [...]
 Notice that the SYN packet for session A2 specifies an MSS of 984.
 Trace file demonstrating correct behavior
 As before, this trace was made using tcpdump running on an
 intermediate host.  Host A initiates two separate consecutive
 connections, A1 and A2, to host B.  Router C is the location of the
 MTU bottleneck.  As usual, TCP options are removed from all non-SYN
 packets.
 22:36:58.828602 A1 > B: S 3402991286:3402991286(0) win 32768
      <mss 4312,wscale 0,nop,timestamp 1123370309 0,
       echo 1123370309> (DF)
 22:36:58.844040 B > A1: S 946999880:946999880(0)
      ack 3402991287 win 16384
      <mss 65240,nop,wscale 0,nop,nop,timestamp 429552 1123370309>
 22:36:58.848058 A1 > B: . ack 1 win 32768  (DF)
 22:36:58.851514 A1 > B: P 1:1025(1024) ack 1 win 32768  (DF)
 22:36:58.851584 C > A1: icmp: 129.99.238.5 unreachable -
      need to frag (mtu 1024) (DF)
 22:36:58.855885 A1 > B: . 1:969(968) ack 1 win 32768  (DF)
 22:36:58.856378 A1 > B: . 969:985(16) ack 1 win 32768  (DF)
 22:36:59.036309 B > A1: . ack 985 win 16384
 22:36:59.039255 A1 > B: FP 985:1025(40) ack 1 win 32768  (DF)
 22:36:59.039623 B > A1: . ack 1026 win 16344
 22:36:59.039828 B > A1: F 1:1(0) ack 1026 win 16384
 22:36:59.043037 A1 > B: . ack 2 win 32768  (DF)
 22:37:01.436032 A2 > B: S 3404812097:3404812097(0) win 32768
      <mss 4312,wscale 0,nop,timestamp 1123372916 0,
       echo 1123372916> (DF)
 22:37:01.436424 B > A2: S 949814769:949814769(0)
      ack 3404812098 win 16384
      <mss 65240,nop,wscale 0,nop,nop,timestamp 429562 1123372916>
 22:37:01.440147 A2 > B: . ack 1 win 32768  (DF)
 22:37:01.442736 A2 > B: . 1:969(968) ack 1 win 32768  (DF)

Lahey Informational [Page 11] RFC 2923 TCP Problems with Path MTU Discovery September 2000

 22:37:01.442894 A2 > B: P 969:985(16) ack 1 win 32768  (DF)
 22:37:01.443283 B > A2: . ack 985 win 16384
 22:37:01.446068 A2 > B: P 985:1025(40) ack 1 win 32768  (DF)
 22:37:01.446519 B > A2: . ack 1025 win 16384
 22:37:01.448465 A2 > B: F 1025:1025(0) ack 1 win 32768  (DF)
 22:37:01.448837 B > A2: . ack 1026 win 16384
 22:37:01.449007 B > A2: F 1:1(0) ack 1026 win 16384
 22:37:01.452201 A2 > B: . ack 2 win 32768  (DF)
 Note that the same MSS was used for both session A1 and session A2.
 How to detect
 This can be detected using a packet trace of two separate
 connections;  the first should invoke PMTUD; the second should start
 soon enough after the first that the PMTU value does not time out.
 How to fix
 The MSS should be determined based on the MTUs of the interfaces on
 the system, as outlined in [RFC1122] and [RFC1191].

3. Security Considerations

 The one security concern raised by this memo is that ICMP black holes
 are often caused by over-zealous security administrators who block
 all ICMP messages.  It is vitally important that those who design and
 deploy security systems understand the impact of strict filtering on
 upper-layer protocols.  The safest web site in the world is worthless
 if most TCP implementations cannot transfer data from it.  It would
 be far nicer to have all of the black holes fixed rather than fixing
 all of the TCP implementations.

4. Acknowledgements

 Thanks to Mark Allman, Vern Paxson, and Jamshid Mahdavi for generous
 help reviewing the document, and to Matt Mathis for early suggestions
 of various mechanisms that can cause PMTUD black holes, as well as
 review.  The structure for describing TCP problems, and the early
 description of that structure is from [RFC2525].  Special thanks to
 Amy Bock, who helped perform the PMTUD tests which discovered these
 bugs.

Lahey Informational [Page 12] RFC 2923 TCP Problems with Path MTU Discovery September 2000

5. References

 [RFC2581]    Allman, M., Paxson, V. and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.
 [RFC1122]    Braden, R., "Requirements for Internet Hosts --
              Communication Layers", STD 3, RFC 1122, October 1989.
 [RFC813]     Clark, D., "Window and Acknowledgement Strategy in TCP",
              RFC 813, July 1982.
 [Jacobson89] V. Jacobson, C. Leres, and S. McCanne, tcpdump, June
              1989, ftp.ee.lbl.gov
 [RFC1435]    Knowles, S., "IESG Advice from Experience with Path MTU
              Discovery", RFC 1435, March 1993.
 [RFC1191]    Mogul, J. and S. Deering, "Path MTU discovery", RFC
              1191, November 1990.
 [RFC1981]    McCann, J., Deering, S. and J. Mogul, "Path MTU
              Discovery for IP version 6", RFC 1981, August 1996.
 [Paxson96]   V. Paxson, "End-to-End Routing Behavior in the
              Internet", IEEE/ACM Transactions on Networking (5),
              pp.~601-615, Oct. 1997.
 [RFC2525]    Paxon, V., Allman, M., Dawson, S., Fenner, W., Griner,
              J., Heavens, I., Lahey, K., Semke, I. and B. Volz,
              "Known TCP Implementation Problems", RFC 2525, March
              1999.
 [RFC879]     Postel, J., "The TCP Maximum Segment Size and Related
              Topics", RFC 879, November 1983.
 [RFC2001]    Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
              Retransmit, and Fast Recovery Algorithms", RFC 2001,
              January 1997.

Lahey Informational [Page 13] RFC 2923 TCP Problems with Path MTU Discovery September 2000

6. Author's Address

 Kevin Lahey
 dotRocket, Inc.
 1901 S. Bascom Ave., Suite 300
 Campbell, CA 95008
 USA
 Phone:  +1 408-371-8977 x115
 email:  kml@dotrocket.com

Lahey Informational [Page 14] RFC 2923 TCP Problems with Path MTU Discovery September 2000

7. Full Copyright Statement

 Copyright (C) The Internet Society (2000).  All Rights Reserved.
 This document and translations of it may be copied and furnished to
 others, and derivative works that comment on or otherwise explain it
 or assist in its implementation may be prepared, copied, published
 and distributed, in whole or in part, without restriction of any
 kind, provided that the above copyright notice and this paragraph are
 included on all such copies and derivative works.  However, this
 document itself may not be modified in any way, such as by removing
 the copyright notice or references to the Internet Society or other
 Internet organizations, except as needed for the purpose of
 developing Internet standards in which case the procedures for
 copyrights defined in the Internet Standards process must be
 followed, or as required to translate it into languages other than
 English.
 The limited permissions granted above are perpetual and will not be
 revoked by the Internet Society or its successors or assigns.
 This document and the information contained herein is provided on an
 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

 Funding for the RFC Editor function is currently provided by the
 Internet Society.

Lahey Informational [Page 15]

/data/webs/external/dokuwiki/data/pages/rfc/rfc2923.txt · Last modified: 2000/09/14 15:46 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki