GENWiki

Premier IT Outsourcing and Support Services within the UK

User Tools

Site Tools


rfc:rfc8775



Internet Engineering Task Force (IETF) Y. Cai Request for Comments: 8775 H. Ou Category: Standards Track Alibaba Group ISSN: 2070-1721 S. Vallepalli

                                                                      
                                                             M. Mishra
                                                             S. Venaas
                                                   Cisco Systems, Inc.
                                                              A. Green
                                                       British Telecom
                                                            April 2020
                PIM Designated Router Load Balancing

Abstract

 On a multi-access network, one of the PIM-SM (PIM Sparse Mode)
 routers is elected as a Designated Router.  One of the
 responsibilities of the Designated Router is to track local multicast
 listeners and forward data to these listeners if the group is
 operating in PIM-SM.  This document specifies a modification to the
 PIM-SM protocol that allows more than one of the PIM-SM routers to
 take on this responsibility so that the forwarding load can be
 distributed among multiple routers.

Status of This Memo

 This is an Internet Standards Track document.
 This document is a product of the Internet Engineering Task Force
 (IETF).  It represents the consensus of the IETF community.  It has
 received public review and has been approved for publication by the
 Internet Engineering Steering Group (IESG).  Further information on
 Internet Standards is available in Section 2 of RFC 7841.
 Information about the current status of this document, any errata,
 and how to provide feedback on it may be obtained at
 https://www.rfc-editor.org/info/rfc8775.

Copyright Notice

 Copyright (c) 2020 IETF Trust and the persons identified as the
 document authors.  All rights reserved.
 This document is subject to BCP 78 and the IETF Trust's Legal
 Provisions Relating to IETF Documents
 (https://trustee.ietf.org/license-info) in effect on the date of
 publication of this document.  Please review these documents
 carefully, as they describe your rights and restrictions with respect
 to this document.  Code Components extracted from this document must
 include Simplified BSD License text as described in Section 4.e of
 the Trust Legal Provisions and are provided without warranty as
 described in the Simplified BSD License.

Table of Contents

 1.  Introduction
 2.  Terminology
 3.  Applicability
 4.  Functional Overview
   4.1.  GDR Candidates
 5.  Protocol Specification
   5.1.  Hash Mask and Hash Algorithm
   5.2.  Modulo Hash Algorithm
     5.2.1.  Modulo Hash Algorithm Examples
     5.2.2.  Limitations
   5.3.  PIM Hello Options
     5.3.1.  PIM DR Load-Balancing Capability (DRLB-Cap) Hello
             Option
     5.3.2.  PIM DR Load-Balancing List (DRLB-List) Hello Option
   5.4.  PIM DR Operation
   5.5.  PIM GDR Candidate Operation
   5.6.  DRLB-List Hello Option Processing
   5.7.  PIM Assert Modification
   5.8.  Backward Compatibility
 6.  Operational Considerations
 7.  IANA Considerations
   7.1.  Initial Registry
   7.2.  Assignment of New Hash Algorithms
 8.  Security Considerations
 9.  References
   9.1.  Normative References
   9.2.  Informative References
 Acknowledgements
 Authors' Addresses

1. Introduction

 On a multi-access LAN (such as an Ethernet) with one or more PIM-SM
 (PIM Sparse Mode) [RFC7761] routers, one of the PIM-SM routers is
 elected as a Designated Router (DR).  The PIM DR has two
 responsibilities in the PIM-SM protocol.  For any active sources on a
 LAN, the PIM DR is responsible for registering with the Rendezvous
 Point (RP) if the group is operating in PIM-SM.  Also, the PIM DR is
 responsible for tracking local multicast listeners and forwarding
 data to these listeners if the group is operating in PIM-SM.
 Consider the following LAN in Figure 1:
                           (core networks)
                            |     |     |
                            |     |     |
                           R1    R2     R3
                            |     |     |
                            ----(LAN)----
                                  |
                                  |
                          (many receivers)
                      Figure 1: LAN with Receivers
 Assume R1 is elected as the DR.  According to the PIM-SM protocol, R1
 will be responsible for forwarding traffic to that LAN on behalf of
 all local members.  In addition to keeping track of membership
 reports, R1 is also responsible for initiating the creation of source
 and/or shared trees towards the senders or the RPs.  The membership
 reports would be IGMP or Multicast Listener Discovery (MLD) messages.
 This applies to any versions of the IGMP and MLD protocols.  The most
 recent versions are IGMPv3 [RFC3376] and MLDv2 [RFC3810].
 Having a single router acting as DR and being responsible for data-
 plane forwarding leads to several issues.  One of the issues is that
 the aggregated bandwidth will be limited to what R1 can handle with
 regards to capacity of incoming links, the interface on the LAN, and
 total forwarding capacity.  It is very common that a LAN consists of
 switches that run IGMP/MLD or PIM snooping [RFC4541].  This allows
 the forwarding of multicast packets to be restricted only to segments
 leading to receivers that have indicated their interest in multicast
 groups using either IGMP or MLD.  The emergence of the switched
 Ethernet allows the aggregated bandwidth to exceed, sometimes by a
 large number, that of a single link.  For example, let us modify
 Figure 1 and introduce an Ethernet switch in Figure 2.
                          (core networks)
                           |     |     |
                           |     |     |
                          R1    R2     R3
                           |     |     |
                        +=gi1===gi2===gi3=+
                        +                 +
                        +      switch     +
                        +                 +
                        +=gi4===gi5===gi6=+
                           |     |     |
                          H1    H2     H3
                   Figure 2: LAN with Ethernet Switch
 Let us assume that each individual link is a Gigabit Ethernet.  Each
 router (R1, R2, and R3) and the switch have enough forwarding
 capacity to handle hundreds of gigabits of data.
 Let us further assume that each of the hosts requests 500 Mbps of
 unique multicast data.  This totals to 1.5 Gbps of data, which is
 less than what each switch or the combined uplink bandwidth across
 the routers can handle, even under failure of a single router.
 On the other hand, the link between R1 and switch, via port gi1, can
 only handle a throughput of 1 Gbps.  And if R1 is the only DR (the
 PIM DR elected using the procedure defined by [RFC7761]), at least
 500 Mbps worth of data will be lost because the only link that can be
 used to draw the traffic from the routers to the switch is via gi1.
 In other words, the entire network's throughput is limited by the
 single connection between the PIM DR and the switch (or LAN, as in
 Figure 1).
 Another important issue is related to failover.  If R1 is the only
 forwarder on a shared LAN, when R1 goes out of service, multicast
 forwarding for the entire LAN has to be rebuilt by the newly elected
 PIM DR.  However, if there were a way that allowed multiple routers
 to forward to the LAN for different groups, failure of one of the
 routers would only lead to disruption to a subset of the flows,
 therefore improving the overall resilience of the network.
 This document specifies a modification to the PIM-SM protocol that
 allows more than one of these routers, called Group Designated
 Routers (GDRs), to be selected so that the forwarding load can be
 distributed among a number of routers.

2. Terminology

 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
 "OPTIONAL" in this document are to be interpreted as described in
 BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
 capitals, as shown here.
 With respect to PIM-SM, this document follows the terminology that
 has been defined in [RFC7761].
 This document also introduces the following new acronyms:
 GDR: Group Designated Router.  For each multicast flow, either a
    (*,G) for Any-Source Multicast (ASM) or an (S,G) for Source-
    Specific Multicast (SSM) [RFC4607], a hash algorithm (described
    below) is used to select one of the routers as a GDR.  The GDR is
    responsible for initiating the forwarding tree building process
    for the corresponding multicast flow.
 GDR Candidate:  a router that has the potential to become a GDR.
    There might be multiple GDR Candidates on a LAN, but only one can
    become the GDR for a specific multicast flow.

3. Applicability

 The extension specified in this document applies to PIM-SM routers
 acting as last-hop routers (there are directly connected receivers).
 It does not alter the behavior of a PIM DR or any other routers on
 the first-hop network (directly connected sources).  This is because
 the source tree is built using the IP address of the sender, not the
 IP address of the PIM DR that sends PIM registers towards the RP.
 The load balancing between first-hop routers can be achieved
 naturally if an IGP provides equal cost multiple paths (which it
 usually does in practice).  Also, distributing the load to do source
 registration does not justify the additional complexity required to
 support it.

4. Functional Overview

 In the PIM DR election as defined in [RFC7761], when multiple routers
 are connected to a multi-access LAN (for example, an Ethernet), one
 of them is elected to act as PIM DR.  The PIM DR is responsible for
 sending local Join/Prune messages towards the RP or source.  In order
 to elect the PIM DR, each PIM router on the LAN examines the received
 PIM Hello messages and compares its own DR priority and IP address
 with those of its neighbors.  The router with the highest DR priority
 is the PIM DR.  If there are multiple such routers, their IP
 addresses are used as the tiebreaker, as described in [RFC7761].
 In order to share forwarding load among last-hop routers, besides the
 normal PIM DR election, one or more GDRs are elected on the multi-
 access LAN.  There is only one PIM DR on the multi-access LAN, but
 there might be multiple GDR Candidates.
 For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a
 hash algorithm (Section 5.1) is used to select one of the routers to
 be the GDR.  The new DR Load-Balancing Capability (DRLB-Cap) PIM
 Hello Option is used to announce the Capability, as well as the hash
 algorithm type.  Routers with the new DRLB-Cap Option advertised in
 their PIM Hello, using the same GDR election hash algorithm and the
 same DR priority as the PIM DR, are considered as GDR Candidates.
 Hash masks are defined for Source, Group, and RP, separately, in
 order to handle PIM ASM/SSM.  The masks, as well as a sorted list of
 GDR Candidate addresses, are announced by the DR in a new DR Load-
 Balancing List (DRLB-List) PIM Hello Option.
 A hash algorithm based on the announced Source, Group, or RP masks
 allows one GDR to be assigned to a corresponding multicast state.
 That GDR is responsible for initiating the creation of the multicast
 forwarding tree for multicast traffic.

4.1. GDR Candidates

 GDR is the new concept introduced by this specification.  GDR
 Candidates are routers eligible for GDR election on the LAN.  To
 become a GDR Candidate, a router must have the same DR priority and
 run the same GDR election hash algorithm as the DR on the LAN.
 For example, assume there are 4 routers on the LAN: R1, R2, R3, and
 R4, each announcing a DRLB-Cap Option.  R1, R2, and R3 have the same
 DR priority, while R4's DR priority is less preferred.  In this
 example, R4 will not be eligible for GDR election, because R4 will
 not become a PIM DR unless all of R1, R2, and R3 go out of service.
 Furthermore, assume router R1 wins the PIM DR election, R1 and R2
 advertise the same hash algorithm for GDR election, while R3
 advertises a different one.  In this case, only R1 and R2 will be
 eligible for GDR election, while R3 will not.
 As a DR, R1 will include its own Load-Balancing Hash Masks and the
 identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello
 Option.

5. Protocol Specification

5.1. Hash Mask and Hash Algorithm

 A hash mask is used to extract a number of bits from the
 corresponding IP address field (32 for IPv4, 128 for IPv6) and
 calculate a hash value.  A hash value is used to select a GDR from
 GDR Candidates advertised by the PIM DR.  Hash masks allow for
 certain flows to always be forwarded by the same GDR, by ignoring
 certain bits in the hash value calculation, so that the hash values
 are the same.  For example, 0.0.255.0 defines a hash mask for an IPv4
 address that masks the first, second, and fourth octets, which means
 that only the third octet will influence the hash value computed.
 Note that the masks need not be a contiguous set of bits.  For
 example, for IPv4, 15.15.15.15 would be a valid mask.
 In the text below, a hash mask is, in some places, said to be zero.
 A hash mask is zero if no bits are set, that is, 0.0.0.0 for IPv4 and
 :: for IPv6.  Also, a hash mask is said to be an all-bits-set mask if
 it is 255.255.255.255 for IPv4 or
 ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff for IPv6.
 There are three hash masks defined:
  • RP Hash Mask
  • Source Hash Mask
  • Group Hash Mask
 The hash masks need to be configured on the PIM routers that can
 potentially become a PIM DR, unless the implementation provides
 default hash mask values.  An implementation SHOULD have default hash
 mask values as follows.  The default RP Hash Mask SHOULD be zero (no
 bits set).  The default Source and Group Hash Masks SHOULD both be
 all-bits-set masks.  These default values are likely acceptable for
 most deployments and simplify configuration.  There is only a need to
 use other masks if one needs to ensure that certain flows are
 forwarded by the same GDR.
 The DRLB-List Hello Option contains a list of GDR Candidates.  The
 first one listed has ordinal number 0, the second listed ordinal
 number 1, and the last one has ordinal number N - 1 if there are N
 candidates listed.  The hash value computed will be the ordinal
 number of the GDR Candidate that is acting as GDR for the flow in
 question.
 The input to be hashed is determined as follows:
  • If the group is in ASM mode and the RP Hash Mask announced by the

PIM DR is not zero (at least one bit is set), calculate the value

    of hashvalue_RP (Section 5.2) to determine the GDR.
  • If the group is in ASM mode and the RP Hash Mask announced by the

PIM DR is zero (no bits are set), obtain the value of

    hashvalue_Group (Section 5.2) to determine the GDR.
  • If the group is in SSM mode, use hashvalue_SG (Section 5.2) to

determine the GDR.

 A simple modulo hash algorithm is defined in this document.  However,
 to allow another hash algorithm to be used, a 1-octet "Hash
 Algorithm" field is included in the DRLB-Cap Hello Option to specify
 the hash algorithm used by the router.
 If different hash algorithms are advertised among the routers on a
 LAN, only the routers advertising the same hash algorithm as the DR
 (as well as having the same DR priority as the DR) are eligible for
 GDR election.

5.2. Modulo Hash Algorithm

 As part of computing the hash, the notation LSZC(hash_mask) is used
 to denote the number of zeroes counted from the least significant bit
 of a hash mask hash_mask.  As an example, LSZC(255.255.128) is 7 and
 LSZC(ffff:8000::) is 111.  If all bits are set, LSZC will be 0.  If
 the mask is zero, then LSZC will be 32 for IPv4 and 128 for IPv6.
 The number of GDR Candidates is denoted as GDRC.
 The idea behind the modulo hash algorithm is, in simple terms, that
 the corresponding mask is applied to a value, then the result is
 shifted right LSZC(mask) bits so that the least significant bits that
 were masked out are not considered.  Then, this result is masked by
 0xffffffff, keeping only the last 32 bits of the result (this only
 makes a difference for IPv6).  Finally, the hash value is this result
 modulo the number of GDR Candidates (GDRC).
 The modulo hash algorithm, for computing the values hashvalue_RP,
 hashvalue_Group, and hashvalue_SG, is defined as follows.
 hashvalue_RP is calculated as:
    (((RP_address & RP_mask) >> LSZC(RP_mask)) & 0xffffffff) % GDRC
    RP_address is the address of the RP defined for the group, and
    RP_mask is the RP Hash Mask.
 hashvalue_Group is calculated as:
    (((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xffffffff)
    % GDRC
    Group_address is the group address, and Group_mask is the Group
    Hash Mask.
 hashvalue_SG is calculated as:
    ((((Source_address & Source_mask) >> LSZC(Source_mask)) &
    0xffffffff) ^ (((Group_address & Group_mask) >> LSZC(Group_mask))
    & 0xffffffff)) % GDRC
    Group_address is the group address, and Group_mask is the Group
    Hash Mask.

5.2.1. Modulo Hash Algorithm Examples

 To help illustrate the algorithm, consider this example.  Router X
 with IPv4 address 203.0.113.1 receives a DRLB-List Hello Option from
 the DR that announces RP Hash Mask 0.0.255.0 and a list of GDR
 Candidates, sorted by IP addresses from high to low: 203.0.113.3,
 203.0.113.2, and 203.0.113.1.  The ordinal number assigned to those
 addresses would be:
 0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router X).
 Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 198.51.100.2
 for Group2.  Following the modulo hash algorithm:
  • LSZC(0.0.255.0) is 8, and GDRC is 3. The hashvalue_RP for Group1

with RP RP1 is:

    (((192.0.2.1 & 0.0.255.0) >> 8) & 0xffffffff % 3)
    = 2 % 3
    = 2
    This matches the ordinal number assigned to Router X.  Router X
    will be the GDR for Group1.
  • The hashvalue_RP for Group2 with RP RP2 is:
    (((198.51.100.2 & 0.0.255.0) >> 8) & 0xffffffff % 3)
    = 100 % 3
    = 1
    This is different from the ordinal number of Router X (2).  Hence,
    Router X will not be GDR for Group2.
 For IPv6, consider this example, similar to the above.  Router X with
 IPv6 address fe80::1 receives a DRLB-List Hello Option from the DR
 that announces RP Hash Mask ::ffff:ffff:ffff:0 and a list of GDR
 Candidates, sorted by IP addresses from high to low: fe80::3,
 fe80::2, and fe80::1.  The ordinal number assigned to those addresses
 would be:
    0 for fe80::3; 1 for fe80::2; 2 for fe80::1 (Router X).
 Assume there are 2 RPs: RP1 2001:db8::1:0:5678:1 for Group1 and RP2
 2001:db8::1:0:1234:2 for Group2.  Following the modulo hash
 algorithm:
  • LSZC(::ffff:ffff:ffff:0) is 16, and GDRC is 3. The hashvalue_RP

for Group1 with RP RP1 is:

    (((2001:db8::1:0:5678:1 & ::ffff:ffff:ffff:0) >> 16) &
     0xffffffff % 3)
    = ((::1:0:5678:0 >> 16) & 0xffffffff % 3)
    = (::1:0:5678 & 0xffffffff % 3)
    = ::5678 % 3
    = 2
    This matches the ordinal number assigned to Router X.  Router X
    will be the GDR for Group1.
  • The hashvalue_RP for Group2 with RP RP2 is:
    (((2001:db8::1:0:1234:1 & ::ffff:ffff:ffff:0) >> 16) &
     0xffffffff % 3)
    = ((::1:0:1234:0 >> 16) & 0xffffffff % 3)
    = (::1:0:1234 & 0xffffffff % 3)
    = ::1234 % 3
    = 1
    This is different from the ordinal number of Router X (2).  Hence,
    Router X will not be GDR for Group2.

5.2.2. Limitations

 The modulo hash algorithm has poor failover characteristics when a
 shared LAN has more than two GDRs.  In the case of more than two GDRs
 on a LAN, when one GDR fails, all of the groups may be reassigned to
 a different GDR, even if they were not assigned to the failed GDR.
 However, many deployments use only two routers on a shared LAN for
 redundancy purposes.  Future work may define new hash algorithms
 where only groups assigned to the failed GDR get reassigned.
 The modulo hash algorithm will use, at most, 32 consecutive bits of
 the input addresses for its computation.  Exactly which bits are used
 of the source, group, or RP addresses depend on the respective masks.
 This limitation may be an issue for IPv6 deployments, since not all
 bits of the IPv6 addresses are considered.  If this causes
 operational issues, a new hash algorithm would need to be defined.

5.3. PIM Hello Options

 PIM routers include a new option, called "Load-Balancing Capability
 (DRLB-Cap)", in their PIM Hello messages.
 Besides this DRLB-Cap Hello Option, the elected PIM DR also includes
 a new "DR Load-Balancing List (DRLB-List) Hello Option".  The DRLB-
 List Hello Option consists of three hash masks, as defined above, and
 also a list of GDR Candidate addresses on the LAN.  It is recommended
 that the GDR Candidate addresses are sorted in descending order.
 This ensures that when using algorithms, such as the modulo hash
 algorithm in this document, that it is predictable which GDR is
 responsible for which groups, regardless of the order the DR learned
 about the candidates.

5.3.1. PIM DR Load-Balancing Capability (DRLB-Cap) Hello Option

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 34           |         Length = 4            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Reserved                  |Hash Algorithm |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        Figure 3: PIM DR Load-Balancing Capability Hello Option
 Type:  34
 Length:  4
 Reserved:  Transmitted as zero, ignored on receipt.
 Hash Algorithm:  Hash algorithm type.  A value listed in the IANA
    "PIM Designated Router Load-Balancing Hash Algorithms" registry. 0
    is used for the hash algorithm defined in this document.
 This DRLB-Cap Hello Option MUST be advertised by routers on all
 interfaces where DR Load Balancing is enabled.  Note that the option
 is included, at most, once.

5.3.2. PIM DR Load-Balancing List (DRLB-List) Hello Option

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 35           |         Length                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Group Mask                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Source Mask                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            RP Mask                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    GDR Candidate Address(es)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
           Figure 4: PIM DR Load-Balancing List Hello Option
 Type:  35
 Length:  (3 + n) x (4 or 16) bytes, where n is the number of GDR
    Candidates.
 Group Mask (32/128 bits):  Mask applied to group addresses as part of
    hash computation.
 Source Mask (32/128 bits):  Mask applied to source addresses as part
    of hash computation.
 RP Mask (32/128 bits):  Mask applied to RP addresses as part of hash
    computation.
 All masks MUST have the same number of bits as the IP source address
 in the PIM Hello IP header.
 GDR Candidate Address(es) (32/128 bits):  List of GDR Candidate(s)
    All addresses MUST be in the same address family as the PIM Hello
    IP header.  It is recommended that the addresses are sorted in
    descending order.
    If the "Interface ID" option, as specified in [RFC6395], is
    present in a GDR Candidate's PIM Hello message and the "Router
    Identifier" portion is non-zero:
  • For IPv4, the "GDR Candidate Address" will be set directly to

the "Router Identifier".

  • For IPv6, the "GDR Candidate Address" will be 96 bits of

zeroes, followed by the 32 bit Router Identifier.

    If the "Interface ID" option is not present in a GDR Candidate's
    PIM Hello message or if the "Interface ID" option is present but
    the "Router Identifier" field is zero, the "GDR Candidate Address"
    will be the IPv4 or IPv6 source address of the PIM Hello message.
    This DRLB-List Hello Option MUST only be advertised by the elected
    PIM DR.  It MUST be ignored if received from a non-DR.  The option
    MUST also be ignored if the hash masks are not the correct number
    of bits or GDR Candidate addresses are in the wrong address
    family.

5.4. PIM DR Operation

 The DR election process is still the same as defined in [RFC7761].
 The DR advertises the new DRLB-List Hello Option, which contains mask
 values from user configuration (or default values), followed by a
 list of GDR Candidate addresses.  Note that if a router included the
 "Interface ID" option in the hello message and the Router ID is non-
 zero, the Router ID will be used to form the GDR Candidate address of
 the router, as discussed in the previous section.  It is recommended
 that the list be sorted from the highest value to the lowest value.
 The reason for sorting the list is to make the behavior
 deterministic, regardless of the order in which the DR learns of new
 candidates.  Note that, as for non-DR routers, the DR also advertises
 the DRLB-Cap Hello Option to indicate its ability to support the new
 functionality and the type of GDR election hash algorithm it uses.
 If a PIM DR receives a neighbor DRLB-Cap Hello Option that contains
 the same hash algorithm as the DR and the neighbor has the same DR
 priority as the DR, PIM DR SHOULD consider the neighbor as a GDR
 Candidate and insert the GDR Candidate's Address into the list of the
 DRLB-List Option.  However, the DR may have policies limiting which
 or the number of GDR Candidates to include.  Likewise, the DR SHOULD
 include itself in the list of GDR Candidates, but it is permissible
 not to do so, for instance, if there is some policy restricting the
 candidate set.
 If a PIM neighbor included in the list expires, stops announcing the
 DRLB-Cap Hello Option, changes DR priority, changes hash algorithm,
 or otherwise becomes ineligible as a candidate, the DR SHOULD
 immediately send a triggered hello with a new list in the DRLB-List
 option, excluding the neighbor.
 If a new router becomes eligible as a candidate, there is no urgency
 in sending out an updated list.  An updated list SHOULD be included
 in the next hello.

5.5. PIM GDR Candidate Operation

 When an IGMP/MLD report is received, a hash algorithm is used by the
 GDR Candidates to determine which router is going to be responsible
 for building forwarding trees on behalf of the host.
 The router MUST include the DRLB-Cap Hello Option in all PIM Hello
 messages sent on the interface.  Note that the presence of the DRLB-
 Cap Option in the PIM Hello does not guarantee that the router will
 be considered as a GDR Candidate.  Once the DR election is done, the
 DRLB-List Hello Option is received from the current PIM DR containing
 a list of the selected GDR Candidates.
 A router only acts as a GDR Candidate if it is included in the GDR
 Candidate list of the DRLB-List Hello Option.  See next section for
 details.

5.6. DRLB-List Hello Option Processing

 This section discusses processing of the DRLB-List Hello Option,
 including the case where it was received in the previous hello but
 not in the current hello.  All routers MUST ignore the DRLB-List
 Hello Option if it is received from a PIM router that is not the DR.
 The option MUST only be processed by routers that are announcing the
 DRLB-Cap Option and only if the hash algorithm announced by the DR is
 the same as the local announcement.  All GDR Candidates MUST use the
 hash masks advertised in the Option, even if they differ from those
 the candidate was configured with.  The DR MUST also process its own
 DRLB-List Hello Option.
 A router stores the latest option contents that were announced, if
 any, and deletes the previous contents.  The router MUST also compare
 the new contents with any previous contents and, if there are any
 changes, continue processing as below.  Note that if the option does
 not pass the above checks, the below processing MUST be done as if
 the option was not announced.
 If the contents of the DRLB-List Option, the masks, or the candidate
 list differ from the previously saved copy, it is received for the
 first time, or it is no longer being received or accepted, the option
 MUST be processed as below.
 1.  If the local router is included in the "GDR Candidate
     Address(es)" field, it will look for its own address, or if it
     announces a non-zero Router ID, its own Router ID.  For each of
     the groups or source and group pairs, if the group is in SSM mode
     with local receiver interest, the router MUST run the hash
     algorithm to determine which of them is for the GDR.
  • If there is no change in the GDR status, then no further

action is required.

  • If the router becomes the new GDR, then a multicast forwarding

tree MUST be built [RFC7761].

  • If the router is no longer the GDR, then it uses an Assert as

explained in Section 5.7.

 2.  If one of the following occurs:
  • the local router is not included in the "GDR Candidate

Address(es)" field,

  • the DRLB-List Hello Option is no longer included in the DR's

Hello, or

  • the DR's Neighbor Liveness Timer expires [RFC7761],
     then for each group (or each source and group pair if the group
     is in SSM mode) with local receiver interest, for which the
     router is the GDR, the router uses an Assert as explained in
     Section 5.7.

5.7. PIM Assert Modification

 GDR changes may occur due to configuration change, GDR Candidates
 going down, and also new routers coming up and becoming GDR
 Candidates.  This may occur while flows are being forwarded.  If the
 GDR for an active flow changes, there is likely to be some
 disruption, such as packet loss or duplicates.  By using asserts,
 packet loss is minimized while allowing a small amount of duplicates.
 When a router stops acting as the GDR for a group, or source and
 group pair if SSM, it MUST set the Assert metric preference to
 maximum (0x7fffffff) and the Assert metric to one less than maximum
 (0xfffffffe).  That is, whenever it sends or receives an Assert for
 the group, it must use these values as the metric preference and
 metric rather than the values provided by the unicast routing
 protocol.
 The rest of this section is just for illustration purposes and not
 part of the protocol definition.
 To illustrate the behavior when there is a GDR change, consider the
 following scenario where there are two flows: G1 and G2.  R1 is the
 GDR for G1, and R2 is the GDR for G2.  When R3 comes up, it is
 possible that R3 becomes GDR for both G1 and G2; hence, R3 starts to
 build the forwarding tree for G1 and G2.  If R1 and R2 stop
 forwarding before R3 completes the process, packet loss might occur.
 On the other hand, if R1 and R2 continue forwarding while R3 is
 building the forwarding trees, duplicates might occur.
 When the role of GDR changes as above, instead of immediately
 stopping forwarding, R1 and R2 continue forwarding to G1 and G2
 respectively, while, at the same time, R3 build forwarding trees for
 G1 and G2.  This will lead to PIM Asserts.
 For G1, using the functionality described in this document, R1 and R3
 determine the new GDR, which is R3.  With the modified Assert
 behavior, R1 sets its Assert metric to the near maximum value, as
 discussed above.  That will make R3, which has normal metric in its
 Assert, the Assert winner.

5.8. Backward Compatibility

 In the case of a hybrid Ethernet shared LAN (where some PIM routers
 support the functionality defined in this document and some do not):
  • If the DR does not support the new functionality, then there will

be no load balancing.

  • If non-DR routers do not support the new functionality, they will

not be considered as GDR Candidate and will not take part in load

    balancing.  Load balancing may still happen on the link.

6. Operational Considerations

 An administrator needs to consider what the total bandwidth
 requirements are and find a set of routers that together have enough
 available capacity while making sure that each of the routers can
 handle its part, assuming that the traffic is distributed roughly
 equally among the routers.  Ideally, one should also have enough
 bandwidth to handle the case where at least one router fails.  All
 routers should have reachability to the sources and RPs, if
 applicable, that are not via the LAN.
 Care must be taken when choosing what hash masks to configure.  One
 would typically configure the same masks on all the routers so that
 they are the same, regardless of which router is elected as DR.  The
 default masks are likely suitable for most deployment.  The RP Hash
 Mask must be configured (the default is no bits set) if one wishes to
 hash based on the RP address rather than the group address for ASM.
 The default masks will use the entire group addresses, and source
 addresses if SSM, as part of the hash.  An administrator may set
 other masks that mask out part of the addresses to ensure that
 certain flows always get hashed to the same router.  How this is
 achieved depends on how the group addresses are allocated.
 Only the routers announcing the same hash algorithm as the DR would
 be considered as GDR Candidates.  Network administrators need to make
 sure that the desired set of routers announce the same algorithm.
 Migration between different algorithms is not considered in this
 document.

7. IANA Considerations

 IANA has made these assignments in the "PIM-Hello Options" registry:
 value 34 for the PIM DR Load-Balancing Capability (DRLB-Cap) Hello
 Option (with Length of 4), and value 35 for the PIM DR Load-Balancing
 List (DRLB-List) Hello Option (with variable Length).
 Per this document, IANA has created a registry called "PIM Designated
 Router Load-Balancing Hash Algorithms" in the "Protocol Independent
 Multicast (PIM)" branch of the registry tree.  The registry lists
 hash algorithms for use by PIM Designated Router Load Balancing.

7.1. Initial Registry

 The initial content of the registry is as follows.
                  +-------+------------+-----------+
                  | Type  | Name       | Reference |
                  +=======+============+===========+
                  | 0     | Modulo     | RFC 8775  |
                  +-------+------------+-----------+
                  | 1-255 | Unassigned |           |
                  +-------+------------+-----------+
                               Table 1

7.2. Assignment of New Hash Algorithms

 Assignment of new hash algorithms is done according to the "IETF
 Review" procedure; see [RFC8126].

8. Security Considerations

 Security of the new DR Load-Balancing PIM Hello Options is only
 guaranteed by the security of PIM Hello messages, so the security
 considerations for PIM Hello messages, as described in PIM-SM
 [RFC7761], apply here.
 If the DR is subverted, it could omit or add certain GDRs or announce
 an unsupported algorithm.  If another router is subverted, it could
 be made DR and cause similar issues.  While these issues are specific
 to this specification, they are not that different from existing
 attacks, such as subverting a DR and lowering the DR priority,
 causing a different router to become the DR.
 If, for any reason, the DR includes a GDR in the announced list that
 announces a different algorithm from what the DR announces, the GDR
 is required to ignore the announcement, and there will be no router
 acting as the DR for the flows that hash to that GDR.
 If a GDR is subverted, it could potentially be made to stop
 forwarding all the traffic it is expected to forward.  This is also
 similar today to if a DR is subverted.
 An administrator may be able to achieve the desired load balancing of
 known flows, but an attacker may send a single high rate flow that is
 served by a single GDR or send multiple flows that are expected to be
 hashed to the same GDR.

9. References

9.1. Normative References

 [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
            Requirement Levels", BCP 14, RFC 2119,
            DOI 10.17487/RFC2119, March 1997,
            <https://www.rfc-editor.org/info/rfc2119>.
 [RFC6395]  Gulrajani, S. and S. Venaas, "An Interface Identifier (ID)
            Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395,
            October 2011, <https://www.rfc-editor.org/info/rfc6395>.
 [RFC7761]  Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
            Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
            Multicast - Sparse Mode (PIM-SM): Protocol Specification
            (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March
            2016, <https://www.rfc-editor.org/info/rfc7761>.
 [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
            Writing an IANA Considerations Section in RFCs", BCP 26,
            RFC 8126, DOI 10.17487/RFC8126, June 2017,
            <https://www.rfc-editor.org/info/rfc8126>.
 [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
            2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
            May 2017, <https://www.rfc-editor.org/info/rfc8174>.

9.2. Informative References

 [RFC3376]  Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
            Thyagarajan, "Internet Group Management Protocol, Version
            3", RFC 3376, DOI 10.17487/RFC3376, October 2002,
            <https://www.rfc-editor.org/info/rfc3376>.
 [RFC3810]  Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
            Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
            DOI 10.17487/RFC3810, June 2004,
            <https://www.rfc-editor.org/info/rfc3810>.
 [RFC4541]  Christensen, M., Kimball, K., and F. Solensky,
            "Considerations for Internet Group Management Protocol
            (IGMP) and Multicast Listener Discovery (MLD) Snooping
            Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006,
            <https://www.rfc-editor.org/info/rfc4541>.
 [RFC4607]  Holbrook, H. and B. Cain, "Source-Specific Multicast for
            IP", RFC 4607, DOI 10.17487/RFC4607, August 2006,
            <https://www.rfc-editor.org/info/rfc4607>.

Acknowledgements

 The authors would like to thank Steve Simlo and Taki Millonis for
 helping with the original idea; Alia Atlas, Bill Atwood, Joe Clarke,
 Alissa Cooper, Jake Holland, Bharat Joshi, Anish Kachinthaya, Anvitha
 Kachinthaya, Benjamin Kaduk, Mirja Kühlewind, Barry Leiba, Ben Niven-
 Jenkins, Alvaro Retana, Adam Roach, Michael Scharf, Éric Vyncke, and
 Carl Wallace for reviews and comments; and Toerless Eckert and
 Rishabh Parekh for helpful conversation on the document.

Authors' Addresses

 Yiqun Cai
 Alibaba Group
 520 Almanor Avenue
 Sunnyvale, CA 94085
 United States of America
 Email: yiqun.cai@alibaba-inc.com
 Heidi Ou
 Alibaba Group
 520 Almanor Avenue
 Sunnyvale, CA 94085
 United States of America
 Email: heidi.ou@alibaba-inc.com
 Sri Vallepalli
 Email: vallepal@yahoo.com
 Mankamana Mishra
 Cisco Systems, Inc.
 821 Alder Drive,
 Milpitas, CA 95035
 United States of America
 Email: mankamis@cisco.com
 Stig Venaas
 Cisco Systems, Inc.
 Tasman Drive
 San Jose, CA 95134
 United States of America
 Email: stig@cisco.com
 Andy Green
 British Telecom
 Adastral Park
 Ipswich
 IP5 2RE
 United Kingdom
 Email: andy.da.green@bt.com
/home/gen.uk/domains/wiki.gen.uk/public_html/data/pages/rfc/rfc8775.txt · Last modified: 2020/04/23 01:50 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki