Network Working Group Eric C. Rosen (Editor) Internet Draft Yiqun Cai Intended Status: Proposed Standard IJsbrand Wijnands Expires: February 24, 2010 Cisco Systems, Inc. Arjen Boers August 24, 2009 MVPN: Optimized use of PIM, Wild Card Selectors, S-PMSI Join Extensions, Bidirectional Tunnels, Extranets, Hub and Spoke draft-rosen-l3vpn-mvpn-mspmsi-05.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Copyright and License Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Rosen, et al. [Page 1] Internet Draft draft-rosen-l3vpn-mvpn-mspmsi-05.txt August 2009 Abstract Specifications for a number of important topics were arbitrarily omitted from the initial MVPN specifications, so that those specifications could be "frozen" and advanced. The current document provides some of the missing specifications. 
The topics covered are: (a) Extending PE-PE PIM control mechanisms to support MPLS tunnels and IPv6 flows, (b) using Wild Card selectors to bind multicast data streams to tunnels, (c) using Multipoint-to-Multipoint Label Switched Paths as tunnels, (d) binding bidirectional customer multicast data streams to specific tunnels, (e) running PIM (i.e., sending and receiving multicast control traffic) over a set of tunnels that are created only if needed to carry multicast data traffic, (f) extranets, (g) support for anycast sources, (h) support for "hub and spoke" VPNs.

Table of Contents

   1       Specification of requirements
   2       Introduction
   2.1     Topics Covered
   2.2     Terminology
   3       S-PMSI Join Extensions
   3.1     mLDP P2MP P-Tunnels
   3.2     IPv6 (S,G) with GRE/IPv4 P-tunnels
   3.3     Encapsulation of S-PMSI Joins in UDP Datagrams
   4       Wild Cards: S-PMSI A-D Routes & S-PMSI Join Messages
   5       Binding Wild Cards to Unidirectional P-Tunnels
   5.1     Binding (C-*,C-G) to a Unidirectional P-Tunnel
   5.2     Binding (C-*,C-*) to a Unidirectional P-Tunnel
   6       S-PMSI Procedures for Using Bidirectional P-Tunnels
   6.1     Bidirectional P-Tunnels
   6.1.1   MP2MP LSPs
   6.1.2   BIDIR-PIM
   6.2     General Procedures: MS-PMSIs
   6.3     Use of Multiple Bidirectional P-tunnels
   6.3.1   Binding (C-S,C-G)
   6.3.2   Binding (C-*,C-G) Flows from Unidirectional C-trees
   6.3.3   Binding (C-*,C-G) Flows from Bidirectional C-trees
   6.3.4   Binding (C-*,C-*)
   6.3.5   Default Tunnel Identifier for MP2MP LSPs
   6.4     Single Bidirectional P-Tunnel
   6.5     Other Methods of Instantiating an MS-PMSI
   7       PIM over MS-PMSI
   8       Extranets using PIM as the MVPN Control Plane
   8.1     Default PMSI
   8.2     Red method
   8.2.1   Control Plane RPF Check
   8.2.2   Data Plane RPF Check
   8.3     Blue method
   8.4     Binding Specific Extranet C-Flows to S-PMSIs
   8.5     Two VRFs on One PE
   9       Supporting Anycast Sources with PIM Control Plane
   10      Hub and Spoke MVPNs
   10.1    Unicast Hub and Spoke VPNs
   10.2    Multicast Hub and Spoke VPNs
   11      PE-PE PIM/IPv6 over IPv4 P-Tunnel
   12      IANA Considerations
   13      Security Considerations
   14      Acknowledgments
   15      Authors' Addresses
   16      Normative References
   17      Informative References

1. Specification of requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Introduction

The documents [MVPN] and [MVPN-BGP] contain specifications for a large number of MVPN topics.
However, a number of important topics have been declared to be "out of scope" of those documents. This document provides the specifications for some of those topics.

This document is not expected to be read as a stand-alone document; terminology from [MVPN] is used freely and knowledge of [MVPN] and [MVPN-BGP] is presupposed. Any necessary procedures not explicitly specified here are as in [MVPN] and/or [MVPN-BGP].

2.1. Topics Covered

The topics covered in this document are the following:

- The use of Wild Card Selectors in S-PMSI A-D routes and S-PMSI Join Messages.

  As specified in [MVPN] and [MVPN-BGP], one can use an S-PMSI A-D route or an S-PMSI Join Message to assign a particular C-multicast flow, identified as (C-S,C-G), to a particular S-PMSI. The Wild Card Selectors specified in this document provide additional functionality:

  * One can send an S-PMSI A-D route or S-PMSI Join Message whose semantics are "assign all the traffic traveling the (C-*,C-G) tree to this S-PMSI".

  * One can send an S-PMSI A-D route or S-PMSI Join Message whose semantics are "use this S-PMSI as the default method for carrying any (C-S,C-G) or (C-*,C-G) traffic that isn't assigned to a different S-PMSI". That is, it allows for the use of S-PMSIs as the default PMSIs for carrying data traffic.

- S-PMSI Join Extensions for IPv6 and MPLS

- MS-PMSI: A new kind of PMSI instantiated by a bidirectional P-tunnel (e.g., a Multipoint-to-Multipoint Label Switched Path (MP2MP LSP) or a BIDIR-PIM tree with GRE encapsulation).

  A new kind of PMSI is defined, the MS-PMSI. An S-PMSI is defined in [MVPN] to have a single PE as its transmitter. An MS-PMSI is a set of S-PMSIs with the following property: If PE1 can transmit on the MS-PMSI, and PE2 can receive on the MS-PMSI, then PE2 can transmit on the MS-PMSI and PE1 can receive on the MS-PMSI.
  The MS-PMSI thus has the "multidirectional" property of an MI-PMSI, but the "selective" property of an S-PMSI; transmissions on the MS-PMSI may not reach all PEs of a given VPN, but the set of PEs belonging to the MS-PMSI can use it to send and receive data to/from each other.

  The most efficient way to instantiate an MS-PMSI is with a single bidirectional P-tunnel. This allows one to create P-tunnels which contain only a subset of the PEs attached to a given VPN, but which can be used by any member of that subset to transmit to the other members of the subset. MS-PMSIs are advertised using S-PMSI A-D routes or S-PMSI Join messages.

- PIM over MS-PMSI.

  [MVPN] specifies how to run PIM [PIM] as the multicast routing protocol of a particular MVPN, by running it over an MI-PMSI for that MVPN. In this specification, we provide a specification for running PIM over an MS-PMSI. When PIM is run over an MI-PMSI, there may need to be P-tunnels that only carry PIM messages, but do not carry multicast data. However, when PIM is run over an MS-PMSI, there is never any need to create a P-tunnel just for control messages; the only P-tunnels needed are those which carry multicast data.

- MVPN Extranets with PIM Control Plane.

  In an MVPN "extranet", the transmitter of a multicast traffic flow is in a different VPN than the receivers. Additional procedures are defined to determine how the traffic is associated with a particular MI-PMSI or MS-PMSI, and how the RPF checks are done.

- Support for Anycast Sources, using a PIM Control Plane

- Support for "Hub and Spoke" VPNs, using a PIM Control Plane

- Specification for constructing an IPv6 PIM control message to be sent through a P-tunnel created by an IPv4 control plane

2.2. Terminology

In the following, we will sometimes talk of a PE receiving traffic from a PMSI and then discarding it.
If PIM is being used as the multicast control protocol between PEs, this always implies that the discarded traffic will not be seen by PIM on the receiving PE.

In the following, we will sometimes speak of an S-PMSI A-D route being "ignored". When we say the route is "ignored", we do not mean that its normal BGP processing is not done, but that the route is not considered when determining which P-tunnel to use when sending multicast data, and that the MPLS label values it conveys are not used. We will generally use "ignore" in quotes to indicate this meaning.

3. S-PMSI Join Extensions

3.1. mLDP P2MP P-Tunnels

The S-PMSI Join message is defined in section 7.4.2.2 of [MVPN]. In this specification, we define the "type 2" and "type 3" S-PMSI Joins, which are used when the S-PMSI tunnel is a P2MP LSP created by mLDP, and the tunnel is to carry C-flows of, respectively, IPv4 or IPv6 multicast traffic.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |            Length             |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           C-Source
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
   |                           C-Group
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
   |                           FEC Element
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.......
   |   Padding
   +-+-+-+-+-+-+-+.......

Type (8 bits):

- 2 if C-Source and C-Group are IPv4 addresses,
- 3 if C-Source and C-Group are IPv6 addresses.

Length (16 bits): the total number of octets in the Type, Length, Reserved and Value fields combined, rounded up to the next multiple of 4, encoded as an unsigned binary integer.

Reserved (8 bits): This field SHOULD be zero when transmitted, and MUST be ignored when received.
C-Source: address of the traffic source in the VPN

- for type 2, a 32-bit IPv4 address
- for type 3, a 128-bit IPv6 address

C-Group: address of the traffic destination in the VPN

- for type 2, a 32-bit IPv4 address
- for type 3, a 128-bit IPv6 address

FEC Element: this variable length field is a P2MP FEC element, encoded as a TLV as specified in [MLDP].

Padding: 0-3 bytes, as needed for 32-bit alignment. The padding bytes SHOULD be zero on transmission and MUST be ignored on reception.

3.2. IPv6 (S,G) with GRE/IPv4 P-tunnels

[MVPN] defines the S-PMSI Join type (type 1) used when assigning IPv4 (S,G) to a GRE/IPv4 P-tunnel. When assigning IPv6 (S,G) to a GRE/IPv4 P-tunnel, S-PMSI Join type 4 is used, and the C-Source and C-Group are IPv6 addresses. The P-Group address is an IPv4 address.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |            Length             |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                           C-Source                            |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                           C-Group                             |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           P-Group                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type (8 bits): 4

Length (16 bits): 40

Reserved (8 bits): This field SHOULD be zero when transmitted, and MUST be ignored when received.

C-Source (128 bits): the IPv6 address of the traffic source in the VPN.

C-Group (128 bits): the IPv6 address of the multicast traffic destination in the VPN.

P-Group (32 bits): the IPv4 group address that the PE router is going to use to encapsulate the flow (C-Source, C-Group).

3.3. Encapsulation of S-PMSI Joins in UDP Datagrams

All S-PMSI Joins are encapsulated in UDP datagrams.
If a UDP datagram contains a type 1 or type 2 S-PMSI Join, it MUST be sent in an IPv4 datagram. If a UDP datagram contains a type 3 or type 4 S-PMSI Join, it MUST be sent in an IPv6 datagram. If an IPv4-based protocol is being used to create the P-tunnels, then the Source Address field of the IPv6 datagram carrying the UDP datagram SHOULD be the IPv4-mapped IPv6 address [RFC4291] that corresponds to the IPv4 address that the originating PE router uses when participating in the protocol used to build the P-tunnels.

A single UDP datagram MAY carry multiple S-PMSI Join Messages, as many as can fit entirely within it. If there are multiple S-PMSI Joins in a UDP datagram, they MUST be of the same S-PMSI Join type. The end of the last S-PMSI Join (as determined by the S-PMSI Join length field) MUST coincide with the end of the UDP datagram, as determined by the UDP length field. When processing a received UDP datagram that contains one or more S-PMSI Joins, a router MUST be able to process all the S-PMSI Joins that fit into the datagram.

4. Wild Cards: S-PMSI A-D Routes & S-PMSI Join Messages

As specified in [MVPN] and [MVPN-BGP], one can use an S-PMSI A-D route or an S-PMSI Join Message to assign a particular C-multicast flow, identified as (C-S,C-G), to a particular S-PMSI. However, [MVPN-BGP] does not specify any means of encoding wild cards ("*", in multicast terminology) in the Source or Group fields. Similarly, [MVPN] does not specify any means of encoding wild cards in the C-Source or C-Group fields of the S-PMSI Join messages.

This omission makes it difficult to provide optimized multicast routing for customers that use ASM ("Any Source Multicast") multicasts, in which flows may be traveling along "shared" C-trees. We use the term "shared C-trees" to refer both to the unidirectional "RPT trees" used in sparse mode, and to the bidirectional trees used in BIDIR-PIM [BIDIR-PIM].
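Looking back at sections 3.2 and 3.3, the fixed 40-octet layout of the type 4 S-PMSI Join and the rule that the last Join must end exactly at the end of the UDP payload can be sketched as follows. This is only an illustration under stated assumptions: the function names `pack_type4_join` and `split_joins` and the example addresses are hypothetical and do not come from this specification or from any implementation.

```python
import ipaddress
import struct
from typing import List

# Type(1) + Length(2) + Reserved(1) + C-Source(16) + C-Group(16) + P-Group(4)
TYPE4_LENGTH = 40


def pack_type4_join(c_source: str, c_group: str, p_group: str) -> bytes:
    """Encode one type 4 S-PMSI Join (IPv6 C-flow, IPv4 P-Group)."""
    return struct.pack(
        "!BHB16s16s4s",          # network byte order, no implicit padding
        4,                       # Type
        TYPE4_LENGTH,            # Length
        0,                       # Reserved: zero on transmission
        ipaddress.IPv6Address(c_source).packed,
        ipaddress.IPv6Address(c_group).packed,
        ipaddress.IPv4Address(p_group).packed,
    )


def split_joins(payload: bytes) -> List[bytes]:
    """Split a UDP payload into S-PMSI Joins using each Join's Length field.

    Raises ValueError if the Joins are of mixed types or if the last Join
    does not end exactly at the end of the payload.
    """
    joins: List[bytes] = []
    offset = 0
    while offset < len(payload):
        if len(payload) - offset < 4:
            raise ValueError("truncated S-PMSI Join header")
        join_type, length, _reserved = struct.unpack_from("!BHB", payload, offset)
        if length < 4 or offset + length > len(payload):
            raise ValueError("Join does not end at the end of the datagram")
        if joins and joins[0][0] != join_type:
            raise ValueError("mixed S-PMSI Join types in one datagram")
        joins.append(payload[offset:offset + length])
        offset += length
    return joins


# Hypothetical example: two identical type 4 Joins carried in one UDP payload.
example_payload = pack_type4_join("2001:db8::1", "ff3e::1234", "232.1.1.1") * 2
example_joins = split_joins(example_payload)
```

Because the Length field of the type 2 and type 3 Joins is rounded up to a multiple of 4 and therefore covers the padding, the same Length-driven walk would apply to those types as well.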
When a customer is using ASM multicast, it is useful to be able to select the set of flows that are traveling along a shared C-tree, and to bind that entire set of flows to a specified P-tunnel. Conceptually, we would like to have a way to express that we want (C-*,C-G) traffic bound to the specified P-tunnel. A multicast data packet whose source address is C-S and whose destination address is an ASM group address is said to be traveling a shared C-tree from the perspective of a given router if that router's decision to forward the packet is based upon (C-*,C-G) state rather than upon (C-S,C-G) state. Creation and use of these multicast states is specified in [PIM] and/or [MVPN-BGP].

Another useful feature would be a way of using an S-PMSI A-D route to say "by default, all multicast traffic (within a given VPN) that has not been bound to any other P-tunnel is bound to the specified P-tunnel". To do this, we need to have a way to express that we want (C-*,C-*) traffic bound to the P-tunnel.

This specification therefore establishes the following conventions:

- In an S-PMSI A-D route, the use of a zero length source or group field is to be interpreted as specifying a wild card value for the respective field. A single wild card represents all Multicast Source or Multicast Group values of all address families; there is no need to use a different wild card for IPv4 addresses than is used for IPv6 addresses.

- In an S-PMSI Join message, the use of an all-zero C-Source or C-Group field is to be interpreted as specifying a wild card value for the respective field. A wild card represents all C-Source or C-Group values of a particular address family (IPv4 or IPv6), as specified by the S-PMSI Join message type.

When wildcards are used, the following two combinations MUST be supported:

- (C-*,C-G): Source Wildcard, Group specified.

- (C-*,C-*): Source Wildcard, Group Wildcard.
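As a minimal sketch of the S-PMSI Join convention above, a receiver might classify the C-Source and C-Group fields as shown here. The names `is_wildcard` and `classify_selector` are hypothetical, and returning `None` for a specified source combined with a wildcard group is just one way to model a Join that this specification says is "ignored".

```python
from typing import Optional


def is_wildcard(field: bytes) -> bool:
    """In an S-PMSI Join, an all-zero C-Source or C-Group field is a wild card."""
    return not any(field)


def classify_selector(c_source: bytes, c_group: bytes) -> Optional[str]:
    """Return the selector expressed by a Join, or None if it is to be "ignored"."""
    source_wild = is_wildcard(c_source)
    group_wild = is_wildcard(c_group)
    if source_wild and group_wild:
        return "(C-*,C-*)"
    if source_wild:
        return "(C-*,C-G)"
    if group_wild:
        # Specified source with a wildcard group: not supported.
        return None
    return "(C-S,C-G)"


# Hypothetical IPv4 examples (type 1 or type 2 Joins carry 4-octet fields).
any_source = bytes(4)
group = bytes([239, 1, 1, 1])
source = bytes([10, 0, 0, 1])
```

The wild card in a Join covers only the address family implied by the Join type, so a caller would pass 4-octet fields for IPv4 Join types and 16-octet fields for IPv6 Join types.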
This specification does not provide support for the combination of a specified source and a group wildcard. A received S-PMSI A-D route or S-PMSI Join message specifying this combination will be "ignored".

5. Binding Wild Cards to Unidirectional P-Tunnels

5.1. Binding (C-*,C-G) to a Unidirectional P-Tunnel

Consider an S-PMSI A-D Route whose NLRI specifies (C-*,C-G), and that contains a PTA that specifies a unidirectional P-tunnel. The P-tunnel may be a P2MP LSP, or it may be a unidirectional PIM-created multicast distribution tree specified either as (P-*,P-G) or as (P-S,P-G).

Alternatively, consider an S-PMSI Join message, whose C-Source and C-Group fields specify (C-*,C-G), and that specifies a unidirectional P-tunnel (either a P2MP LSP or a unidirectional PIM-created multicast distribution tree).

If C-G is known to be an SSM group address, the S-PMSI A-D route or S-PMSI Join message is "ignored".

The semantics of binding (C-*,C-G) to a unidirectional P-tunnel are the following: the originator of the S-PMSI A-D route or S-PMSI Join message is saying that if it receives, over a VRF interface, any traffic that is traveling on the (C-*,C-G) shared tree, and if it is to forward such traffic to other PEs, then it will transmit such traffic on the specified P-tunnel. Any PE interested in receiving (C-*,C-G) traffic from the originator (i.e., any PE for which the originator is the upstream multicast hop for the (C-*,C-G) state) MUST join that P-tunnel.

5.2. Binding (C-*,C-*) to a Unidirectional P-Tunnel

The originator of an S-PMSI A-D Route or an S-PMSI Join message that binds (C-*,C-*) to a unidirectional P-tunnel is saying that, if it is required by its C-PIM instance to forward multicast traffic to any other PE, then by default it will send the traffic on the specified tunnel. The default applies to any traffic that has not been explicitly assigned to another P-tunnel.
This can be useful if BGP is used as the PE-PE control protocol and there is no MI-PMSI.

6. S-PMSI Procedures for Using Bidirectional P-Tunnels

6.1. Bidirectional P-Tunnels

This document specifies the use of two kinds of bidirectional P-tunnels: (a) MP2MP LSPs created using mLDP, and (b) BIDIR-PIM P-tunnels using GRE encapsulation.

Whenever n PEs belong to a bidirectional P-tunnel, exactly one of them is considered to be the "root" of the P-tunnel. How the root is identified depends on the particular technology of the P-tunnel. A bidirectional P-tunnel is advertised only by its root.

6.1.1. MP2MP LSPs

If the P-tunnel is an MP2MP LSP, the root is explicitly identified in the mLDP messages used to construct and join the P-tunnel [MLDP]. That is, in order for a PE to join an MP2MP LSP, the PE must know the root of the LSP.

An MP2MP LSP may be advertised in the PTA of an S-PMSI A-D route, or in the FEC Element field of an S-PMSI Join message. In either case, the MP2MP LSP is identified by a "FEC element" that contains the IP address of the "root", followed by an "opaque value" that identifies the MP2MP LSP uniquely in the context of the root's IP address. This opaque value may be configured or autogenerated, and within an MVPN, there is no need for different roots to use the same opaque value. When PIM is used as the PE-PE control protocol, the root IP address MUST be the same IP address the root uses for sending and receiving PIM control messages.

Whether the MP2MP LSP is advertised in the PTA of an S-PMSI A-D route, or in the FEC element field of an S-PMSI Join message, the advertisement MUST be originated by the PE that is the root (as specified in the "FEC element") of the MP2MP LSP. Any such advertisement that is not originated by the root MUST be "ignored".
If the "ignored" advertisement is an S-PMSI A-D route, any MPLS label specified in its PTA MUST be ignored, and any PE Distinguisher Labels specified in the route MUST be ignored.

6.1.2. BIDIR-PIM

Each BIDIR-PIM tree is identified by a unique P-group address. The P-group address for a BIDIR-PIM P-tunnel must be configured at the PE that is to be the root of the P-tunnel.

Associated with each such P-group address is a "Rendezvous Point Address" (RPA). Every PE that needs to join a particular BIDIR-PIM P-tunnel must be able to determine the RPA that corresponds to the P-tunnel's P-group address. This may be known through configuration, or by some automated means of RPA discovery. The RPA for a given P-group MUST uniquely identify the PE that is to be the root of the BIDIR-PIM tunnel.

A BIDIR-PIM P-tunnel may be advertised in the PTA of an S-PMSI A-D route, or in the P-group field of an S-PMSI Join message. In either case, the advertisement MUST be originated by the root of the BIDIR-PIM tunnel. Any advertisement that is not originated by the root MUST be "ignored". If the "ignored" advertisement is an S-PMSI A-D route, any MPLS label specified in its PTA MUST be ignored, and any PE Distinguisher Labels specified in the route MUST be ignored.

6.2. General Procedures: MS-PMSIs

According to the definition of S-PMSI in [MVPN], only a single PE can transmit onto a given S-PMSI. Note though that a single bidirectional P-tunnel containing n PEs can be used to instantiate n S-PMSIs, each of which has a different PE as its transmitter -- each PE can use the tunnel to transmit data to the other n-1 PEs.
Therefore, when a bidirectional P-tunnel is specified in an S-PMSI Join message or in the PTA of an S-PMSI A-D route, we consider the S-PMSI Join message or S-PMSI A-D route to be implicitly advertising a number of S-PMSIs: one for the PE that is advertising the P-tunnel, and one for each other PE that joins the P-tunnel. We will call the latter S-PMSIs the "implicitly advertised reverse S-PMSIs" (or just "reverse S-PMSIs").

When a bidirectional P-tunnel is specified in an S-PMSI Join message or in the PTA of an S-PMSI A-D route, we will use the term "MS-PMSI" to refer to the set of S-PMSIs (including the reverse S-PMSIs) that are thereby (explicitly or implicitly) advertised.

If the PTA in the S-PMSI A-D route contains an MPLS label, then any PE that, as a result of having received that route, transmits a packet onto the MS-PMSI will first push that label onto the packet's label stack. The interpretation of that label when the packet is received is as specified in [MVPN] and [MVPN-BGP]. The use of this label allows multiple VPNs to share a single bidirectional P-tunnel.

When MS-PMSIs are used to provide MVPN support (as detailed in subsequent sections), it is in general necessary to have more than one MS-PMSI per MVPN. There are two methods for using bidirectional P-tunnels to instantiate MS-PMSIs. In one method, a single bidirectional P-tunnel is used to instantiate all the MS-PMSIs of the MVPN. In the other method, multiple bidirectional P-tunnels are used. These two methods are considered separately. Which method is in use is a matter of provisioning.

6.3. Use of Multiple Bidirectional P-tunnels

In this method, each PE attached to a given MVPN is potentially the root of a distinct bidirectional P-tunnel. Each such PE may advertise an MS-PMSI for which the originating PE is the root. In effect, each such PE advertises an MS-PMSI.
We will sometimes refer to each MS-PMSI as a "partition", and to the PE that advertised it as the root of the MS-PMSI or the root of the partition. This notion is useful both for supporting BIDIR-PIM C-multicast traffic and for running PIM over MS-PMSI. Details are given in later sections.

The procedures that follow presuppose that when a packet is received from a bidirectional P-tunnel, it can be associated with one or more VRFs, and processed in the context of that VRF or VRFs. If the bidirectional P-tunnel was advertised in an S-PMSI Join message or in the PTA of an S-PMSI A-D route that did not specify an MPLS label, then all packets received from the P-tunnel are associated with the same set of VRFs. If the bidirectional P-tunnel was advertised in the PTA of an S-PMSI A-D route, and the PTA does specify an MPLS label, then received packets will carry a label that must be processed in order to determine the context. If the P-tunnel is an MP2MP LSP, this label appears below the label that identifies the LSP itself.

6.3.1. Binding (C-S,C-G)

When PE1 advertises an S-PMSI A-D route that binds a (C-S,C-G) flow to a bidirectional P-tunnel, or when PE1 sends an S-PMSI Join message that binds a (C-S,C-G) flow to a bidirectional P-tunnel, the semantics are as follows. PE1 is stating that any (C-S,C-G) traffic that it needs to transmit to other PEs will be transmitted on the specified P-tunnel. Any other PE that needs to receive such traffic from PE1 (i.e., any other PE that needs to receive (C-S,C-G) traffic and which has selected PE1 as the upstream PE for C-S) MUST join that P-tunnel.

If a PE has joined the P-tunnel, but does not need to receive the (C-S,C-G) traffic, or if it needs to receive (C-S,C-G) traffic but has not selected PE1 as the upstream PE for C-S, then the PE MUST discard any such received traffic.
Please note that if PIM is being used as the multicast control protocol, any traffic that is discarded will not be seen by PIM, and hence will not cause the generation of Assert messages.

6.3.2. Binding (C-*,C-G) Flows from Unidirectional C-trees

When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join message that binds (C-*,C-G) to a bidirectional P-tunnel, where C-G is not an SSM group, and the (C-*,C-G) traffic is traveling on a unidirectional shared C-tree, the semantics are as follows. PE1 is stating that any traffic to C-G that is traveling the shared C-tree and which PE1 needs to transmit to other PEs will be transmitted on the specified P-tunnel. Any other PE that needs to receive such traffic from PE1 (i.e., any other PE that needs to receive (C-*,C-G) traffic and which has selected PE1 as the upstream PE for the C-RP corresponding to the C-G group) MUST join that P-tunnel.

If a PE has joined the P-tunnel, but does not need to receive the (C-*,C-G) traffic, or if it needs to receive (C-*,C-G) traffic but has not selected PE1 as the upstream PE for the C-RP that corresponds to C-G, then the PE MUST discard any such received traffic. Please note that if PIM is being used as the multicast control protocol, traffic that is discarded will not be seen by PIM.

6.3.3. Binding (C-*,C-G) Flows from Bidirectional C-trees

When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join message that binds (C-*,C-G) to a bidirectional P-tunnel, where C-G is not an SSM group, and the (C-*,C-G) traffic is traveling on a bidirectional shared C-tree, the semantics are as follows:

- PE1 is stating that any traffic to C-G that it (PE1) needs to send downstream will be sent on the specified P-tunnel.

- Any other PE that is interested in receiving (C-*,C-G) traffic MUST join the specified P-tunnel.

- Any other PE, say PE2, that (a) has traffic to C-G to send upstream and (b) has selected PE1 as its upstream PE for the C-RPA corresponding to C-G, MUST join the specified P-tunnel, and MUST send such traffic on the specified P-tunnel. (I.e., such traffic is bound to the MS-PMSI instantiated by the bidirectional P-tunnel that is rooted at PE1.)

- If a PE, say PE3, has joined the specified P-tunnel, but does not need to receive the (C-*,C-G) traffic, or has not selected PE1 as the upstream PE for the C-RPA corresponding to C-G, then PE3 MUST NOT send any (C-*,C-G) traffic on that P-tunnel, and MUST discard any (C-*,C-G) traffic it receives on that P-tunnel.

These procedures implement, for S-PMSIs, the "partitioning" scheme described in section 11.2 of [MVPN], with each MS-PMSI being a "partition".

The specification given so far requires an S-PMSI A-D route or an S-PMSI Join message to be sent for each (C-*,C-G) that is using a bidirectional C-tree. A more efficient method is given in the next section.

6.3.4. Binding (C-*,C-*)

When PE1 advertises an S-PMSI A-D route or sends an S-PMSI Join message that binds (C-*,C-*) to a specified bidirectional P-tunnel of which PE1 is the root, the semantics are that the bidirectional P-tunnel is to be used to carry C-multicast traffic in the following sets of cases:

1. If PE1 has (C-S,C-G) traffic that is traveling on a source-specific C-tree, and PE1 needs to transmit that data to one or more other PEs, and PE1 has not bound (C-S,C-G) or (C-*,C-G) to a different P-tunnel, then the (C-S,C-G) traffic is sent by PE1 on the specified bidirectional P-tunnel.

2. If PE1 has (C-*,C-G) traffic that is traveling on a unidirectional shared C-tree, and PE1 needs to transmit that data to one or more other PEs, and PE1 has not bound (C-*,C-G) to a different P-tunnel, then the (C-*,C-G) traffic is sent by PE1 on the specified bidirectional P-tunnel.

3. If PE1 has (C-*,C-G) traffic that is traveling on a bidirectional shared C-tree, and PE1 needs to transmit that data to one or more other PEs, and PE1 has not bound (C-*,C-G) to a different P-tunnel, then the (C-*,C-G) traffic is sent by PE1 on the specified bidirectional P-tunnel.

4. Consider some other PE, PE2, that has received the S-PMSI A-D route or S-PMSI Join message from PE1. If PE2 has (C-*,C-G) traffic that is traveling on a bidirectional shared C-tree, and PE2 needs to transmit that traffic UPSTREAM, and PE2 has selected PE1 as the upstream PE for the C-RPA corresponding to C-G, and PE1 has not bound (C-*,C-G) to any other P-tunnel, then the (C-*,C-G) traffic is sent by PE2 on the specified bidirectional P-tunnel.

5. If a PE receives traffic from a particular MS-PMSI, and the traffic is traveling a unidirectional (C-*,C-G) or (C-S,C-G) tree, and the root of the MS-PMSI is not the PE's selected upstream PE for the (C-*,C-G) or (C-S,C-G), the PE MUST discard the traffic.

6. If a PE receives traffic from a particular MS-PMSI, and the traffic is traveling a bidirectional (C-*,C-G) tree, and the PE's selected upstream PE for the C-RPA corresponding to C-G is not the root of the MS-PMSI, then the PE MUST discard the traffic.
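The sender-side cases 1 to 3 and the receiver-side cases 5 and 6 can be sketched as two small predicates. This is a hedged illustration, not a definitive implementation: the `CTree` enumeration, the `WILDCARD` marker, the representation of bindings as a set of (source, group) pairs, and both function names are hypothetical. The caller is assumed to have already determined the selected upstream PE (for C-S, for the C-RP, or for the C-RPA, depending on the kind of C-tree).

```python
from enum import Enum
from typing import Set, Tuple


class CTree(Enum):
    SOURCE_SPECIFIC = "source-specific"      # (C-S,C-G) tree
    UNIDIR_SHARED = "unidirectional shared"  # (C-*,C-G) tree, e.g. an RPT
    BIDIR_SHARED = "bidirectional shared"    # (C-*,C-G) tree, BIDIR-PIM


WILDCARD = "*"


def uses_default_tunnel(tree: CTree, c_source: str, c_group: str,
                        other_bindings: Set[Tuple[str, str]]) -> bool:
    """Cases 1-3: does the root send this flow on its (C-*,C-*) P-tunnel?

    other_bindings holds the (source, group) selectors that the root has
    bound to a different P-tunnel; WILDCARD stands for C-*.
    """
    if (WILDCARD, c_group) in other_bindings:
        return False
    if tree is CTree.SOURCE_SPECIFIC and (c_source, c_group) in other_bindings:
        return False
    return True


def must_discard(ms_pmsi_root: str, selected_upstream_pe: str) -> bool:
    """Cases 5-6: discard unless the MS-PMSI root is the selected upstream PE."""
    return ms_pmsi_root != selected_upstream_pe
```

Both receiver-side cases reduce to the same comparison; what differs between them is only which address (C-S, C-RP, or C-RPA) the upstream PE was selected for.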
With respect to traffic traveling a bidirectional C-tree, these procedures implement, for S-PMSIs, the "partitioning" scheme described in section 11.2 of [MVPN], without the need to send an S-PMSI A-D route for each (C-*,C-G) that is using a bidirectional C-tree. Each PE becomes the root of an MS-PMSI, and binds the double wildcard selector to it. The MS-PMSIs serve as the "partitions".

The MS-PMSI rooted at PE1 becomes the default MS-PMSI for all traffic that PE1 needs to send downstream to other PEs. It also becomes the default MS-PMSI for all traffic that other PEs need to send upstream, as long as those other PEs have selected PE1 as the upstream PE for the C-RPA corresponding to that traffic.

Note that other PEs SHOULD NOT join the specified bidirectional P-tunnel unless they have a need to send or receive data over it. A PE knows when it needs to receive data by virtue of having certain multicast state in its C-PIM instance.

With regard to multicast data traveling on a bidirectional (C-*,C-G) tree, a PE may not know whether it has to send data until such data actually arrives over a VRF interface; the PE may be on a "sender-only" branch. However, the PE in this case would have to know, through provisioning or some automatic procedure such as the "Bootstrap Router (BSR) Mechanism for PIM" [BSR], the set of C-RPAs that are being used to support (C-*,C-G) traffic. For each C-RPA, the PE could join the bidirectional P-tunnel advertised by its selected upstream PE for that C-RPA. Alternatively, the PE could defer joining the P-tunnel until it actually has data to send.

6.3.5. Default Tunnel Identifier for MP2MP LSPs

To identify an MP2MP LSP, the S-PMSI Join message or the PMSI Tunnel Attribute of an S-PMSI A-D route contains an MP2MP FEC Element [MLDP] in its "Tunnel Identifier" field. This contains the IP address of the PE at the root of the LSP, as well as an "opaque value" which is unique at that PE.

Each PMSI Tunnel is associated at its root PE with a particular VRF, and each VRF in a given PE has a unique default RD. Therefore one way to uniquely identify an MP2MP LSP is to use an MP2MP FEC Element whose Opaque Value length is 8 and whose Opaque Value is the default RD of the associated VRF.

This method of assigning a Tunnel Identifier MUST be the default method for any PMSI Tunnel which is bound to (C-*,C-*) traffic. Other methods MAY be available as well.

Note that if aggregation of multiple VPNs onto a single default MS-PMSI is not being supported, this method of assigning the Tunnel Identifier allows each PE to algorithmically determine the Tunnel Identifier that has been assigned by a particular upstream PE. A PE decides to join a particular MS-PMSI because it has chosen that MS-PMSI's root as the upstream PE for a particular VPN-IP address. The RD of that VPN-IP address is the contents of the Opaque Value field of the corresponding MS-PMSI.

6.4. Single Bidirectional P-Tunnel

When a single bidirectional P-tunnel is used for a given VPN (rather than multiple bidirectional P-tunnels), the PE at the root of the P-tunnel MUST advertise it in the PTA of an S-PMSI A-D route.

The PE that is at the root of the P-tunnel MUST include a "PE Distinguisher Labels" attribute in either its I-PMSI A-D route, or in the S-PMSI A-D route containing the PTA that identifies the P-tunnel. The PE MUST use the attribute to bind an upstream-assigned MPLS label to the IP address of each other PE that attaches to the same MVPN (as determined by the RTs of the A-D route). That is, the PE at the root of the P-tunnel assigns a distinct label to each of the other PEs attaching to the same MVPN. This set of PEs is learned via the reception of I-PMSI A-D routes.
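Two of the rules above lend themselves to a short Python sketch: deriving the 8-octet Opaque Value of section 6.3.5 from a VRF's default RD, and assigning a distinct label to every other PE as section 6.4 requires of the root. This is an illustration under stated assumptions, not a definitive implementation. It assumes the default RD is a Type 0 RD (2-octet type, 2-octet AS number, 4-octet assigned number, as in [RFC4364]); other RD types would be packed differently. The function names and the starting label value are hypothetical, and a real implementation would take labels from its own label manager.

```python
import struct
from typing import Dict, Iterable


def default_opaque_value(asn: int, assigned_number: int) -> bytes:
    """Return the 8-octet Opaque Value for a Type 0 default RD.

    Assumption: the VRF's default RD is Type 0 (type=0, 2-octet AS
    number, 4-octet assigned number), all in network byte order.
    """
    return struct.pack("!HHI", 0, asn, assigned_number)


def assign_pe_distinguisher_labels(root_pe: str,
                                   discovered_pes: Iterable[str],
                                   first_label: int = 100000) -> Dict[str, int]:
    """Bind one distinct label to each PE other than the root.

    discovered_pes stands for the PE addresses learned from I-PMSI A-D
    routes; first_label is an arbitrary, hypothetical starting value.
    """
    labels: Dict[str, int] = {}
    next_label = first_label
    for pe in sorted(set(discovered_pes)):
        if pe == root_pe:
            continue  # the root assigns labels only to the *other* PEs
        labels[pe] = next_label
        next_label += 1
    return labels
```

Because the Opaque Value is a pure function of the RD, any PE that has selected an upstream PE for a VPN-IP address can compute the same 8 octets locally, which is the algorithmic determination described in section 6.3.5.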
The procedures for using a single bidirectional P-tunnel differ from the procedures for using multiple bidirectional P-tunnels only in the following way. Let PE1 be the root of the P-tunnel. When a packet that is traveling on a unidirectional C-tree is transmitted on the P-tunnel by a particular PE, say PE2, PE2 must push on the packet's label stack the label that PE1 assigned to PE2 via the procedure above. When a packet that is traveling on a bidirectional C-tree is transmitted on the P-tunnel by PE2, PE2 must push on the packet's label stack the label that PE1 assigned to PE3, where PE3 is the upstream PE that PE2 has selected for the C-RPA corresponding to C-G.

For unidirectional flows, this allows the transmitter to be identified, and for bidirectional flows, this allows the partition to be identified. Packets received from the wrong upstream PE or from the wrong partition MUST be discarded.

(In effect, this is a case of tunnel hierarchy, where the PE Distinguisher Labels represent a set of MP2MP LSPs, each of which instantiates an MS-PMSI, but those LSPs are all tunneled through a single bidirectional P-tunnel.)

If the PTA identifying the bidirectional P-tunnel contains an MPLS label, then that label shall appear in the label stack immediately preceding the label specified in the PE Distinguisher Labels attribute.

6.5. Other Methods of Instantiating an MS-PMSI

Strictly speaking, what is required to instantiate an MS-PMSI is not that the P-tunnels be bidirectional, but that they provide an any-to-any multicast service for some subset of the PEs in the MVPN.

One could, for instance, instantiate an MS-PMSI as a PIM sparse mode group. In this case, the PTA of the S-PMSI A-D routes would identify a "PIM-SM Tree". Every PE would have to advertise a PIM-SM tree with a distinct group address, and the PE advertising a given group address would be considered to be the "root" of the corresponding MS-PMSI.

Generally speaking, this is not an efficient method of instantiating an MS-PMSI. However, it can be useful in certain circumstances, such as the "hub and spoke" MVPN discussed in section 10.

7. PIM over MS-PMSI

[MVPN] provides two alternative means of distributing C-multicast routing information: PIM or BGP. Procedures for running PIM over MI-PMSI are specified in that document. However, a number of efficiencies can be obtained by running PIM instead over an MS-PMSI, instantiated as a set of MP2MP LSPs. The procedures for this are as follows.

Each PE that attaches to a given MVPN MUST originate an Intra-AS I-PMSI A-D route that does NOT contain a PTA. Each such PE MUST also originate an S-PMSI A-D route whose PTA is a bidirectional P-tunnel rooted at the originating PE. This S-PMSI A-D route MUST bind the LSP to the "double wildcard" (*,*). The use of these bidirectional P-tunnels for sending and receiving data traffic is as specified in the previous section.

In effect, each PE in the MVPN has advertised an MS-PMSI for which it is the root.

If PE1 needs to direct a PIM Join/Prune message to PE2, PE1 MUST join PE2's MS-PMSI by joining the P-tunnel advertised in PE2's corresponding S-PMSI A-D route. The PIM J/P messages MUST be sent over that MS-PMSI. If PE1 does not need to direct a PIM Join/Prune message to PE2, then PE1 SHOULD NOT join the P-tunnel advertised in PE2's S-PMSI A-D route, as PE1 will not be receiving any multicast data on that LSP.

Any PE that sends a PIM Join/Prune message on a given P-tunnel is automatically considered to be a PIM adjacency of every PE that receives the message on that P-tunnel. This implies that any PE receiving on the LSP MUST accept a PIM Join/Prune message on that P-tunnel from any other PE, even if the PE that transmitted the Join/Prune message has not previously transmitted a PIM Hello. That is, the "adjacency relationship" does not depend on the reception of PIM Hellos.

PIM Hellos may still be useful for OAM purposes. Any PIM Hellos that PE1 sends MUST be sent on the P-tunnel advertised in PE1's S-PMSI A-D route above.

Standard PIM procedures are used, except for:

- The above change in the adjacency maintenance procedures.

- Changes in the "RPF determination" or "RPF checking" procedures as may be defined in [MVPN] or in subsequent sections of this document (such as section 8.2).

Note that the data handling procedures of the previous section will prevent PIM from ever seeing any packets that come from the wrong transmitter or that are in the wrong partition; when such packets are received they are discarded, rather than being passed to PIM's state machinery. As a result, such packets do not cause Asserts to be generated. Other standard PIM procedures, such as Join Suppression and Prune Override, may come into play, however.

By running PIM over MS-PMSI instead of over MI-PMSI, one completely avoids the need to have PEs join P-tunnels that would carry only control messages. A PE need not ever join a particular P-tunnel unless it either has data to send on it, or needs to receive data on it.

It is also possible to run PIM over MS-PMSI when a single bidirectional P-tunnel is used. In that case, the PE at the root of the P-tunnel MUST include a PE Distinguisher Labels attribute in its S-PMSI A-D route, and must assign a label to each of the other PEs that attach to the same MVPN. (This set is auto-discovered through the I-PMSI A-D routes.)
When sending a PIM J/P packet, one must push onto its label stack the label identifying the PE to which the J/P packet is being directed. When receiving a PIM J/P packet, a PE discards any that are not carrying the PE distinguisher label that has been bound to its own IP address.

All other MVPN-specific PIM procedures are as specified in [MVPN].

8. Extranets using PIM as the MVPN Control Plane

Suppose there are two VPNs. VPN1 consists of a set of VRFs, each of which has been configured with RT1 as its export and import Route Target. VPN2 consists of a set of VRFs, each of which has been configured with RT2 as its export and import Route Target.

For convenience, we will use the term "blue" instead of "RT1" and the term "red" instead of "RT2". Thus we will call VPN1 the "blue VPN" and VPN2 the "red VPN". Similarly, the blue VPN consists of a number of "blue sites" containing "blue systems"; these sites are attached to PEs via VRF interfaces that are associated with "blue VRFs".

We want to create an MVPN extranet in which blue receivers can join multicast groups whose sources and/or RPs are red.

The first step is to ensure that the blue VRFs (or the subset of blue VRFs whose attached sites are allowed to receive multicasts from red sources) import routes to the red sources. This is done as follows:

- The red VRFs are configured so that the subset of red routes that are to be part of the extranet are exported with a second RT value (call it RT3), as well as with RT2. For convenience, we will call RT3 "violet".

- The blue VRFs are configured so that they import violet routes as well as blue routes.

There are two different methods of providing the extranets, which we shall call the "red method" and the "blue method". (Remember that the red VPN contains the transmitter, and the blue VPN contains the receivers.)

This document assumes that in the case of non-SSM extranet multicast groups, the mapping between a group address and an RP is pre-configured in the PEs.

This document does not provide support for bidirectional C-trees in extranets.

8.1. Default PMSI

Some of the procedures subsequently specified in this section are largely independent of whether PIM is used with (a) an MI-PMSI or (b) an MS-PMSI that has been bound to the double wildcard. We will use the term "default PMSI" as a general term to mean either (a) or (b), depending upon which technique is actually being used in a given network.

8.2. Red method

In the "red method", extranet multicasts are carried by default in the default PMSI of the red VPN, which we will of course call the "red PMSI".

To use this method, blue VRFs must be configured to import "red" I-PMSI A-D routes and red S-PMSI A-D routes. If MI-PMSIs are being used, the blue VRFs must immediately join the P-tunnels specified in the red I-PMSI A-D routes. If MS-PMSIs are being used, a blue VRF need not join the MS-PMSI P-tunnel rooted at a particular PE unless a PIM Join needs to be sent to that PE.

The PIM C-instance associated with a blue VRF will treat the red and blue default PMSIs as two different PIM interfaces.

The blue VRFs must also be configured to "associate" violet unicast routes with the red default PMSI. What this means is that the red default PMSI will be considered to be the RPF interface for the violet unicast routes. The RPF interface for the blue unicast routes remains, as usual, the blue default PMSI.

All that remains to be specified is how the control plane and data plane RPF checks are done. Apart from these MVPN-specific procedures for the RPF check, ordinary PIM procedures are used.

8.2.1. Control Plane RPF Check

Suppose a PE receives a PIM Join(S,G) from a CE, over a VRF interface that is associated with a blue VRF. The PE does the RPF check for S by looking up S in the blue VRF.
If the route matching S is a blue route (i.e., carries the blue RT but not the violet RT), then a Join is sent over the blue default PMSI. However, if the route matching S is a violet route (i.e., carries the violet RT), a Join is sent over the red default PMSI.

If the PE receives a PIM Join(*,G) from a CE, the RPF check is done against the address of the corresponding RP; otherwise the procedure is the same.

8.2.2. Data Plane RPF Check

Suppose a red default PMSI has been associated with a blue VRF, as specified above, and an (S,G) multicast data packet is received from the red default PMSI. Then S is looked up in the (blue) VRF. If it matches a violet route, the packet is forwarded normally. However, if it matches a blue route, the packet is discarded as having failed the RPF check.

This prevents the blue sites from receiving packets from red transmitters, except in the case where routes to the red transmitters have been explicitly imported into the blue VRF.

8.3. Blue method

In the "blue method", extranet multicasts are carried by default in the default PMSI of the blue VPN.

In the blue method, the red VRFs must be configured to import "blue" I-PMSI and S-PMSI A-D routes. If MI-PMSIs are being used, the P-tunnels specified therein must be joined immediately. If MS-PMSIs are being used, the P-tunnels need not be joined unless and until it is necessary to send a PIM Join to the root of the P-tunnel.

The PIM C-instance associated with a red VRF will treat the red default PMSI and the blue default PMSI as two different PIM interfaces.

PIM Joins from blue receivers are then received at the red VRF over the blue PMSI, whereas PIM Joins from red receivers are received at the red VRF over the red PMSI. As a result, PIM may add one or the other or both PMSIs to a particular multicast tree's olist.

In this method, the blue VRFs are associated with only one default PMSI, so the RPF check for both blue and violet sources (and RPs) always resolves to that PMSI. Hence the special RPF check procedures of the red method are not necessary. However, a PE with a red VRF may need to transmit multicast traffic on more than one MI-PMSI.

Note that since the data plane RPF check of section 8.2.2 is not needed, one does not really need a "violet" RT value. Rather, one may simply configure certain routes from the red VRF to be exported with both the red and the blue RTs.

8.4. Binding Specific Extranet C-Flows to S-PMSIs

If the procedure of [MVPN] section 7.4.2 is used, the S-PMSI Join message MUST be sent on whatever default PMSI or default PMSIs are used to carry the C-flow identified in the message.

If the procedure of [MVPN] section 7.4.1 is used, then procedures differ slightly depending upon whether the red method or the blue method is in use.

If the red method is in use, and if a C-flow whose target source is exported from a red VRF is bound to an S-PMSI, then the S-PMSI A-D route that specifies the binding must carry both the red RT and the violet RT. Blue VRFs must be configured to import the violet S-PMSI A-D routes.

If the blue method is in use, and if a C-flow whose target source is exported from a red VRF is bound to an S-PMSI, then the S-PMSI A-D route that specifies the binding:

- must carry the red RT if the C-flow has any receivers on the red default PMSI, and

- must carry the blue RT if the C-flow has any receivers on the blue default PMSI.

8.5. Two VRFs on One PE

It is possible that a red VRF and a blue VRF will exist on the same PE. Then by the above procedures, one of these VRFs will need to join a PMSI that it can use for sending control packets to and receiving data packets from the other. However, the protocol used to construct the P-tunnels instantiating the PMSI may not provide a mechanism by which a given PE can join a P-tunnel of which it is the root. In this case, the PE implementation MUST support a local function whereby a given VRF, say VRF1, can "join" a P-tunnel whose root is another VRF, say VRF2, on the same PE. The PE MUST also support a local function whereby packets can be transmitted from one VRF to another just as if the VRFs had been on separate PEs.

9. Supporting Anycast Sources with PIM Control Plane

Suppose that some customer site contains router C-R1 and some other customer site in the same VPN contains router C-R2. Suppose further that each sends a PIM Join(C-S,C-G) message towards C-S. Ordinarily, the result will be to create a single C-tree whose root is C-S and whose leaves include C-R1 and C-R2.

However, in some deployment scenarios, C-S may be an anycast address that belongs to two or more different sources, say C-S1 and C-S2. Let's suppose that these two sources attach to the VPN backbone through two different PEs, and let's further suppose that C-S1 is "close" to C-R1, and C-S2 is "close" to C-R2. Then even though both C-R1 and C-R2 send Join(C-S,C-G) messages, what is really desired is to create two C-trees, one rooted at C-S1 (with C-R1 as a leaf) and one rooted at C-S2 (with C-R2 as a leaf).

If the data traffic traveling along both C-trees is carried on a single MI-PMSI, it is important that a (C-S,C-G) data packet is forwarded towards C-R1 only if the packet is actually traveling on the C-tree rooted at C-S1, and not on the C-tree rooted at C-S2.

To ensure this, if a particular MVPN is providing anycast service, its PEs MUST use the procedure described in section 9.1.1 of [MVPN], and MUST NOT use the procedures described in sections 9.1.2 and 9.1.3 of [MVPN]. This also enables the use of C-RPs that have anycast addresses.
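The effect on a shared MI-PMSI can be illustrated with a small Python sketch. It assumes that the procedure of section 9.1.1 of [MVPN] amounts to each receiving PE discarding packets whose ingress PE is not the upstream PE that the receiving PE selected for C-S; how the ingress PE of a packet is identified is outside this sketch. The function name, the PE names, and the packet representation are hypothetical.

```python
from typing import List, Tuple

# (ingress PE, payload): a hypothetical stand-in for a (C-S,C-G) packet
# received from the MI-PMSI together with the PE that transmitted it.
Packet = Tuple[str, str]


def accept_from_selected_upstream(packets: List[Packet],
                                  selected_upstream_pe: str) -> List[str]:
    """Keep only the payloads that arrived from the selected upstream PE."""
    return [payload for ingress_pe, payload in packets
            if ingress_pe == selected_upstream_pe]


# Both anycast sources transmit (C-S,C-G) traffic onto the same MI-PMSI.
mi_pmsi = [("PE-A", "from C-S1"), ("PE-B", "from C-S2")]

# The PE serving C-R1 selected PE-A (C-S1 is "close");
# the PE serving C-R2 selected PE-B (C-S2 is "close").
towards_c_r1 = accept_from_selected_upstream(mi_pmsi, "PE-A")
towards_c_r2 = accept_from_selected_upstream(mi_pmsi, "PE-B")
```

Each receiver therefore sees traffic from only one of the two C-trees, even though both trees share one MI-PMSI.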
Furthermore, if anycast source support is provided for a particular multicast group C-G, all PEs MUST execute the procedure described in section 4.2.1 of [PIM], and MUST act as if SwitchToSPTDesired(S,G) (defined in [PIM] section 4.2.1) is true when the first (S,G) packet (from any PE) is received. (This procedure MUST be executed by each PE even if the PE is not the "last hop" of the C-tree.) This will ensure that each PE receives and forwards (C-S,C-G) traffic from the appropriate source C-tree, even if the PE has received only Join(C-*,C-G) messages but not Join(C-S,C-G) messages from its directly attached CEs.

10. Hub and Spoke MVPNs

The Layer 3 Virtual Private Network (L3VPN) technology of [RFC4364] generally provides an "any-to-any" network service, where any system at one site of a VPN can send traffic to and receive traffic from a system at any other site. Or more precisely, nothing in the procedures governing the distribution of routing information in the VPN prevents any-to-any communication.

In some deployments, however, it has been convenient to distinguish between two kinds of VPN site, the "hub site" and the "spoke sites". In this section, we first describe how the "hub and spoke" configuration affects the distribution of unicast routing. We then specify a means of providing multicast VPN service in the hub and spoke configuration.

10.1. Unicast Hub and Spoke VPNs

In a unicast hub and spoke VPN:

- any system in a hub site can send traffic to and receive traffic from any other system in a hub site;

- any system in a hub site can send traffic to and receive traffic from any system in a spoke site;

- any system in a spoke site can send traffic to and receive traffic from any system in a hub site;

- a system in one spoke site cannot send traffic to and cannot receive traffic from a system in a different spoke site.
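The four rules above reduce to a single predicate, sketched here in Python purely as an illustration. The names SiteKind and may_communicate are hypothetical, and the site identifiers are arbitrary strings.

```python
from enum import Enum


class SiteKind(Enum):
    HUB = "hub"
    SPOKE = "spoke"


def may_communicate(kind_a: SiteKind, site_a: str,
                    kind_b: SiteKind, site_b: str) -> bool:
    """Return True if systems at the two sites may exchange traffic."""
    if kind_a is SiteKind.SPOKE and kind_b is SiteKind.SPOKE:
        # Only traffic between two *different* spoke sites is forbidden.
        return site_a == site_b
    # Hub-to-hub, hub-to-spoke and spoke-to-hub are all permitted.
    return True
```

For example, may_communicate(SiteKind.SPOKE, "s1", SiteKind.SPOKE, "s2") is False, while every combination involving a hub site is True.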
Using the technology of [RFC4364], it is possible to create this sort of "hub and spoke" VPN by suitably restricting the flow of routing information among the sites.

One way to construct a hub and spoke VPN is as follows:

- Within a given VPN, every site is denoted as either a hub site or a spoke site.

- On a given PE, every spoke site is attached to a distinct VRF (i.e., all interfaces of that VRF lead to the same spoke site). We will call these "Spoke VRFs".

- On a given PE, any number of hub sites can be attached to a single "Hub VRF".

- Each Hub VRF is configured with an export-RT that we shall call "Hub_Route", and with a pair of import-RTs, one of which is "Hub_Route", and the other of which we shall call "Spoke_Route". (Of course, each hub and spoke VPN has its unique Hub_Route RT and its unique Spoke_Route RT.)

- Each Spoke VRF is configured with export-RT "Spoke_Route" and import-RT "Hub_Route".

With this configuration, the Spoke VRFs will contain only routes to systems at hub sites, whereas the Hub VRFs will contain routes to systems at both hub and spoke sites. Even if two spoke sites attach to the same PE, they cannot communicate directly, because they are associated with different VRFs, and their respective VRFs do not import each other's routes.

(There are implementation techniques that can eliminate the need to configure a separate VRF for each spoke site on a PE, but these are out of scope of this document.)

There are several different variations on this theme. For example, in a particular VPN, spoke-to-spoke communication may be allowed, but only if the spoke-to-spoke traffic first enters a hub site. Some system at the hub site would be responsible for "turning the traffic around", i.e., sending it back to the VPN backbone for delivery to the target spoke site. This can be useful if the "turnaround system" at the hub site performs some sort of inspection of the spoke-to-spoke traffic and then applies authorization policies of some sort. To provide this sort of Hub and Spoke VPN:

- The total set of routes exported by the Hub VRFs must include routes that "summarize" all the routes exported by the Spoke VRFs. For example, one or more Hub VRFs may export a default route. In the Hub VRFs, each of these summary routes will have one of the VRF interfaces as its next hop interface.

- When such a summary route is exported as a VPN-IP route, it MUST be advertised with a label for which the Next Hop Label Forwarding Entry (see section 3.10 of [RFC3031]) specifies one of the VRF interfaces as the next hop interface.

In this scenario, if a PE receives traffic from a spoke site, and the IP destination address of that traffic is a system in another spoke site, the traffic will be tunneled to a PE that attaches to a hub, and then sent over one of the Hub VRF's "VRF interfaces", i.e., sent to a Hub CE router. The Hub PE, when it receives the tunneled packet, does not look up the packet's IP destination address in the Hub VRF, but rather forwards based on the MPLS label. If the Hub CE decides (possibly after inspecting the packet and authorizing the transmission) to "turn the packet around", sending it back to the PE, the PE will look up the IP destination address in the Hub VRF, find that it matches one of the routes imported from a spoke VRF, and tunnel the packet to the PE that attaches to the corresponding spoke site.

Note that setting up a hub and spoke VPN is just a matter of proper configuration. There are no protocol differences between a Hub and Spoke VPN and any other kind of RFC 4364 VPN.

10.2. Multicast Hub and Spoke VPNs

Sometimes it is necessary to support multicast service over a Hub and Spoke VPN.
In this scenario, it is generally desired to provide an MVPN service with the following properties:

- A receiver at a hub site may receive multicast traffic from a transmitter at a spoke site (including the case where the RP is at a spoke site).

- A receiver at a spoke site may receive multicast traffic from a transmitter at a hub site (including the case where the RP is at a hub site).

- A receiver at a spoke site must not be allowed to join a shared tree (i.e., a (C-*,C-G) tree) whose root (i.e., the RP) is at a different spoke site.

- A receiver at a spoke site must not be allowed to receive multicast traffic from a transmitter at a different spoke site, except possibly in the case where the traffic traverses a hub site on its path from one spoke site to the other.

This type of MVPN service can be provided by using a variation of the "PIM over MS-PMSI" model described in section 7.

In this model, each PE advertises an MS-PMSI for each VRF. If these advertisements are made using BGP S-PMSI A-D routes, the A-D route originating at a Hub VRF carries the "Hub_Route" RT; an A-D route originating at a spoke VRF carries the "Spoke_Route" RT. That is, the S-PMSI A-D routes originating at a given VRF carry the same RT as the unicast routes originating at that VRF.

To support Hub and Spoke functionality, the MS-PMSIs originating at the spoke VRFs may all specify the same P-tunnel identifier. Similarly, the MS-PMSIs originating at the hub VRFs may all specify the same P-tunnel identifier, but this must be a different P-tunnel identifier than the one specified for the MS-PMSIs originating from the spoke VRFs. In this case, it is convenient to speak of the Hub and Spoke infrastructure as consisting of two MS-PMSIs, a "spoke-rooted" MS-PMSI and a "hub-rooted" MS-PMSI.

As discussed in section 6.5, it is possible to instantiate an MS-PMSI as a set of PIM-SM trees. This means of instantiation can be useful in Hub and Spoke scenarios when GRE/PIM tunneling is used. In this case, for a given VPN, there MAY be a single sparse mode group address associated with the MS-PMSIs rooted at the spoke VRFs, and a second sparse mode group address associated with the MS-PMSIs rooted at the hub VRFs.

The result is the creation of two distinct sets of P-tunnels for the VPN, one set used to carry data traffic from spoke sites to hub sites (and PIM control traffic in the opposite direction), and the other set used to carry data traffic from hub sites to spoke sites (and PIM control traffic in the opposite direction).

Suppose that a spoke VRF and a hub VRF are on the same PE, and that an MS-PMSI advertisement exported by one of those VRFs is imported by the other. The PE implementation MUST support a local function whereby the importing VRF can "join" the MS-PMSI exported by the other VRF, and MUST support a local function whereby packets transmitted from one VRF onto the MS-PMSI are received by the other VRF (if and only if the latter VRF has joined the MS-PMSI exported by the former).

Since spoke VRFs do not import each other's S-PMSI A-D routes, and do not import each other's unicast routes, and since there is no MI-PMSI, there is no way for a C-Join to be transmitted directly from one spoke VRF to another. If a CE at a spoke site sends a Join(S,G) to its PE, the PE will forward it on the hub-rooted MS-PMSI advertised by the hub site that is the BGP next hop for S; no spoke VRF can receive PIM control packets on that MS-PMSI.

In this scheme, each hub VRF joins two MS-PMSIs, the spoke-rooted MS-PMSI and the hub-rooted MS-PMSI. Normal PIM procedures would see these as two PIM interfaces. If a hub VRF at PE1 receives a Join(S,G) from the hub-rooted MS-PMSI, where S is at a spoke site, normal PIM/MVPN procedures would cause PE1 to send a Join(S,G) over the spoke-rooted MS-PMSI towards a PE that attaches to S's site. If these procedures are followed, a receiver at a spoke site could get multicast data from a different spoke site; the data would get "turned around" at a PE that attaches to a hub site. Since this violates the requirements as stated above, a PE providing Hub and Spoke MVPN service MUST NOT send a Join message on one MS-PMSI as a result of having received a Join message over another. As a result, data traffic received by a hub PE on one of these MS-PMSIs will never get forwarded by that PE onto the other MS-PMSI.

Note that this does not completely prevent a receiver in a spoke site from being able to receive multicast data from a transmitter in a different spoke site. For example, suppose:

- A receiver R1 at a spoke site, Site1, joins a (C-*,C-G) tree,

- The RP for (C-*,C-G) is at a hub site,

- A system S2 at a different spoke site, Site2, transmits multicast traffic to group C-G,

- The hub site containing the RP is multiply connected to the SP backbone,

- The best path from R1 to the RP enters the RP's hub site via a particular PE-CE link, link1,

- The best path from S2 to the RP enters the RP's hub site via a different PE-CE link, link2.

In this case, it is possible for multicast data traffic to travel from S2 to link2 to the RP to link1 to R1. If this is not desirable, the customer must ensure that transmitters at spoke sites do not send data to C-G addresses for which the RP is at a hub site.

The procedures described in this section are compatible with the procedures of section 9.

11. PE-PE PIM/IPv6 over IPv4 P-Tunnel

If a VPN customer is using PIM over IPv6, but the SP is using an IPv4 infrastructure (i.e., is using an IPv4-based control protocol to construct its P-tunnels), then the PE routers will need to originate PIM control messages in IPv6 form, and send such messages through the P-tunnels.
The IPv6 Source Address field of any such IPv6 packet SHOULD be the IPv4-mapped IPv6 address [RFC4291] that corresponds to the IPv4 address that the originating PE router uses when participating in the protocol used to build the P-tunnels.

If the IPv6 Destination Address field is the multicast address ALL-PIM-ROUTERS, the IPv6 form of the address (ff02::d) is used.

12. IANA Considerations

[MVPN] creates an IANA registry for the "S-PMSI Join Message Type Field". This document requires three new values:

- The value 2 should be registered, and its description should read "mLDP P2MP S-PMSI for IPv4 traffic (unaggregated)".

- The value 3 should be registered, and its description should read "mLDP P2MP S-PMSI for IPv6 traffic (unaggregated)".

- The value 4 should be registered, and its description should read "GRE S-PMSI for IPv6 traffic (unaggregated)".

13. Security Considerations

There are no additional security considerations beyond those of [MVPN] and [MVPN-BGP].

14. Acknowledgments

Rajesh Sharma contributed significantly to sections 9 and 10. We also thank Karthik Subramanian, DP Ayadevara, and Rayen Mohanty.

15. Authors' Addresses

Arjen Boers
E-mail: arjen@boers.com

Yiqun Cai
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA, 95134
E-mail: ycai@cisco.com

Eric C. Rosen
Cisco Systems, Inc.
1414 Massachusetts Avenue
Boxborough, MA, 01719
E-mail: erosen@cisco.com

IJsbrand Wijnands
Cisco Systems, Inc.
De kleetlaan 6a Diegem 1831
Belgium
E-mail: ice@cisco.com

16. Normative References

[BIDIR-PIM] "Bidirectional Protocol Independent Multicast", Handley, Kouvelas, Speakman, Vicisano, RFC 5015, October 2007

[MLDP] "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", Minei, Kompella, Wijnands, Thomas, draft-ietf-mpls-ldp-p2mp-07.txt, July 2009

[MVPN] "Multicast in MPLS/BGP IP VPNs", Rosen, Aggarwal, et al., draft-ietf-l3vpn-2547bis-mcast-08.txt, March 2009

[MVPN-BGP] "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", Aggarwal, Rosen, Morin, Rekhter, Kodeboniya, draft-ietf-l3vpn-2547bis-mcast-bgp-07.txt, April 2009

[PIM] "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)", Fenner, Handley, Holbrook, Kouvelas, RFC 4601, August 2006

[RFC2119] "Key words for use in RFCs to Indicate Requirement Levels", Bradner, RFC 2119, March 1997

[RFC4291] "IPv6 Addressing Architecture", Hinden, Deering, RFC 4291, February 2006

[RFC3031] "MPLS Architecture", Rosen, Viswanathan, Callon, RFC 3031, January 2001

[RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., RFC 4364, February 2006

17. Informative References

[BSR] "Bootstrap Router (BSR) Mechanism for PIM", N. Bhaskar, et al., RFC 5059, January 2008