
Recently a friend of mine asked a question which can be summarized as:

Does the Route Distinguisher or Route Target play any role in BGP best path selection in MPLS BGP VPNs?

I'll try to answer this question based on the following scenario:
[Figure: reference lab diagram]
We have two PE routers running IOS XR and three different VRFs.

On PE2 I have created two loopbacks with the same 10.10.11.0/24 subnet and placed one in each of the VRFs V1001 and V1002. PE2 exports 10.10.11.0/24 from these VRFs with RT 100:1001 and 100:1002 respectively.

PE2:

vrf V1001
 address-family ipv4 unicast
  export route-target
   100:1001
!
vrf V1002
 address-family ipv4 unicast
  export route-target
   100:1002
  !
interface Loopback1001
 vrf V1001
 ipv4 address 10.10.11.1 255.255.255.0
!
interface Loopback1002
 vrf V1002
 ipv4 address 10.10.11.1 255.255.255.0
!
router bgp 100
 !
 vrf V1001
  rd 172.16.16.2:1001
  address-family ipv4 unicast
   redistribute connected
  !
 !
 vrf V1002
  rd 172.16.16.2:6
  address-family ipv4 unicast
   redistribute connected

PE1 has VRF V1003, which imports both RTs.

PE1:

vrf V1003
 address-family ipv4 unicast
  import route-target
   100:1001
   100:1002

Both PE1 and PE2 have BGP VPNv4 sessions to Route Reflectors only.

Let's look at the outputs on PE1:

RP/0/RSP0/CPU0:PE1#show bgp vpnv4 unicast rd all 
Status codes: s suppressed, d damped, h history, * valid, > best
              i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network            Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 172.16.16.1:1003 (default for vrf V1003)
* i10.10.11.0/24      172.16.16.2              0    100      0 ?
*>i                   172.16.16.2              0    100      0 ?
Route Distinguisher: 172.16.16.2:1001
*>i10.10.11.0/24      172.16.16.2              0    100      0 ?
* i                   172.16.16.2              0    100      0 ?
Route Distinguisher: 172.16.16.2:6
*>i10.10.11.0/24      172.16.16.2              0    100      0 ?
* i                   172.16.16.2              0    100      0 ?

Since the 10.10.11.0/24 prefixes exported from PE2 have different RDs attached, they are distinct VPNv4 prefixes and both are reflected by RR1 and RR2. Effectively, PE1 needs to run best path selection for two prefixes:

  • 172.16.16.2:1001:10.10.11.0/24
  • 172.16.16.2:6:10.10.11.0/24
RP/0/RSP0/CPU0:PE1#show bgp vpnv4 unicast rd 172.16.16.2:1001

Status codes: s suppressed, d damped, h history, * valid, > best
              i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network            Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 172.16.16.2:1001
*>i10.10.11.0/24      172.16.16.2                  0    100      0 ?
* i                   172.16.16.2                  0    100      0 ?
RP/0/RSP0/CPU0:PE1#show bgp vpnv4 unicast rd 172.16.16.2:6  
Status codes: s suppressed, d damped, h history, * valid, > best
              i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network            Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 172.16.16.2:6
*>i10.10.11.0/24      172.16.16.2                  0    100      0 ?
* i                   172.16.16.2                  0    100      0 ?
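
The effect of the RD can be illustrated with a minimal Python sketch. It is only a toy model and not how any router implements its BGP table: the function name `build_vpnv4_table` and the tuple layout are assumptions made for illustration. The addresses and RDs are taken from the outputs above. Paths are grouped by the full VPNv4 key (RD plus IPv4 prefix), so the same IPv4 prefix under two RDs ends up as two separate entries, each with its own set of paths to compare.

```python
from collections import defaultdict


def build_vpnv4_table(updates):
    """Group received paths by their full VPNv4 key: (RD, IPv4 prefix)."""
    table = defaultdict(list)
    for rd, prefix, neighbor in updates:
        table[(rd, prefix)].append(neighbor)
    return dict(table)


# Paths as seen on PE1: each RD/prefix pair arrives once from each route reflector.
updates = [
    ("172.16.16.2:1001", "10.10.11.0/24", "172.16.16.11"),
    ("172.16.16.2:1001", "10.10.11.0/24", "172.16.16.12"),
    ("172.16.16.2:6", "10.10.11.0/24", "172.16.16.11"),
    ("172.16.16.2:6", "10.10.11.0/24", "172.16.16.12"),
]

table = build_vpnv4_table(updates)
for (rd, prefix), neighbors in table.items():
    print(f"{rd}:{prefix} has {len(neighbors)} paths")
```

Best path selection then runs independently inside each of the two entries, which is why the RD acts as part of the prefix rather than as an attribute being compared.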

Since the IGP distance to the next hop is the same and the cluster-list length is the same, PE1 has to choose based on the neighbor address.

RP/0/RSP0/CPU0:PE1#show bgp vpnv4 unicast rd 172.16.16.2:6 10.10.11.0
Tue Sep 15 02:08:00.578 UTC
BGP routing table entry for 10.10.11.0/24, Route Distinguisher: 172.16.16.2:6
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker            4850567     4850567
Last Modified: Sep 11 02:41:57.605 for 3d23h
Paths: (2 available, best #1)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24070
      Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, import-candidate, not-in-vrf
      Received Path ID 0, Local Path ID 1, version 4850567
      Extended community: RT:100:1002 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
  Path #2: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.12 (172.16.16.2)
      Received Label 24070
      Origin incomplete, metric 0, localpref 100, valid, internal, not-in-vrf
      Received Path ID 0, Local Path ID 0, version 0
      Extended community: RT:100:1002 
      Originator: 172.16.16.2, Cluster list: 172.16.16.12

The lower neighbor address wins, so in both cases the path reflected by RR1 (172.16.16.11) is chosen.
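
As a rough illustration of this last tie-breaker, here is a small Python sketch. The function name `pick_by_neighbor_address` and the dictionary layout are hypothetical; only the two route reflector addresses come from the output above. The comparison is done on parsed addresses rather than strings, because a plain string comparison would order "172.16.16.9" after "172.16.16.11".

```python
import ipaddress


def pick_by_neighbor_address(paths):
    """Return the path whose advertising neighbor has the numerically lowest address."""
    return min(paths, key=lambda p: ipaddress.ip_address(p["neighbor"]))


# Two otherwise identical paths for one VPNv4 prefix, one per route reflector.
paths = [
    {"neighbor": "172.16.16.12", "cluster_list": ["172.16.16.12"]},
    {"neighbor": "172.16.16.11", "cluster_list": ["172.16.16.11"]},
]

best = pick_by_neighbor_address(paths)
print(best["neighbor"])
```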

Now the two prefixes received from RR1 are candidates for VRF import. But in order to do this, the router will need to make local copies of the routes with its own RD.

RP/0/RSP0/CPU0:PE1#show bgp vpnv4 unicast rd 172.16.16.1:1003 10.10.11.0
Tue Sep 15 02:13:04.066 UTC
BGP routing table entry for 10.10.11.0/24, Route Distinguisher: 172.16.16.1:1003
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker            4850576     4850576
Last Modified: Sep 11 02:44:38.606 for 3d23h
Paths: (2 available, best #2)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24070
      Origin incomplete, metric 0, localpref 100, valid, internal, imported
      Received Path ID 0, Local Path ID 0, version 0
      Extended community: RT:100:1002 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:6
  Path #2: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24069
      Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, import-candidate, imported
      Received Path ID 0, Local Path ID 1, version 4850576
      Extended community: RT:100:1001 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:1001

But how does PE1 decide which path is best? Although all attributes are equal, PE1 has somehow chosen the second path as best.

Note that although the route is now tied to the local RD 172.16.16.1:1003, the source RDs have not been lost. Each one now appears as "Source Route Distinguisher".
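
The import step can be sketched in Python as follows. This is a simplified model under stated assumptions, not the router's actual logic: the `VpnPath` class, its field names and the `import_into_vrf` function are invented for illustration, while the RDs, RTs and labels are taken from the outputs above. A path is imported when at least one of its RTs matches the VRF's import list. The copy is re-keyed under the local RD and the original RD is kept as the source RD.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VpnPath:
    rd: str
    prefix: str
    route_targets: frozenset
    label: int
    source_rd: Optional[str] = None


def import_into_vrf(paths, import_rts, local_rd):
    """Copy every path carrying a matching RT into the VRF under the local RD."""
    imported = []
    for path in paths:
        if path.route_targets & import_rts:
            # Re-key the copy under the local RD, remembering where it came from.
            imported.append(replace(path, rd=local_rd, source_rd=path.rd))
    return imported


best_paths = [
    VpnPath("172.16.16.2:1001", "10.10.11.0/24", frozenset({"100:1001"}), 24069),
    VpnPath("172.16.16.2:6", "10.10.11.0/24", frozenset({"100:1002"}), 24070),
]

vrf_paths = import_into_vrf(
    best_paths, frozenset({"100:1001", "100:1002"}), "172.16.16.1:1003"
)
for p in vrf_paths:
    print(p.rd, p.prefix, "source RD:", p.source_rd)
```

After the import, both copies share the key 172.16.16.1:1003:10.10.11.0/24, so they now compete with each other in a single best path run.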

Let's now try to flip the RDs on PE2 so that the highest one becomes the lowest.

PE2:

vrf V1001
  rd 172.16.16.2:0
  address-family ipv4 unicast
   redistribute connected
  !
 !
 vrf V1002
  rd 172.16.16.2:6
  address-family ipv4 unicast
   redistribute connected
  !

Let's see if something has changed.

RP/0/RSP0/CPU0:PE1#show bgp vrf V1003 10.10.11.0
Tue Sep 15 02:52:33.016 UTC
BGP routing table entry for 10.10.11.0/24, Route Distinguisher: 172.16.16.1:1003
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker            4850591     4850591
Last Modified: Sep 15 02:50:44.611 for 00:01:48
Paths: (2 available, best #1)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24069
      Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, import-candidate, imported
      Received Path ID 0, Local Path ID 1, version 4850591
      Extended community: RT:100:1001 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:0
  Path #2: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24070
      Origin incomplete, metric 0, localpref 100, valid, internal, imported
      Received Path ID 0, Local Path ID 0, version 0
      Extended community: RT:100:1002 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:6

The route we have just modified is still selected as best, even though its RD is now the lowest one. Let's try again and modify the source RD of the other route, which is not currently marked as best.

PE2:

vrf V1002
  rd 172.16.16.2:1002
  address-family ipv4 unicast
   redistribute connected
  !
RP/0/RSP0/CPU0:PE1#show bgp vrf V1003 10.10.11.0
Tue Sep 15 04:32:53.241 UTC

BGP routing table entry for 10.10.11.0/24, Route Distinguisher: 172.16.16.1:1003
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker            4850642     4850642
Last Modified: Sep 15 04:32:52.623 for 00:00:00
Paths: (2 available, best #2)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24069
      Origin incomplete, metric 0, localpref 100, valid, internal, imported
      Received Path ID 0, Local Path ID 0, version 0
      Extended community: RT:100:1001 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:0
  Path #2: Received by speaker 0
  Not advertised to any peer
  Local, (received & used)
    172.16.16.2 (metric 36) from 172.16.16.11 (172.16.16.2)
      Received Label 24070
      Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, import-candidate, imported
      Received Path ID 0, Local Path ID 1, version 4850642
      Extended community: RT:100:1002 
      Originator: 172.16.16.2, Cluster list: 172.16.16.11
      Source VRF: default, Source Route Distinguisher: 172.16.16.2:1002

Ah, so now the other route has become best. In both tests the route that was re-advertised most recently won, regardless of its RD value, so it looks like we have our answer.

Conclusion

In a scenario where the neighbor address is the same and none of the previous steps in the BGP best path algorithm broke the tie, the best path is chosen based on the age of the route: the one that was learned last is chosen as best. The RD value itself did not influence the selection in these tests.

This doesn't seem to be platform dependent. I did a similar test on a Junos-based platform (MX series) and the result was the same.
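
The observed behavior can be replayed with a short Python sketch. It only models the conclusion drawn from these tests and makes no claim about the actual implementation on either platform. The `learned_seq` counter is a hypothetical stand-in for route age (higher means learned more recently), and the assumption that the V1001 route was the more recent one in the initial state is mine; the outputs above do not show per-path learn times.

```python
def pick_most_recently_learned(paths):
    """Among otherwise tied paths, return the one with the highest learn sequence."""
    return max(paths, key=lambda p: p["learned_seq"])


def best_rt(paths):
    """Return the RT of the winning path, to identify which VRF's route is best."""
    return pick_most_recently_learned(paths)["rt"]


# Initial state (ordering of the two routes is an assumption).
initial = [
    {"rt": "100:1002", "source_rd": "172.16.16.2:6", "learned_seq": 1},
    {"rt": "100:1001", "source_rd": "172.16.16.2:1001", "learned_seq": 2},
]

# First test: V1001 is re-advertised with RD 172.16.16.2:0, so it is learned again.
after_first_change = [
    {"rt": "100:1002", "source_rd": "172.16.16.2:6", "learned_seq": 1},
    {"rt": "100:1001", "source_rd": "172.16.16.2:0", "learned_seq": 3},
]

# Second test: V1002 is re-advertised with RD 172.16.16.2:1002.
after_second_change = [
    {"rt": "100:1001", "source_rd": "172.16.16.2:0", "learned_seq": 3},
    {"rt": "100:1002", "source_rd": "172.16.16.2:1002", "learned_seq": 4},
]

for state in (initial, after_first_change, after_second_change):
    print(best_rt(state))
```

Under these assumptions the sketch picks the V1001 route in the first two states and the V1002 route in the third, matching the three outputs from PE1.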

