Cisco BGP 不等价负载平衡

Sha*_*nu4 9 cisco load-balancing bgp

我正在尝试在我的网络中实现 BGP 不等成本负载平衡功能。根据思科手册(长:http : //www.cisco.com/c/en/us/td/docs/ios/12_2s/feature/guide/fsbgplb.html,短:https : //ccieblog.co.uk /bgp/bgp-unequal-load-cost-sharing)我已经建立了这样的网络拓扑:

网络拓扑

R1 - 我试图为传出流量实现负载平衡的路由器。nat使用带有名称的 VRF 表。

R2-R4 - 运行 quagga 的 NAT 服务器,通过 eBGP与R1共享到R5 的默认路由。

R1配置

R1 IOS版本: 12.2(33)SXJ4 (s72033-adventerprisek9_wan-mz.122-33.SXJ4.bin)

R2 配置R3 R4仅 router-id 和 vlan 不同)

结果我在R1上有 3 个不同的默认路由,它们具有相同的共享数 - 1/1 (1:1:1)。但比例1:2:3 expexted:

R1# sh ip bgp vpnv4 vrf nat 0.0.0.0

Paths: (6 available, best #5, table nat)
Multipath: eiBGP
  Advertised to update-groups:
     2         
  65000
    10.30.227.227 from 10.30.227.227 (10.30.227.227)
      Origin IGP, localpref 100, valid, external, multipath
      Extended Community: RT:192.168.33.4:13
      DMZ-Link Bw 250 kbytes
  65000, (received-only)
    10.30.227.227 from 10.30.227.227 (10.30.227.227)
      Origin IGP, localpref 100, valid, external
      DMZ-Link Bw 250 kbytes
  65000
    10.30.228.228 from 10.30.228.228 (10.30.228.228)
      Origin IGP, localpref 100, valid, external, multipath
      Extended Community: RT:192.168.33.4:13
      DMZ-Link Bw 375 kbytes
  65000, (received-only)
    10.30.228.228 from 10.30.228.228 (10.30.228.228)
      Origin IGP, localpref 100, valid, external
      DMZ-Link Bw 375 kbytes
  65000
    10.30.225.225 from 10.30.225.225 (10.30.225.225)
      Origin IGP, localpref 100, valid, external, multipath, best
      Extended Community: RT:192.168.33.4:13
      DMZ-Link Bw 125 kbytes
  65000, (received-only)
    10.30.225.225 from 10.30.225.225 (10.30.225.225)
      Origin IGP, localpref 100, valid, external
      DMZ-Link Bw 125 kbytes
Run Code Online (Sandbox Code Playgroud)

R1# sh ip cef vrf nat 0.0.0.0/0 internal

0.0.0.0/0, epoch 3, flags rib only nolabel, rib defined all labels, RIB[B], refcount 7, per-destination sharing
  sources: RIB, D/N, DRH
  feature space:
   NetFlow: Origin AS 0, Peer AS 0, Mask Bits 0
   Broker: linked
   IPRM: 0x00018000
  subblocks:
   DefNet source: 0.0.0.0/0
  ifnums:
   Vlan3225(231): 10.30.225.225
   Vlan3227(232): 10.30.227.227
   Vlan3228(233): 10.30.228.228
  path 541B7858, path list 53E3E0D8, share 1/1, type recursive nexthop, for IPv4, flags resolved
  recursive via 10.30.225.225[IPv4:nat], fib 5496C804, 1 terminal fib
    path 541B7BF8, path list 53E3E170, share 1/1, type adjacency prefix, for IPv4
    attached to Vlan3225, adjacency IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
  path 541B78CC, path list 53E3E0D8, share 1/1, type recursive nexthop, for IPv4, flags resolved
  recursive via 10.30.227.227[IPv4:nat], fib 54969B7C, 1 terminal fib
    path 541B7B10, path list 53E3E08C, share 1/1, type adjacency prefix, for IPv4
    attached to Vlan3227, adjacency IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
  path 541B7DC8, path list 53E3E0D8, share 1/1, type recursive nexthop, for IPv4, flags resolved
  recursive via 10.30.228.228[IPv4:nat], fib 54970EAC, 1 terminal fib
    path 541B79B4, path list 53E3E040, share 1/1, type adjacency prefix, for IPv4
    attached to Vlan3228, adjacency IP adj out of Vlan3228, addr 10.30.228.228 513F6560
  output chain:
    loadinfo 51283B80, per-session, 3 choices, flags 0003, 5 locks
    flags: Per-session, for-rx-IPv4
    15 hash buckets
      < 0 > IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
      < 1 > IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
      < 2 > IP adj out of Vlan3228, addr 10.30.228.228 513F6560
      < 3 > IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
      < 4 > IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
      < 5 > IP adj out of Vlan3228, addr 10.30.228.228 513F6560
      < 6 > IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
      < 7 > IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
      < 8 > IP adj out of Vlan3228, addr 10.30.228.228 513F6560
      < 9 > IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
      <10 > IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
      <11 > IP adj out of Vlan3228, addr 10.30.228.228 513F6560
      <12 > IP adj out of Vlan3225, addr 10.30.225.225 513F6B60
      <13 > IP adj out of Vlan3227, addr 10.30.227.227 513F66E0
      <14 > IP adj out of Vlan3228, addr 10.30.228.228 513F6560
    Subblocks:
     None
Run Code Online (Sandbox Code Playgroud)

我究竟做错了什么?根据手册,不同的dmzlink bw值应该导致不同的负载分配比例,但实际上 - 事实并非如此!


更新 1 - 用户 bangal 请求

R1# show ip bgp all summary

For address family: IPv4 Unicast
BGP router identifier X.X.X.129, local AS number 41096
BGP table version is 22283352, main routing table version 22283352
34749 network entries using 4065633 bytes of memory
61661 path entries using 3206372 bytes of memory
8119/5337 BGP path/bestpath attribute entries using 1299040 bytes of memory
3752 BGP AS-PATH entries using 155474 bytes of memory
2990 BGP community entries using 138266 bytes of memory
146 BGP extended community entries using 5168 bytes of memory
53 BGP route-map cache entries using 1696 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 8871649 total bytes of memory
BGP activity 4716897/4682147 prefixes, 11331539/11269872 paths, scan interval 60 secs

# Here are bgp neighbours from global routing table. Not relevant to the question. IP addresses are hidden 

Neighbor     V       AS    MsgRcvd   MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
X.X.X.1      4       XX219    791704  760380 22283352    0    0 6d17h           1
X.X.X.33     4       XX219 112902498 1315655 22283352    0    0 6d17h           0
X.X.X.238    4       XX772    801422  762830 22283352    0    0 2w5d            0
X.X.X.206    4       XX540   2886112 1313917 22283352    0    0 4w4d         9641
X.X.X.70     4       XX772 188343075 1313853 22283352    0    0 6d14h       25881
X.X.X.78     4       XX772 148265282  941127 22283352    0    0 2w6d        26098

# Here are neighbours for vrf nat.

For address family: VPNv4 Unicast
BGP router identifier X.X.X.129, local AS number 41096
BGP table version is 824, main routing table version 824
1 network entries using 137 bytes of memory
6 path entries using 408 bytes of memory
1 multipath network entries and 3 multipath paths
8119/1 BGP path/bestpath attribute entries using 1299040 bytes of memory
3752 BGP AS-PATH entries using 155474 bytes of memory
2990 BGP community entries using 138266 bytes of memory
146 BGP extended community entries using 5168 bytes of memory
53 BGP route-map cache entries using 1696 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1600189 total bytes of memory
3 received paths for inbound soft reconfiguration
BGP activity 4716897/4682147 prefixes, 11331539/11269872 paths, scan interval 15 secs

Neighbor        V          AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.30.225.225   4       65000   11003   11443      824    0    0 3d18h           1
10.30.227.227   4       65000    9853   10293      824    0    0 3d18h           1
10.30.228.228   4       65000   10992   11432      824    0    0 3d18h           1
Run Code Online (Sandbox Code Playgroud)

R1# sh ip route vrf nat

Routing Table: nat
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is 10.30.228.228 to network 0.0.0.0

     10.0.0.0/24 is subnetted, 4 subnets
C       10.30.0.0 is directly connected, Vlan30
C       10.30.228.0 is directly connected, Vlan3228
C       10.30.227.0 is directly connected, Vlan3227
C       10.30.225.0 is directly connected, Vlan3225
B*   0.0.0.0/0 [20/0] via 10.30.228.228, 3d18h
               [20/0] via 10.30.227.227, 3d18h
               [20/0] via 10.30.225.225, 3d18h
Run Code Online (Sandbox Code Playgroud)

R1# sh ip bgp vpnv4 vrf nat neighbors

R1 sh ip bgp 邻居输出

R1# sh run

R1 运行配置 敏感信息被屏蔽

And*_*gin 3

关键问题似乎是bgp dmzlink-bw配置中地址系列下缺少选项。不过,让我在这里总结一下我的评论:

  1. bgp dmzlink-bw在下面address-familyneighbor dmzlink-bw仅启用向邻居通告带宽,同时bgp dmzlink-bw启用比例负载平衡本身。
  2. 运行配置bandwidth 50000缺少“接口 Vlan3228”选项
  3. 如本配置示例中所述,maximum-paths eibgp 3可能需要选项而不是maximum-paths 3
  4. 除了sh ip bgp vpnv4 vrf nat 0.0.0.0Shamanu4 和 bangal 原始指南中提到的其他命令(请参阅问题)之外,使用以下命令检查流量共享计数对于负载平衡的链接是否不同很有用sh ip route vrf nat 0.0.0.0
  5. 检查是否没有其他选项可能干扰负载平衡的配置(例如,bandwidth inherit在端口通道上)

作为一般建议,当您有一个包含很多选项的大型运行配置时,有时很难识别问题。如果问题仍然存在,我将使用空配置创建类似的设置,并尝试配置相关选项(最小工作示例),以查看它是否有效并且不会干扰其他选项、​​访问列表(例如,它在这种特殊情况下极不可能)等等。如果您没有备用硬件,并且您的路由器正在生产中,因此您无法直接在其上进行空配置实验,您可以:

  • 将 Linux PC/VM 与 Quagga 等路由软件一起使用(问题中提到)
  • 使用Cisco的模拟器:Boson NetSim for CCNP支持BGP,但是,我不确定是否支持地址族/VPN/VRF
  • 使用带有 Cisco IOS XRv 的虚拟机。据我记得,它是免费提供的,带宽限制为 2 Mbit/s,这对于测试来说应该足够了。再次,我不确定是否支持地址族/VPN/VRF:Cisco IOS XRv 路由器概述VM 下载链接
  • 使用GNS3(http://www.gns3.com/)模拟器。有它的 Cisco IOS 映像,但是我不知道如何获取它们。
  • 最后,您甚至可以尝试从 eBay 等地方尽可能便宜地购买二手硬件,仅用于测试目的。