Did you test it on the latest FRRouting/frr master branch?
Describe the bug
FRR is set up to peer with Arista switches to exchange IPv4 routes over IPv6 link-local BGP sessions (RFC 5549).
This intermittently stops working, mostly after restarting FRR. When it happens, I notice the following:
BGP neighbor output no longer shows extended-next-hop capability
Neighbor output displays that extended nexthop is received but not advertised:
client1rt# show bgp neighbors fabric0
BGP neighbor on fabric0: fe80::d6af:f7ff:fe91:46db, remote AS 4209900005, local AS 4209901001, external link
Member of peer-group EVPN-FABRIC for session parameters
BGP version 4, remote router ID 10.60.196.15, local router ID 10.60.197.1
BGP state = Established, up for 00:00:15
Last read 00:00:15, Last write 00:00:13
Hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
Extended Message: advertised
AddPath:
IPv4 Unicast: RX advertised and received
Extended nexthop: received <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Address families by peer:
IPv4 Unicast
Long-lived Graceful Restart: advertised
Route refresh: advertised and received(new)
Enhanced Route Refresh: advertised and received
Address Family IPv4 Unicast: advertised and received
Hostname Capability: advertised (name: client1rt,domain name: n/a) not received
Graceful Restart Capability: advertised and received
Remote Restart timer is 300 seconds
Address families by peer:
none
Graceful restart information:
End-of-RIB send: IPv4 Unicast
End-of-RIB received: IPv4 Unicast
Local GR Mode: Helper*
Remote GR Mode: Helper
R bit: False
Timers:
Configured Restart Time(sec): 120
Received Restart Time(sec): 300
IPv4 Unicast:
F bit: False
End-of-RIB sent: Yes
End-of-RIB sent after update: Yes
End-of-RIB received: Yes
Timers:
Configured Stale Path Time(sec): 360
Message statistics:
Inq depth is 0
Outq depth is 0
Sent Rcvd
Opens: 2 2
Notifications: 2 0
Updates: 6 20
Keepalives: 4 7
Route Refresh: 0 0
Capability: 0 0
Total: 14 29
Minimum time between advertisement runs is 0 seconds
For address family: IPv4 Unicast
EVPN-FABRIC peer-group member
Update group 1, subgroup 1
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Inbound path policy configured
Outbound path policy configured
Route map for incoming advertisements is *PERMIT-ANY
Route map for outgoing advertisements is *LOCAL-LOOPBACKS
0 accepted prefixes
Maximum prefixes allowed 10000
Threshold for warning message 75%
Connections established 2; dropped 1
Last reset 00:00:16, No AFI/SAFI activated for peer
Local host: fe80::a236:9fff:fe3e:509a, Local port: 179
Foreign host: fe80::d6af:f7ff:fe91:46db, Foreign port: 46323
Nexthop: 10.60.197.1
Nexthop global: fe80::a236:9fff:fe3e:509a
Nexthop local: fe80::a236:9fff:fe3e:509a
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 120
Peer Authentication Enabled
Read thread: on Write thread: on FD used: 27
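The difference between the broken and the working session is easy to miss in output this long. Below is a minimal sketch of a check for it, assuming the plain-text layout of `show bgp neighbors` shown in this report (the `Extended nexthop:` label may be worded differently in other FRR versions). The function name `extended_nexthop_state` is hypothetical and not part of FRR.

```python
import re


def extended_nexthop_state(neighbor_output: str) -> dict:
    """Report whether the 'Extended nexthop' capability is advertised and/or received.

    Assumes the text layout shown in this report; this is not an FRR API.
    """
    for line in neighbor_output.splitlines():
        match = re.match(r"\s*Extended nexthop:\s*(.*)", line)
        if match:
            # Keep only lowercase words, so trailing markers such as "<<<<" are ignored.
            words = set(re.findall(r"[a-z]+", match.group(1).lower()))
            return {
                "advertised": "advertised" in words,
                "received": "received" in words,
            }
    # No capability line at all: treat it as neither advertised nor received.
    return {"advertised": False, "received": False}
```

For the output above this returns `{"advertised": False, "received": True}`, which is the failing state described in this report.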
Also, in the BGP router config, the neighbor suddenly has
no neighbor fabric0 capability extended-nexthop
even though the capability is active through the peer-group and I did not configure this line. It simply appears for all neighbors in the peer group.
Even when I reactivate the capability with
neighbor fabric0 capability extended-nexthop
the problem persists after BGP is reset. Some combination of restarting FRR and changing the configuration then fixes it again, but I can't make out a pattern.
Output on the Arista side shows an error when BGP is established and also has routes with the wrong next-hop:
Apr 27 16:26:26 leaf1 Bgp: %BGP-3-DROP_TXUPDATE: Dropped updates for peer fe80::a236:9fff:fe3e:509a%Et2 (VRF default AS 4209901001) because a local Nexthop was not configured for AFI/SAFI IPv4/Unicast (message repeated 2 times in 78.1729 secs)
#show bgp neighbors fe80::a236:9fff:fe3e:509a%Et2 ipv4 unicast received-routes
BGP routing table information for VRF default
Router identifier 10.60.196.15, local AS number 4209900005
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric AIGP LocPref Weight Path
10.60.197.1/32 10.60.197.1 0 - - - 4209901001 ?
10.61.197.1/32 10.60.197.1 0 - - - 4209901001 ?
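The symptom in the table above is that IPv4 prefixes arrive with an IPv4 next hop instead of the peer's IPv6 link-local address. The sketch below flags such rows using only Python's standard `ipaddress` module. It assumes the whitespace-separated `received-routes` layout shown in this report, and the function names are hypothetical.

```python
import ipaddress


def next_hop_kind(next_hop: str) -> str:
    """Classify a next hop, ignoring an interface zone suffix such as '%Et2'."""
    address = ipaddress.ip_address(next_hop.split("%", 1)[0])
    if address.version == 4:
        return "ipv4"
    return "ipv6-link-local" if address.is_link_local else "ipv6"


def ipv4_routes_with_ipv4_next_hop(table_text: str) -> list:
    """Return (prefix, next_hop) pairs where an IPv4 prefix has an IPv4 next hop."""
    suspects = []
    for line in table_text.splitlines():
        tokens = line.split()
        for index, token in enumerate(tokens[:-1]):
            if "/" not in token:
                continue
            try:
                prefix = ipaddress.ip_network(token, strict=False)
                kind = next_hop_kind(tokens[index + 1])
            except ValueError:
                break  # Not a route row (for example a header line).
            if prefix.version == 4 and kind == "ipv4":
                suspects.append((token, tokens[index + 1]))
            break  # Only the first prefix-like token on a line is considered.
    return suspects
```

Applied to the two route rows above, this returns both prefixes with next hop `10.60.197.1`. Rows whose next hop is a link-local address such as `fe80::a236:9fff:fe3e:509a%Et2` are not flagged.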
To Reproduce
I'm not sure how to reproduce it. The problem occurs fairly often, mostly after restarting FRR.
Expected behavior
When it works, the next hop is IPv6 on the Arista side, as expected:
#show bgp neighbors fe80::a236:9fff:fe3e:509a%Et2 ipv4 unicast received-routes
BGP routing table information for VRF default
Router identifier 10.60.196.15, local AS number 4209900005
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric AIGP LocPref Weight Path
* > 10.60.197.1/32 fe80::a236:9fff:fe3e:509a%Et2 0 - - - 4209901001 ?
* > 10.61.197.1/32 fe80::a236:9fff:fe3e:509a%Et2 0 - - - 4209901001 ?
In FRR, the extended next-hop capability is also advertised:
client1rt# show bgp neighbors fabric0
BGP neighbor on fabric0: fe80::d6af:f7ff:fe91:46db, remote AS 4209900005, local AS 4209901001, external link
Member of peer-group EVPN-FABRIC for session parameters
BGP version 4, remote router ID 10.60.196.15, local router ID 10.60.197.1
BGP state = Established, up for 00:14:44
Last read 00:00:05, Last write 00:00:44
Hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
Extended Message: advertised
AddPath:
IPv4 Unicast: RX advertised and received
Extended nexthop: advertised and received <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Address families by peer:
IPv4 Unicast
Long-lived Graceful Restart: advertised
Route refresh: advertised and received(new)
Enhanced Route Refresh: advertised and received
Address Family IPv4 Unicast: advertised and received
Hostname Capability: advertised (name: client1rt,domain name: n/a) not received
Graceful Restart Capability: advertised and received
Remote Restart timer is 300 seconds
Address families by peer:
none
Graceful restart information:
End-of-RIB send: IPv4 Unicast
End-of-RIB received: IPv4 Unicast
Local GR Mode: Helper*
Remote GR Mode: Helper
R bit: False
Timers:
Configured Restart Time(sec): 120
Received Restart Time(sec): 300
IPv4 Unicast:
F bit: False
End-of-RIB sent: Yes
End-of-RIB sent after update: Yes
End-of-RIB received: Yes
Timers:
Configured Stale Path Time(sec): 360
Message statistics:
Inq depth is 0
Outq depth is 0
Sent Rcvd
Opens: 1 1
Notifications: 0 0
Updates: 3 12
Keepalives: 15 19
Route Refresh: 0 0
Capability: 0 0
Total: 19 32
Minimum time between advertisement runs is 0 seconds
For address family: IPv4 Unicast
EVPN-FABRIC peer-group member
Update group 1, subgroup 1
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Inbound path policy configured
Outbound path policy configured
Route map for incoming advertisements is *PERMIT-ANY
Route map for outgoing advertisements is *LOCAL-LOOPBACKS
17 accepted prefixes
Maximum prefixes allowed 10000
Threshold for warning message 75%
Connections established 1; dropped 0
Last reset 00:15:53, Waiting for peer OPEN
Local host: fe80::a236:9fff:fe3e:509a, Local port: 41550
Foreign host: fe80::d6af:f7ff:fe91:46db, Foreign port: 179
Nexthop: 10.60.197.1
Nexthop global: fe80::a236:9fff:fe3e:509a
Nexthop local: fe80::a236:9fff:fe3e:509a
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 120
Estimated round trip time: 1 ms
Peer Authentication Enabled
Read thread: on Write thread: on FD used: 29
yxieca pushed a commit to sonic-net/sonic-buildimage that referenced this issue Oct 25, 2022
…12453)
Fixing issue FRRouting/frr#11108
For interface-based peers with peer-groups, "no neighbor capability extended-nexthop" gets added by default. As a result, IPv4 routes do not get IPv6 next hops.
- How I did it
Ported the commit FRRouting/frr@8e89adc, which fixes the issue, to FRR 8.2.2
- How to verify it
Load FRR and verify that "no neighbor capability extended-nexthop" is not added for interfaces associated with peer-groups
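The verification step above can be sketched as a scan of the running configuration for the unwanted line. This is a minimal sketch under stated assumptions: the configuration text is supplied by the caller (for example, captured from `vtysh -c 'show running-config'`), the sample configuration below is illustrative rather than taken from the report, and the function name is hypothetical.

```python
import re

# Matches the line quoted in this issue, e.g.
# "no neighbor fabric0 capability extended-nexthop".
_DISABLED = re.compile(r"^\s*no neighbor (\S+) capability extended-nexthop\s*$")


def peers_with_disabled_extended_nexthop(running_config: str) -> list:
    """Return the peers for which extended-nexthop is explicitly negated."""
    peers = []
    for line in running_config.splitlines():
        match = _DISABLED.match(line)
        if match:
            peers.append(match.group(1))
    return peers


# Illustrative configuration only; the real one comes from the router.
sample_config = """
router bgp 4209901001
 neighbor EVPN-FABRIC peer-group
 neighbor EVPN-FABRIC capability extended-nexthop
 neighbor fabric0 interface peer-group EVPN-FABRIC
 no neighbor fabric0 capability extended-nexthop
"""

print(peers_with_disabled_extended_nexthop(sample_config))  # prints ['fabric0']
```

An empty list after loading FRR would suggest that the negated line was not added for any peer-group member.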