feat(frrcfgd): update for new locator config way #21843

Open: wants to merge 1 commit into base: master

92 changes: 37 additions & 55 deletions src/sonic-frr-mgmt-framework/frrcfgd/frrcfgd.py
@@ -75,7 +75,7 @@ def extract_cmd_daemons(cmd_str):
 class BgpdClientMgr(threading.Thread):
     VTYSH_MARK = 'vtysh '
     PROXY_SERVER_ADDR = '/etc/frr/bgpd_client_sock'
-    ALL_DAEMONS = ['bgpd', 'zebra', 'staticd', 'bfdd', 'ospfd', 'pimd']
+    ALL_DAEMONS = ['bgpd', 'zebra', 'staticd', 'bfdd', 'ospfd', 'pimd', 'mgmtd']
     TABLE_DAEMON = {
         'DEVICE_METADATA': ['bgpd'],
         'BGP_GLOBALS': ['bgpd'],
@@ -118,7 +118,9 @@ class BgpdClientMgr(threading.Thread):
         'PIM_INTERFACE': ['pimd'],
         'IGMP_INTERFACE': ['pimd'],
         'IGMP_INTERFACE_QUERY': ['pimd'],
-        'SRV6_LOCATOR': ['zebra']
+        'SRV6_MY_LOCATORS': ['zebra'],
+        'SRV6_MY_SIDS': ['mgmtd']
Contributor

Why is the daemon corresponding to the MY_SIDS table mgmtd? I thought the static-sids command was handled by staticd.

Contributor Author

@LARLSN LARLSN Feb 27, 2025

In the FRR code, mgmtd calls the relevant vtysh CLI function defined by staticd, so staticd still handles the command.

[Two screenshots were attached here; the images are not preserved.]

Since we use unified config mode for FRR (https://github.com/sonic-net/sonic-mgmt/blob/master/ansible/vars/configdb_jsons/7nodes_cisco/PE1.json#L130), mgmtd becomes the entry point for FRR configuration.

@venkatmahalingam Hi, could you please help confirm this understanding?

Collaborator

We send configs to the corresponding FRR daemons over the client socket. Why would we send all FRR configuration to mgmtd? Are there new changes that I'm not aware of?

Collaborator

@venkatmahalingam We are using "docker_routing_config_mode": "unified". My understanding is that, in unified mode, the configuration goes through mgmtd and is then distributed to the different protocol processes. Is that correct?

I saw this in the mgmtd documentation, https://docs.frrouting.org/en/latest/mgmtd.html:

"The FRR Management Daemon (from now on referred to as MGMTd) is a new centralized entity representing the FRR Management Plane which can take management requests from any kind of UI/Frontend entity (e.g. CLI, Netconf, Restconf, Grpc etc.) over a new unified and common Frontend interface"

Collaborator

No, we send the configs to the FRR daemons via the client socket. Executing "subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)" for each command causes delays with scaled FRR configs.

if not bgpd_client.run_vtysh_command(table, command, daemons) and not ignore_fail:

As far as possible, we should use the FRR daemons' client sockets directly. If FRR restricts this, or if mgmtd provides better configuration handling, we could migrate to mgmtd for all commands, and there would then be no need to map configs to FRR daemons.
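
The `run_vtysh_command(table, command, daemons)` call quoted above takes a list of target daemons. As a rough illustration only, the lookup from a CONFIG_DB table name to that list could resemble the sketch below. The dictionary is a trimmed copy of the `TABLE_DAEMON` entries visible in this diff; the helper name `daemons_for_table` and the fallback to every daemon for unmapped tables are assumptions of this sketch, not confirmed frrcfgd behavior.

```python
# Trimmed copy of the mapping shown in the diff; not the full table.
ALL_DAEMONS = ['bgpd', 'zebra', 'staticd', 'bfdd', 'ospfd', 'pimd', 'mgmtd']
TABLE_DAEMON = {
    'BGP_GLOBALS': ['bgpd'],
    'PIM_INTERFACE': ['pimd'],
    'SRV6_MY_LOCATORS': ['zebra'],
    'SRV6_MY_SIDS': ['mgmtd'],
}


def daemons_for_table(table, table_daemon=TABLE_DAEMON, default=ALL_DAEMONS):
    """Return the daemons that should receive config for `table`.

    Hypothetical helper: falling back to all daemons for an unmapped
    table is an assumption made for this sketch.
    """
    # Copy the list so callers cannot mutate the shared mapping.
    return list(table_daemon.get(table, default))


if __name__ == '__main__':
    print(daemons_for_table('SRV6_MY_SIDS'))
    print(daemons_for_table('SOME_UNMAPPED_TABLE'))
```

Under this reading, the question in the thread reduces to which daemon name is stored for `SRV6_MY_SIDS`: `staticd` or `mgmtd`.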

We can gradually migrate the configs associated with the staticd daemon to mgmtd, starting with SRv6 now, then static routes, and so on. @venkatmahalingam @eddieruan-alibaba what do you think?

Collaborator

@venkatmahalingam venkatmahalingam Feb 28, 2025

I've been hearing about YANG validation issues, caused by mgmtd, that led to config failures with bgpcfgd. There is no urgency to use mgmtd for frrcfgd: we can wait for mgmtd to stabilize and assess the value of adding an extra hop (mgmtd) to configure FRR daemons. I believe the direct client-socket method of configuring FRR daemons will continue to be available. @LARLSN If SRV6_MY_SIDS works with staticd directly, please change the code to reflect that.

Contributor

@ahsalam Which configs currently use mgmtd? Could you please provide some details?

Collaborator

Unless there is a strong reason for us to move to mgmtd (earlier it was vtysh that handled all config), we can continue to use the individual client socket methods. My question was simple: should we migrate all configs associated with the staticd daemon to mgmtd now? I'm wondering whether anyone has checked static route config in SONiC after the new FRR changes (with mgmtd). @eddieruan-alibaba, has anyone at Alibaba tried? @LARLSN, did we try staticd for the SRV6_MY_SIDS table configs?

We started with staticd, but it doesn't work because FRR has moved to using mgmtd for staticd.

Contributor Author

@LARLSN LARLSN Mar 1, 2025

@venkatmahalingam As Eddie said, we first used staticd, which cannot handle this command. This is a screenshot of the frrcfgd log file when staticd was used:

[Screenshot of the frrcfgd log; the image is not preserved.]

     }
     VTYSH_CMD_DAEMON = [(r'show (ip|ipv6) route($|\s+\S+)', ['zebra']),
                         (r'show ip mroute($|\s+\S+)', ['pimd']),
@@ -402,46 +404,6 @@ def get_command_cmn(daemon, cmd_str, op, st_idx, vals, bool_values):
         cmd_args.append(CommandArgument(daemon, cmd_enable, vals[idx]))
     return [cmd_str.format(*cmd_args, no = CommandArgument(daemon, cmd_enable))]

-def hdl_srv6_locator(daemon, cmd_str, op, st_idx, vals, bool_values):
-    chk_val = None
-    if op == CachedDataWithOp.OP_DELETE:
-        if bool_values is not None and len(bool_values) >= 3:
-            # set to default if given
-            cmd_enable = bool_values[2]
-        else:
-            cmd_enable = False
-    else:
-        cmd_enable = True
-        if bool_values is not None:
-            if len(vals) <= st_idx:
-                return None
-            chk_val = vals[st_idx]
-            if type(chk_val) is dict:
-                cmd_enable = False
-                for _, v in chk_val.items():
-                    if not v[1]:
-                        continue
-                    if v[0] == bool_values[0]:
-                        cmd_enable = True
-                        break
-            else:
-                if chk_val == bool_values[0]:
-                    cmd_enable = True
-                elif chk_val == bool_values[1]:
-                    cmd_enable = False
-                else:
-                    return None
-        else:
-            cmd_enable = True
-    cmd_list = []
-    for num in range(len(vals[0])):
-        cmd_args = []
-        for idx in range(len(vals)):
-            if bool_values is not None and idx == st_idx:
-                continue
-            cmd_args.append(CommandArgument(daemon, cmd_enable, vals[idx][num]))
-        cmd_list.append(cmd_str.format(*cmd_args, no = CommandArgument(daemon, cmd_enable)))
-    return cmd_list

 def hdl_set_extcomm(daemon, cmd_str, op, st_idx, args, is_inline):
     if is_inline:
@@ -2079,7 +2041,6 @@ class BGPConfigDaemon:
         ('icmo_ttl', 'ttl {}', handle_ip_sla_common),
         ('icmp_tos', 'tos {}', handle_ip_sla_common),
     ]
-    srv6_locator_key_map = [(['opcode_prefix', 'opcode_act', 'opcode_data'], '{no:no-prefix}opcode {} {} {}', hdl_srv6_locator)]


     tbl_to_key_map = {'BGP_GLOBALS': global_key_map,
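
The removed `srv6_locator_key_map` entry uses a template of the form `'{no:no-prefix}opcode {} {} {}'`. In Python, text after the colon in a replacement field is passed to the argument's `__format__` method, so an object can decide whether to emit a leading `no `. The sketch below illustrates that mechanism with a hypothetical `NoPrefix` class; it is a stand-in for illustration and does not reproduce frrcfgd's actual `CommandArgument`, and the opcode values in the usage lines are made up.

```python
class NoPrefix:
    """Hypothetical stand-in used only to illustrate the format-spec trick."""

    def __init__(self, enable):
        self.enable = enable

    def __format__(self, spec):
        # str.format() hands us whatever follows the colon in the field.
        if spec == 'no-prefix':
            return '' if self.enable else 'no '
        return format(str(self.enable), spec)


def render(template, enable, *args):
    """Fill positional fields and the named `no` field of a command template."""
    return template.format(*args, no=NoPrefix(enable))


if __name__ == '__main__':
    tmpl = '{no:no-prefix}opcode {} {} {}'
    print(render(tmpl, True, '::1', 'end-dt46', 'vrf Vrf1'))
    print(render(tmpl, False, '::1', 'end-dt46', 'vrf Vrf1'))
```

With `enable=False` the same template yields the negated form of the command, which is how a single key-map entry can serve both set and delete operations.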
@@ -2110,7 +2071,6 @@ class BGPConfigDaemon:
                       'PIM_INTERFACE': pim_interface_key_map,
                       'IGMP_INTERFACE': igmp_mcast_grp_key_map,
                       'IGMP_INTERFACE_QUERY': igmp_interface_config_key_map,
-                      'SRV6_LOCATOR': srv6_locator_key_map,
     }

     vrf_tables = {'BGP_GLOBALS', 'BGP_GLOBALS_AF',
@@ -2308,7 +2268,8 @@ def __init__(self):
             ('PIM_INTERFACE', self.bgp_table_handler_common),
             ('IGMP_INTERFACE', self.bgp_table_handler_common),
             ('IGMP_INTERFACE_QUERY', self.bgp_table_handler_common),
-            ('SRV6_LOCATOR', self.bgp_table_handler_common),
+            ('SRV6_MY_LOCATORS', self.bgp_table_handler_common),
+            ('SRV6_MY_SIDS', self.bgp_table_handler_common),
         ]
         self.bgp_message = queue.Queue(0)
         self.table_data_cache = self.config_db.get_table_data([tbl for tbl, _ in self.table_handler_list])
@@ -2696,17 +2657,38 @@ def __update_bgp(self, data_list):
                         self.bgp_confed_peers[vrf] = copy.copy(self.upd_confed_peers)
                 else:
                     self.__delete_vrf_asn(vrf, table, data)
-            elif table == 'SRV6_LOCATOR':
-                key = prefix
-                prefix = data['prefix']
-                cmd_prefix = ['configure terminal', 'segment-routing', 'srv6', 'locators',
-                              'locator {}'.format(key),
-                              'prefix {} block-len {} node-len {} func-bits {}'.format(prefix.data, data['block_len'].data, data['node_len'].data, data['func_len'].data)]
-
-                if not key_map.run_command(self, table, data, cmd_prefix):
-                    syslog.syslog(syslog.LOG_ERR, 'failed running SRV6 LOCATOR config command')
+            elif table == 'SRV6_MY_LOCATORS':
+                if key is None:
+                    syslog.syslog(syslog.LOG_ERR, 'invalid key for SRV6_MY_LOCATORS table')
+                    continue
+                if not del_table:
+                    key = prefix
+                    prefix = data['prefix']
+                    cmd = "vtysh -c 'configure terminal' -c 'segment-routing' -c 'srv6' -c 'locators' "
+                    cmd += " -c 'locator {}' ".format(key)
+                    cmd += " -c 'prefix {} block-len {} node-len {} func-bits {}' ".format(prefix.data, data['block_len'].data, data['node_len'].data, data['func_len'].data)
+                    if not self.__run_command(table, cmd):
+                        syslog.syslog(syslog.LOG_ERR, 'failed running SRV6_MY_LOCATORS config command')
+                        continue
+            elif table == 'SRV6_MY_SIDS':
+                if key is None:
+                    syslog.syslog(syslog.LOG_ERR, 'invalid key for SRV6_MY_SIDS table')
+                    continue
+
+                if not del_table:
+                    cmd = "vtysh -c 'configure terminal' -c 'segment-routing' -c 'srv6' "
+                    cmd += "-c 'static-sids' "
+                    uDTAction = ["uDT46", "uDT4", "uDT6"]
+                    if data['action'].data in uDTAction:
+                        cmd += "-c 'sid {} locator {} behavior {} vrf {}' ".format(key, prefix, data['action'].data, data['decap_vrf'].data)
+                    elif data['action'].data == 'uN':
+                        cmd += "-c 'sid {} locator {} behavior {} ' ".format(key, prefix, data['action'].data)
+                    else:
+                        syslog.syslog(syslog.LOG_ERR, 'failed running SRV6_MY_SIDS config command, unsupported action {}'.format(data['action'].data))
+                        continue
+                    if not self.__run_command(table, cmd):
+                        syslog.syslog(syslog.LOG_ERR, 'failed running SRV6_MY_SIDS config command')
+                        continue

             elif table == 'BGP_GLOBALS_AF':
                 af, ip_type = key.lower().split('_')
                 #this is to temporarily make table cache key accessible to key_map handler function
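
The `SRV6_MY_SIDS` branch above assembles a `vtysh` command line by string concatenation. As a hedged sketch only, the same construction can be factored into a pure function that is easy to unit-test and that quotes each `-c` argument with `shlex.quote`. The function name `build_my_sid_cmd`, its parameter names, the sample values, and the decision to return `None` for unsupported input are all assumptions of this sketch rather than part of the PR.

```python
import shlex

# Behaviors that require a decap VRF, taken from the list in the diff.
UDT_ACTIONS = ('uDT46', 'uDT4', 'uDT6')


def build_my_sid_cmd(sid, locator, action, decap_vrf=None):
    """Return a vtysh command line for one static SID, or None if unsupported.

    Hypothetical helper: `sid` and `locator` correspond to the first two
    fields formatted into the 'sid ... locator ...' command in the diff.
    """
    if action in UDT_ACTIONS:
        if not decap_vrf:
            return None  # uDT* behaviors need a VRF to decapsulate into
        sid_cmd = 'sid {} locator {} behavior {} vrf {}'.format(
            sid, locator, action, decap_vrf)
    elif action == 'uN':
        sid_cmd = 'sid {} locator {} behavior {}'.format(sid, locator, action)
    else:
        return None  # unsupported action; the caller decides how to log it
    steps = ['configure terminal', 'segment-routing', 'srv6',
             'static-sids', sid_cmd]
    # Quote every step so spaces or shell metacharacters stay in one argument.
    return 'vtysh ' + ' '.join('-c ' + shlex.quote(s) for s in steps)


if __name__ == '__main__':
    print(build_my_sid_cmd('fcbb:bbbb:1::/48', 'loc1', 'uN'))
    print(build_my_sid_cmd('fcbb:bbbb:1:fe00::/64', 'loc1', 'uDT46', 'Vrf1'))
```

Keeping command construction separate from execution also makes it straightforward to log the offending action in one place when the function returns `None`.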