'exit' in route-map section is causing frr reload and restart to fail in scale setups #15706
Comments
This also happened on my machine, with a config of 2000+ lines and a lot of route-maps. I worked around it temporarily (no permanent solution) by increasing the watchfrr timeout and ignoring bgpd being stuck at 100% CPU. After switching to a newer Ryzen CPU with vpp-dpdk (dedicated core for the control plane), the config is now almost 4000 lines with no crashes or start-reload loops.
I think this should be fixed in 10.0 by this commit: 2574f03. 10.0 should be out in a couple of days. If it helps, we can probably backport the fix to previous versions.
10.0 fixes the startup issue but still leaves the reload issue in play.
@idryzhov I ported this commit (2574f03) and I do not see any difference in performance with or without it. With 600+ route-maps, I did an frr restart and saw bgpd take 1 min 9 seconds to read the configuration. If I remove the exits from the route-maps, frr restart takes only 1 sec.
Can you please test 2574f03 with 600+ route-maps and try an frr restart?
If a command is not marked as `YANG`-converted, the current command batching buffer is flushed before executing the command. We shouldn't flush the buffer when executing an `exit` command. It should only be flushed if the next command is not `YANG`-converted, which is checked by the command itself, not the previous `exit`. Fixes FRRouting#15706. Signed-off-by: Igor Ryzhov <[email protected]>
I pushed a proper fix - #15770. Please check if it works for you.
If a command is not marked as `YANG`-converted, the current command batching buffer is flushed before executing the command. We shouldn't flush the buffer when executing an `exit` command. It should only be flushed if the next command is not `YANG`-converted, which is checked by the command itself, not the previous `exit`. Fixes #15706. Signed-off-by: Igor Ryzhov <[email protected]> (cherry picked from commit 57811a5)
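The flush behaviour described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not FRR code: `BatchingExecutor`, `count_flushes`, and the `skip_flush_on_exit` flag are hypothetical names, and the model assumes `exit` is treated as a command that is not `YANG`-converted. It counts how many times the batching buffer would be flushed for a config made only of route-map sections.

```python
from dataclasses import dataclass, field


@dataclass
class BatchingExecutor:
    """Toy model of a config reader that batches YANG-converted commands."""

    skip_flush_on_exit: bool  # True models the behaviour the fix describes
    pending: list = field(default_factory=list)
    flushes: int = 0

    def flush(self):
        # Committing a batch is the expensive step; count non-empty commits only.
        if self.pending:
            self.flushes += 1
            self.pending.clear()

    def execute(self, command, yang_converted):
        is_exit = command.strip() == "exit"
        if is_exit and self.skip_flush_on_exit:
            return  # leave the buffer alone; the next command decides
        if yang_converted:
            self.pending.append(command)
        else:
            self.flush()  # non-converted command: flush before executing it


def count_flushes(n_route_maps, skip_flush_on_exit):
    """Feed n route-map sections, each closed by `exit`, and count flushes."""
    executor = BatchingExecutor(skip_flush_on_exit)
    for i in range(1, n_route_maps + 1):
        # Assumption: the route-map line is YANG-converted, `exit` is not.
        executor.execute(f"route-map test_{i} permit 10", yang_converted=True)
        executor.execute("exit", yang_converted=False)
    executor.flush()  # end of file
    return executor.flushes
```

In this model, flushing on every `exit` gives one flush per route-map, while skipping the flush on `exit` lets the whole run of route-maps land in a single batch. That is at least consistent with the large gap between restart times with and without the exits reported above.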
Description
Problem:
Every route-map section in the config contains an exit. In a scale setup with many route-maps, both frr reload and frr restart take minutes to process all of them; the bgpd process sits at 100% CPU in vtysh_read, and the operation fails.
RCA:
Issue # 1:
Initially, frr.conf begins with multiple route-maps without any exit clause. When frr-reload is invoked, it appends an exit to every route-map section (code changes are from this link). This prevents northbound commands from being batched and hurts libyang performance.
Issue # 2:
On a `show run` command, the routemap_cli.c code would add an exit() to every route-map… cli section. Code changes are from this link.
Version
How to reproduce
Configure 500+ route-map from test_1 to test_500 like below
switch#
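The original configuration listing is not shown here, so the following is only a sketch of how a comparable test config could be generated. The `permit 10` sequence, the `!` separators, and the function name `make_route_map_config` are assumptions for illustration, not the reporter's actual setup. The `with_exit` flag produces the two variants compared in this issue.

```python
def make_route_map_config(count, with_exit):
    """Build config text with route-maps test_1 .. test_<count>.

    Assumption: each route-map is a bare `permit 10` entry; the real
    reproduction config may have contained match/set clauses.
    """
    lines = []
    for i in range(1, count + 1):
        lines.append(f"route-map test_{i} permit 10")
        if with_exit:
            lines.append("exit")
        lines.append("!")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the variant with an `exit` after every route-map section.
    print(make_route_map_config(500, with_exit=True), end="")
```

Generating both variants and loading each one makes it possible to compare restart times with and without the `exit` lines.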
Expected behavior
When frr reload or restart is run on a config with many route-maps, it should converge quickly without hogging the CPU.
Actual behavior
Both frr reload and restart take minutes to converge and sometimes fail, because the bgpd process hogs the CPU 100% of the time processing the exit for each route-map in a scale setup.
Additional context
No response