[Multi NPU] Time Improvements to the config reload/load_minigraph commands #917

Merged on Jul 9, 2020 (6 commits)

judyjoseph (Contributor) commented May 18, 2020

- What I did
The config reload and config load_minigraph commands were taking ~5 min on multi-ASIC platforms. Reduced the total time taken to ~1 min by parallelizing with Python threads.

- How I did it
Started a Python thread per NPU to stop/restart the services.

Currently the threads are started at the same time for all the ASICs, irrespective of whether they are front-end or back-end ASICs. I found this acceptable because the entire config reload finishes in about 1 min, and a device undergoing config reload is usually not in production and not carrying active traffic.
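
The thread-per-NPU idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the `runner` callable, and the choice to join all threads before moving to the next service are assumptions made for the example.

```python
import threading


def run_on_all_asics(action, service, num_asics, runner):
    """Apply `action` to `service` on every ASIC instance concurrently.

    `runner(action, unit)` is a hypothetical callable standing in for
    whatever actually stops or restarts a unit; it is not a real
    sonic-utilities function.
    """
    threads = []
    for asic_id in range(num_asics):
        unit = "{}@{}".format(service, asic_id)
        t = threading.Thread(target=runner, args=(action, unit))
        t.start()
        threads.append(t)
    # Wait for every ASIC before the caller moves on to the next service.
    for t in threads:
        t.join()


def run_services(action, services, num_asics, runner):
    """Walk the services in their listed order, fanning out per ASIC."""
    for service in services:
        run_on_all_asics(action, service, num_asics, runner)


def record_calls(services, num_asics, action="restart"):
    """Demo helper: collect the units a fake runner would have touched."""
    calls = []
    lock = threading.Lock()

    def fake_runner(act, unit):
        # The lock keeps the shared list consistent across threads.
        with lock:
            calls.append((act, unit))

    run_services(action, services, num_asics, fake_runner)
    return calls
```

In this sketch the service order is preserved across the whole device, while instances of the same service may finish in any order. That is consistent with swss@1 appearing before swss@0 in the sample output, though the actual implementation may differ.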

- How to verify it
Verified on single-ASIC and multi-ASIC devices. Attaching sample output.

MULTI ASIC

~$ time sudo config reload -y
Executing stop of service swss@1...
Executing stop of service swss@0...
Executing stop of service swss@2...
Executing stop of service swss@3...
Executing stop of service swss@4...
Executing stop of service swss@5...
Executing stop of service lldp...
Executing stop of service lldp@1...
Executing stop of service lldp@0...
Executing stop of service lldp@2...
Executing stop of service lldp@3...
Executing stop of service lldp@4...
Executing stop of service lldp@5...
Executing stop of service pmon...
Executing stop of service bgp@0...
Executing stop of service bgp@1...
Executing stop of service bgp@2...
Executing stop of service bgp@3...
Executing stop of service bgp@4...
Executing stop of service bgp@5...
Executing stop of service hostcfgd...
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db.json --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db0.json -n asic0 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic0
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db1.json -n asic1 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic1
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db2.json -n asic2 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic2
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db3.json -n asic3 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic3
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db4.json -n asic4 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic4
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/init_cfg.json -j /etc/sonic/config_db5.json -n asic5 --write-to-db
Running command: /usr/bin/db_migrator.py -o migrate -n asic5
Executing reset-failed of service bgp@0...
Executing reset-failed of service bgp@1...
Executing reset-failed of service bgp@2...
Executing reset-failed of service bgp@3...
Executing reset-failed of service bgp@4...
Executing reset-failed of service bgp@5...
Executing reset-failed of service dhcp_relay...
Executing reset-failed of service hostcfgd...
Executing reset-failed of service hostname-config...
Executing reset-failed of service interfaces-config...
Executing reset-failed of service lldp...
Executing reset-failed of service lldp@0...
Executing reset-failed of service lldp@1...
Executing reset-failed of service lldp@2...
Executing reset-failed of service lldp@3...
Executing reset-failed of service lldp@4...
Executing reset-failed of service lldp@5...
Executing reset-failed of service ntp-config...
Executing reset-failed of service pmon...
Executing reset-failed of service radv...
Executing reset-failed of service rsyslog-config...
Executing reset-failed of service snmp...
Executing reset-failed of service swss@1...
Executing reset-failed of service swss@0...
Executing reset-failed of service swss@2...
Executing reset-failed of service swss@3...
Executing reset-failed of service swss@4...
Executing reset-failed of service swss@5...
Executing reset-failed of service syncd@0...
Executing reset-failed of service syncd@1...
Executing reset-failed of service syncd@2...
Executing reset-failed of service syncd@3...
Executing reset-failed of service syncd@4...
Executing reset-failed of service syncd@5...
Executing reset-failed of service teamd@0...
Executing reset-failed of service teamd@1...
Executing reset-failed of service teamd@2...
Executing reset-failed of service teamd@3...
Executing reset-failed of service teamd@4...
Executing reset-failed of service teamd@5...
Executing restart of service hostname-config...
Executing restart of service interfaces-config...
Executing restart of service ntp-config...
Executing restart of service rsyslog-config...
Executing restart of service swss@0...
Executing restart of service swss@1...
Executing restart of service swss@2...
Executing restart of service swss@3...
Executing restart of service swss@4...
Executing restart of service swss@5...
Executing restart of service bgp@0...
Executing restart of service bgp@1...
Executing restart of service bgp@2...
Executing restart of service bgp@3...
Executing restart of service bgp@4...
Executing restart of service bgp@5...
Executing restart of service pmon...
Executing restart of service lldp...
Executing restart of service lldp@0...
Executing restart of service lldp@1...
Executing restart of service lldp@2...
Executing restart of service lldp@3...
Executing restart of service lldp@4...
Executing restart of service lldp@5...
Executing restart of service hostcfgd...

real 1m55.971s
user 0m3.652s
sys 0m1.232s

- Previous command output (if the output of a command-line utility has changed)

- New command output (if the output of a command-line utility has changed)

@judyjoseph judyjoseph changed the title [Multi NPU] Improvements to the config reload/load_minigraph commands [Multi NPU] Time Improvements to the config reload/load_minigraph commands May 18, 2020
abdosi previously approved these changes May 18, 2020
@judyjoseph judyjoseph marked this pull request as ready for review May 27, 2020 01:23
@judyjoseph judyjoseph requested a review from jleveque May 27, 2020 01:23
…list, we were changing the order in which the services were stopped/started. Now changed the logic to directly pass the service to the thread handler.
judyjoseph (Contributor Author) commented Jun 16, 2020

retest this please

@judyjoseph judyjoseph requested a review from jleveque June 18, 2020 05:20
jleveque (Contributor) commented

Retest this please

judyjoseph (Contributor Author) commented

@jleveque - can you check this PR? It was blocked for a while due to test failures; the tests are all passing now.

@judyjoseph judyjoseph merged commit 97d813b into sonic-net:master Jul 9, 2020
abdosi pushed a commit that referenced this pull request Jul 11, 2020
…mands (#917)

* Improvements to the config reload/load_minigraph commands in Multi NPU platforms by parallelizing with threads.
@judyjoseph judyjoseph deleted the cfg_reload branch July 24, 2020 21:38
abdosi pushed a commit to abdosi/sonic-utilities that referenced this pull request Aug 4, 2020
…mands (sonic-net#917)

* Improvements to the config reload/load_minigraph commands in Multi NPU platforms by parallelizing with threads.
stepanblyschak pushed a commit to stepanblyschak/sonic-utilities that referenced this pull request Apr 28, 2022
Intf table migration for APP_DB entries during warmboot (sonic-net#980)
[Multi NPU] Time Improvements to the config reload/load_minigraph
commands  (sonic-net#917)