Orchagent crash on latest SONiC images #458
Comments
@lguohan Here is the issue that I mentioned to you last Friday. Please let me know if any further details are needed.
@zhenggen-xu @stcheng @lguohan @wendani @kcudnik We have narrowed down the problem to a specific commit. The issue is observed from the sonic-buildimage commit id below onwards:
The corresponding sonic-swss commit is: commit 5be3963793d5d04807931f016faf1fcca87f6286. We tested the commit immediately before the above-mentioned sonic-buildimage commit id. We didn't observe the orchagent crash, and all the docker containers were running.
Between these two points, sonic-swss has about 10 commits. Please look into this issue and let us know if you need any further details.
Seems like you want to remove a port and the vendor SAI returns "not supported".
@kcudnik There aren't any changes to the configuration. No cable OIR. The issue happens even without any cables connected. This issue is reported by Dell also: sonic-net/sonic-buildimage#3314. Any Broadcom-based switch will hit this problem.
I just concluded that from the syslog you pasted:
The error is "not supported" on the action "remove, key: SAI object type port", so someone wants to remove a PORT object, and the brcm SAI doesn't support that operation.
@kcudnik Would you know how to debug this issue?
Can you please share your bcm config file?
We faced the same problem on our Inventec switches, the problem can be resolved in two ways:
commit 6f40933d3d7b9f21a97de275fbd14ea3598d9a0a. I need to check with @zhenggen-xu for more details.
@habeebmohammed, as you suggested, we commented out the loopback and mgmt ports in the bcm config. We are not seeing the issue now. There are no cores and all the docker containers are running.
#add loopback port
But ideally the commit ([Feature: DynamicPortBreakout]) should not have introduced this issue. We still don't know the implications of commenting out these lines in the bcm config.
From the sairedis side there is nothing to debug here: the user wants to remove a port, but the vendor SAI doesn't implement that feature, so there is nothing you can do about it.
@kcudnik As I said earlier, no user action was taken. Also note that this problem started to appear after commit 6f40933d3d7b9f21a97de275fbd14ea3598d9a0a. We can go back to the commit immediately before it and the issue is not seen there.
By user I mean OA (orchagent), and the log you pasted shows that a port is being removed.
Orchagent is crashing on the latest SONiC images. Until August 2nd, there were no issues. The last commit on which things were fine is 'c6e442b946d7bb46d7e53d3ce1263d44b0ef3810'.
Here are a few logs which will help to debug the problem.
admin@sonic:~$ show version
SONiC Software Version: SONiC.master.0-dirty-20190829.210940
Distribution: Debian 9.9
Kernel: 4.9.0-9-2-amd64
Build commit: 3323e9b8
Build date: Thu Aug 29 17:57:57 UTC 2019
Built by: bala@vlinux-5
Platform: x86_64-juniper_qfx5210-r0
HwSKU: Juniper-QFX5210-64C
ASIC: broadcom
Serial Number: YB0217500013
Uptime: 17:00:35 up 3 min, 2 users, load average: 0.62, 0.36, 0.14
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-brcm latest 6c5c47f159ff 392MB
docker-syncd-brcm master.0-dirty-20190829.210940 6c5c47f159ff 392MB
docker-lldp-sv2 latest 57dfcda211c2 298MB
docker-lldp-sv2 master.0-dirty-20190829.210940 57dfcda211c2 298MB
docker-snmp-sv2 latest 8eac3da58656 323MB
docker-snmp-sv2 master.0-dirty-20190829.210940 8eac3da58656 323MB
docker-dhcp-relay latest 27f3f670833a 289MB
docker-dhcp-relay master.0-dirty-20190829.210940 27f3f670833a 289MB
docker-database latest 87420c49d8e3 281MB
docker-database master.0-dirty-20190829.210940 87420c49d8e3 281MB
docker-teamd latest 10693a6d0f14 302MB
docker-teamd master.0-dirty-20190829.210940 10693a6d0f14 302MB
docker-orchagent latest 366c48a62e70 321MB
docker-orchagent master.0-dirty-20190829.210940 366c48a62e70 321MB
docker-fpm-frr latest 7ce2e1efeebd 319MB
docker-fpm-frr master.0-dirty-20190829.210940 7ce2e1efeebd 319MB
docker-sonic-telemetry latest 8b3b68beed1f 304MB
docker-sonic-telemetry master.0-dirty-20190829.210940 8b3b68beed1f 304MB
docker-router-advertiser latest ea0b1d0ddd01 281MB
docker-router-advertiser master.0-dirty-20190829.210940 ea0b1d0ddd01 281MB
docker-platform-monitor latest 5038208af485 326MB
docker-platform-monitor master.0-dirty-20190829.210940 5038208af485 326MB
admin@sonic:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4a2e0dbd8b38 docker-snmp-sv2:latest "/usr/bin/supervisord" 2 minutes ago Up 30 seconds snmp
28c26910047d docker-fpm-frr:latest "/usr/bin/supervisord" 3 minutes ago Up 3 minutes bgp
8c10dab37f55 docker-lldp-sv2:latest "/usr/bin/supervisord" 3 minutes ago Up 3 minutes lldp
ab7c2cedf149 docker-platform-monitor:latest "/usr/bin/docker_ini…" 3 minutes ago Up 3 minutes pmon
591487be163e docker-sonic-telemetry:latest "/usr/bin/supervisord" 3 minutes ago Up 3 minutes telemetry
eff380465b1b docker-database:latest "/usr/local/bin/dock…" 3 minutes ago Up 3 minutes database
Seeing the following message in syslog:
Sep 3 17:00:51.203721 sonic NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet208 pid:1000000000002
Sep 3 17:00:51.204278 sonic ERR syncd#syncd: :- processEvent: failed to execute api: remove, key: SAI_OBJECT_TYPE_PORT:oid:0x1000000000022, status: SAI_STATUS_NOT_SUPPORTED
Sep 3 17:00:51.204363 sonic ERR syncd#syncd: :- syncd_main: Runtime error: :- processEvent: failed to execute api: remove, key: SAI_OBJECT_TYPE_PORT:oid:0x1000000000022, status: SAI_STATUS_NOT_SUPPORTED
Sep 3 17:00:51.204387 sonic NOTICE syncd#syncd: :- notify_OA_about_syncd_exception: sending switch_shutdown_request notification to OA
Sep 3 17:00:51.204433 sonic NOTICE syncd#syncd: :- notify_OA_about_syncd_exception: notification send successfull
Sep 3 17:00:51.204527 sonic NOTICE swss#orchagent: :- handle_switch_shutdown_request: switch shutdown request
Sep 3 17:00:51.204865 sonic INFO swss#supervisord: orchagent terminate called after throwing an instance of 'std::invalid_argument'
Sep 3 17:00:51.204865 sonic INFO swss#supervisord: orchagent what(): parse error - unexpected end of input
Sep 3 17:00:51.339267 sonic INFO swss#supervisor-proc-exit-listener: Process orchagent exited unxepectedly. Terminating supervisor...
sonic_dump_sonic_20190903_170341.tar.gz
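For anyone triaging a similar dump, the failing SAI call can be pulled out of syslog mechanically. The following is only a minimal sketch: the regular expression is inferred from the two syncd ERR lines quoted above, and the names `SaiFailure`, `FAILURE_RE`, `find_sai_failures` and `SAMPLE_SYSLOG` are hypothetical helpers, not part of SONiC, sairedis, or syncd. Other log formats may need a different pattern.

```python
import re
from typing import Iterable, List, NamedTuple


class SaiFailure(NamedTuple):
    """One failed SAI call as reported by syncd (hypothetical helper type)."""
    timestamp: str
    api: str
    object_type: str
    oid: str
    status: str


# Assumption: the line layout matches the syncd ERR lines quoted in this issue.
FAILURE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:.]+)\s.*?"
    r"processEvent: failed to execute api: (?P<api>\w+), "
    r"key: (?P<otype>SAI_OBJECT_TYPE_\w+):oid:(?P<oid>0x[0-9a-fA-F]+), "
    r"status: (?P<status>SAI_STATUS_\w+)"
)


def find_sai_failures(lines: Iterable[str]) -> List[SaiFailure]:
    """Return each distinct failed SAI call, keeping the first timestamp seen."""
    seen = set()
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue
        key = (match["api"], match["otype"], match["oid"], match["status"])
        if key in seen:
            # syncd_main repeats the processEvent message as a runtime error.
            continue
        seen.add(key)
        failures.append(SaiFailure(match["ts"], *key))
    return failures


# Three lines copied from the syslog excerpt in this issue.
SAMPLE_SYSLOG = [
    "Sep 3 17:00:51.203721 sonic NOTICE swss#orchagent: :- initializePort: "
    "Initializing port alias:Ethernet208 pid:1000000000002",
    "Sep 3 17:00:51.204278 sonic ERR syncd#syncd: :- processEvent: failed to "
    "execute api: remove, key: SAI_OBJECT_TYPE_PORT:oid:0x1000000000022, "
    "status: SAI_STATUS_NOT_SUPPORTED",
    "Sep 3 17:00:51.204363 sonic ERR syncd#syncd: :- syncd_main: Runtime error: "
    ":- processEvent: failed to execute api: remove, key: "
    "SAI_OBJECT_TYPE_PORT:oid:0x1000000000022, status: SAI_STATUS_NOT_SUPPORTED",
]

if __name__ == "__main__":
    for failure in find_sai_failures(SAMPLE_SYSLOG):
        print(failure)
```

Deduplicating on the (api, object type, oid, status) tuple keeps the report to one entry per failed call, since syncd logs the same failure twice with slightly different timestamps.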