Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Neighsync] Handle Mac address 'none' during neigh add #2593

Merged
merged 1 commit into from
Jan 5, 2023

Conversation

prsunny
Copy link
Collaborator

@prsunny prsunny commented Dec 30, 2022

What I did
In some rare cases, kernel neighbor add notifications are received with Mac string as 'none'. This is from the following libnl code:

https://www.infradead.org/~tgr/libnl/doc/api/lib_2addr_8c_source.html

image

This can cause a neighsyncd crash as below from here

Dec 12 07:36:37.149555 ST12-M0 INFO swss#/supervisord: neighsyncd terminate called after throwing an instance of 'std::invalid_argument'
Dec 12 07:36:37.149555 ST12-01M0 INFO swss#/supervisord: neighsyncd   what():  can't parse mac address 'none'

Why I did it

Prevent crash by checking if MacString is 'none' during neighbor 'add' operation.

How I verified it

Details if related

@@ -143,6 +143,12 @@ void NeighSync::onMsg(int nlmsg_type, struct nl_object *obj)
nl_addr2str(rtnl_neigh_get_lladdr(neigh), macStr, MAX_ADDR_SIZE);
}

if (!delete_key && !strncmp(macStr, "none", MAX_ADDR_SIZE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for the MAC to be set to "none" on a dual ToR device? If this happens and also triggers the zero MAC logic directly above, we could have the "none" MAC be overwritten by an empty MAC which would cause unexpected tunnel routes to be created.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about dualtor case and it is unlikely we hit this issue in a stable system. In case it happens, i think treating it as zero mac should be fine.

@weiliu-ivy
Copy link

@prsunny @theasianpianist could you help add the reason that this PR can bypass coverage failure? Thank you!

dprital added a commit to dprital/sonic-buildimage that referenced this pull request Jan 30, 2023
Update sonic-swss submodule pointer to include the following:
* a2a483d [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  ([sonic-net#2617](sonic-net/sonic-swss#2617))
* 9d1f66b [bfdorch] add local discriminator to state DB ([sonic-net#2629](sonic-net/sonic-swss#2629))
* c54b3d1 Vxlan tunnel endpoint custom monitoring APPL DB table. ([sonic-net#2589](sonic-net/sonic-swss#2589))
* 7f03db2 Fix potential risks ([sonic-net#2516](sonic-net/sonic-swss#2516))
* 383ee68 [refactor]Refactoring sai handle status ([sonic-net#2621](sonic-net/sonic-swss#2621))
* cd95972 Fix issue 13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL ([sonic-net#2619](sonic-net/sonic-swss#2619))
* a01470f Remove TODO comments that are no longer relevant ([sonic-net#2622](sonic-net/sonic-swss#2622))
* d058390 Changed the BFD default detect multiplier to 10x ([sonic-net#2614](sonic-net/sonic-swss#2614))
* d78b528 [MuxOrch] Enabling neighbor when adding in active state ([sonic-net#2601](sonic-net/sonic-swss#2601))
* 4ebdad1 [routesync] Fix for stale dynamic neighbor ([sonic-net#2553](sonic-net/sonic-swss#2553))
* 8857f92 Added new attributes for Vnet and Vxlan ecmp configurations. ([sonic-net#2584](sonic-net/sonic-swss#2584))
* b6bbc3e Revert [voq][chassis]Add show fabric counters port/queue commands (2522) ([sonic-net#2611](sonic-net/sonic-swss#2611))
* 52406e2 Add missing parameter to on_switch_shutdown_request method. ([sonic-net#2567](sonic-net/sonic-swss#2567))
* 4ac9ad9 Increase diff coverage to 80% ([sonic-net#2599](sonic-net/sonic-swss#2599))
* 8a0bb36 Handle Mac address 'none' ([sonic-net#2593](sonic-net/sonic-swss#2593))
* f496ab3 [vstest] Only collect stdout of orchagent_restart_check in vstest ([sonic-net#2597](sonic-net/sonic-swss#2597))
* 1dab495 Avoid aborting orchagent when setting TUNNEL attributes ([sonic-net#2591](sonic-net/sonic-swss#2591))
* 4395cea Fix neighbor doesn't update all attribute ([sonic-net#2577](sonic-net/sonic-swss#2577))

Signed-off-by: dprital <[email protected]>
@prsunny
Copy link
Collaborator Author

prsunny commented Jan 31, 2023

@prsunny @theasianpianist could you help add the reason that this PR can bypass coverage failure? Thank you!

hi @weiliu-ivy , this cannot be simulated by the test and hence bypassed. This can be treated as an exception.

@weiliu-ivy
Copy link

@prsunny @theasianpianist could you help add the reason that this PR can bypass coverage failure? Thank you!

hi @weiliu-ivy , this cannot be simulated by the test and hence bypassed. This can be treated as an exception.

@prsunny Got it. Thank you!

liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Feb 1, 2023
Update sonic-swss submodule pointer to include the following:
* a2a483d [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  ([#2617](sonic-net/sonic-swss#2617))
* 9d1f66b [bfdorch] add local discriminator to state DB ([#2629](sonic-net/sonic-swss#2629))
* c54b3d1 Vxlan tunnel endpoint custom monitoring APPL DB table. ([#2589](sonic-net/sonic-swss#2589))
* 7f03db2 Fix potential risks ([#2516](sonic-net/sonic-swss#2516))
* 383ee68 [refactor]Refactoring sai handle status ([#2621](sonic-net/sonic-swss#2621))
* cd95972 Fix issue 13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL ([#2619](sonic-net/sonic-swss#2619))
* a01470f Remove TODO comments that are no longer relevant ([#2622](sonic-net/sonic-swss#2622))
* d058390 Changed the BFD default detect multiplier to 10x ([#2614](sonic-net/sonic-swss#2614))
* d78b528 [MuxOrch] Enabling neighbor when adding in active state ([#2601](sonic-net/sonic-swss#2601))
* 4ebdad1 [routesync] Fix for stale dynamic neighbor ([#2553](sonic-net/sonic-swss#2553))
* 8857f92 Added new attributes for Vnet and Vxlan ecmp configurations. ([#2584](sonic-net/sonic-swss#2584))
* b6bbc3e Revert [voq][chassis]Add show fabric counters port/queue commands (2522) ([#2611](sonic-net/sonic-swss#2611))
* 52406e2 Add missing parameter to on_switch_shutdown_request method. ([#2567](sonic-net/sonic-swss#2567))
* 4ac9ad9 Increase diff coverage to 80% ([#2599](sonic-net/sonic-swss#2599))
* 8a0bb36 Handle Mac address 'none' ([#2593](sonic-net/sonic-swss#2593))
* f496ab3 [vstest] Only collect stdout of orchagent_restart_check in vstest ([#2597](sonic-net/sonic-swss#2597))
* 1dab495 Avoid aborting orchagent when setting TUNNEL attributes ([#2591](sonic-net/sonic-swss#2591))
* 4395cea Fix neighbor doesn't update all attribute ([#2577](sonic-net/sonic-swss#2577))

Signed-off-by: dprital <[email protected]>
prsunny pushed a commit that referenced this pull request Feb 16, 2023
*Merge remote-tracking branch 'upstream/master' into dash (#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (#2591)
* Handle Mac address 'none' (#2593)
* Increase diff coverage to 80% (#2599)
* Add missing parameter to on_switch_shutdown_request method. (#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (#2522)" (#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (#2553)
* [MuxOrch] Enabling neighbor when adding in active state (#2601)
* Changed the BFD default detect multiplier to 10x (#2614)
* Remove TODO comments that are no longer relevant (#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (#2619)
* [refactor]Refactoring sai handle status (#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (#2617)
* [voq][chassis] Remove created ports from the default vlan. (#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (#2638)
* [hash]: Add UT infra. (#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (#2644)
* [ResponsePublisher] add pipeline support  (#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 21, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
yxieca pushed a commit that referenced this pull request Nov 6, 2023
In some rare cases, kernel neighbor add notifications are received with Mac string as 'none'. Fixing this to prevent crash.
yxieca pushed a commit that referenced this pull request Nov 6, 2023
In some rare cases, kernel neighbor add notifications are received with Mac string as 'none'. Fixing this to prevent crash.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants