Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NetworkManager's device management/ignore logic, using udev matching rules (LP: #1951653) #276

Merged
merged 6 commits into from
May 30, 2022

Conversation

slyon
Copy link
Collaborator

@slyon slyon commented May 3, 2022

Description

By default, systemd-networkd controls only interfaces that it has explicitly been configured for (via a sd-networkd .network file); it considers other interfaces as "unmanaged" and ignores them. NetworkManager OTOH tries to claim control of any interface that is UP, which might lead to conflicts if multiple networking daemons are used at the same time.

NetworkManager has different ways to control/avoid this behavior:

  • NM_UNMANAGED=1 udev environment, the weakest condition
  • NetworkManager.conf [keyfile].unmanaged-devices= setting (affected by some unexpected behavior: LP#1615044)
  • NetworkManager.conf [device].managed= setting, the strongest condition, overruling all other settings, created a runtime in /run/NetworkManager/devices/*

The stronger rules via NetworkManager.conf are unfortunately a bit limited in their matching logic, as multiple checks can only be ORed (e.g. we cannot check for: interface-name=wl* AND driver=(virt* OR iwlwifi)). udev OTOH has a very powerful matching logic (but unfortunately cannot match on the interface type: ethernet/wifi/bridge/...).

So we're reimplementing/refactoring the flawed NetworkManager "unmanaged vs managed" logic in netplan, by having an explicit allow-list of interfaces that NM is supposed to control (as some udev rules shipped system-wide would otherwise disable certain interfaces in containers and other special conditions, even if configured via netplan, c.f. /lib/udev/rules.d/85-nm-unmanaged.rules) in addition to an explicit deny-list of interfaces to ignore (e.g. because they are to be managed by sd-networkd) as udev rules in /run/udev/rules.d/90-netplan.rules and keeping the special-case of matching a whole class/type of interfaces (e.g. ethernets/wifis/bridges/...) in /run/NetworkManager/conf.d/netplan.conf as a NetworkManager.conf [device*].managed= rule.
Furthermore, we're applying this matching logic whenever ANY interface (in netplan YAML) is supposed to be controlled by NetworkManager (that is in contrast to only applying it when NetworkManager is selected as the global renderer).

It is worth mentioning that this "managed vs unmanaged" matching logic is slightly different from netplan's usual matching logic (e.g. for configuring the interfaces), as we might ignore an interface based on it's old AND new/renamed interface name, or it's old and new/changed MAC address, while we cannot do this for the normal netplan matching.

Commits:

  • generate: ignore 10-globally-managed-devices.conf if any NM config is given in netplan
  • cli: re-apply udev rules (NM_UNMANAGED) if needed
    • Also, make sure we clear any runtime state that NetworkManager left in /run/NetworkManager/devices/* as this could overwrite our udev rules.
  • nm: extend type_str() to return passthrough types
  • nm: tests: improve NM manage/ignore logic, using udev matching rules
  • tests:base: cleanup udev quirks
    • We have an explicit NM-managed-devices allow-list, now. So we don't need those quirks anymore.

Checklist

  • Runs make check successfully.
  • Retains 100% code coverage (make check-coverage).
  • New/changed keys in YAML format are documented.
  • (Optional) Adds example YAML for new feature.
  • (Optional) Closes an open bug in Launchpad. LP#1951653

@slyon slyon force-pushed the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch 2 times, most recently from ab172ec to 84fdc6f Compare May 3, 2022 15:33
@slyon slyon force-pushed the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch 2 times, most recently from 1e0b4f7 to de23de1 Compare May 4, 2022 12:12
slyon added 4 commits May 12, 2022 11:18
Also, make sure we clear any runtime state that NetworkManager left in
/run/NetworkManager/devices/* as this could overwrite our udev rules.
V2: avoid g_string_replace() which is not available in older verions of GLib (Focal)
@slyon slyon force-pushed the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch from de23de1 to 4c16a06 Compare May 12, 2022 09:18
We have an explicit NM-managed-devices allow-list, now.
So we don't need those quirks anymore.
@slyon slyon force-pushed the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch from 7b5504b to dbf7099 Compare May 12, 2022 12:38
@codecov-commenter
Copy link

codecov-commenter commented May 12, 2022

Codecov Report

Merging #276 (c03d383) into main (dab082f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #276   +/-   ##
=======================================
  Coverage   99.07%   99.08%           
=======================================
  Files          61       61           
  Lines       11194    11226   +32     
=======================================
+ Hits        11091    11123   +32     
  Misses        103      103           
Impacted Files Coverage Δ
netplan/cli/commands/apply.py 100.00% <ø> (ø)
tests/generator/test_passthrough.py 100.00% <ø> (ø)
src/generate.c 100.00% <100.00%> (ø)
src/nm.c 100.00% <100.00%> (ø)
tests/generator/base.py 100.00% <100.00%> (ø)
tests/generator/test_auth.py 100.00% <100.00%> (ø)
tests/generator/test_bonds.py 100.00% <100.00%> (ø)
tests/generator/test_bridges.py 100.00% <100.00%> (ø)
tests/generator/test_common.py 100.00% <100.00%> (ø)
tests/generator/test_ethernets.py 100.00% <100.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dab082f...c03d383. Read the comment docs.

@slyon slyon force-pushed the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch from dbf7099 to f291f83 Compare May 12, 2022 13:59
@slyon slyon marked this pull request as ready for review May 12, 2022 14:56
@slyon slyon requested a review from schopin-pro May 19, 2022 13:02
Copy link
Contributor

@schopin-pro schopin-pro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Only one small optional stylistic nitpick, feel free to merge if you disagree!

Small question though, where does that last commit come from? Has this bug surfaced due to the NM changes?

* Also, mark all devices managed by us explicitly, so it won't get in
* conflict with the system's udev rules that might ignore some devices
* in containers via usr/lib/udev/rules.d/85-nm-unmanaged-devices.rules */
GList* iter = np_state->netdefs_ordered;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: great work on the whole function body, very readable and well commented despite the complex logic!

* from the "match" stanza (e.g. original_name/mac/drivers)
* This will match the "old" interface (i.e. original MAC and/or
* interface name) if it got changed */
if (nd->has_match && (nd->match.original_name || nd->match.mac || nd->match.driver)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (non-blocking): this block could perhaps have been split off into its own function to make it a bit more bitesize ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh.. I was a bit quick with merging and didn't remember this suggestion. It makes sense to encapsulate this into its own function. I will probably work on this in a follow-up commit.

@Conan-Kudo
Copy link
Contributor

@thom311 is this reasonable?

@thom311
Copy link
Contributor

thom311 commented May 26, 2022

@thom311 is this reasonable?

yes. lgtm.

I am surprised that netplan aims to support a mixed system with networkd and NetworkManager. But sure.

@slyon
Copy link
Collaborator Author

slyon commented May 30, 2022

yes. lgtm.

Thank you for your feedback @thom311! merging.

I am surprised that netplan aims to support a mixed system with networkd and NetworkManager. But sure.

Yes. A combined setup can lead to funny situations, but we try our best to make it work.

@slyon slyon merged commit 7387cb8 into main May 30, 2022
@slyon slyon deleted the slyon/nm-accept-deny-list-lp1951653-FR-1888 branch May 30, 2022 07:19
@slyon
Copy link
Collaborator Author

slyon commented May 30, 2022

Small question though, where does that last commit come from? Has this bug surfaced due to the NM changes?

Yes. That bug surfaced after I implemented the NM changes, seems to be a timing issue that only happened (sometimes) inside our CI. I was not able to reproduce it locally on my machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants