Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parse-nm: Handle missing gateway in keyfile routes, keep dns-search fallback #238

Merged
merged 4 commits into from
Oct 11, 2021

Conversation

slyon
Copy link
Collaborator

@slyon slyon commented Oct 6, 2021

Description

NM assumes a scope: link route if the gateway is empty or unspecified (i.e. "0.0.0.0"/"::") in keyfile.
E.g. the route is only valid on the local network:
ip route add NETWORK dev DEVICE [metric METRIC]
see https://github.com/NetworkManager/NetworkManager/blob/main/src/libnm-core-impl/nm-keyfile.c#L520

netplan cannot differentiate between ipv4.dns-search and ipv6.dns-search so keep it in the passthrough/fallback list as an override.

Checklist

  • Runs make check successfully.
  • Retains 100% code coverage (make check-coverage).
  • New/changed keys in YAML format are documented.
  • (Optional) Adds example YAML for new feature.
  • (Optional) Closes an open bug in Launchpad.

@slyon slyon requested a review from schopin-pro October 6, 2021 12:19
@codecov-commenter
Copy link

codecov-commenter commented Oct 6, 2021

Codecov Report

Merging #238 (e697ee6) into main (bc6eb0b) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #238      +/-   ##
==========================================
- Coverage   99.06%   99.06%   -0.01%     
==========================================
  Files          57       57              
  Lines        9706     9702       -4     
==========================================
- Hits         9615     9611       -4     
  Misses         91       91              
Impacted Files Coverage Δ
tests/generator/test_routing.py 100.00% <ø> (ø)
src/nm.c 99.46% <100.00%> (-0.01%) ⬇️
src/parse-nm.c 100.00% <100.00%> (ø)
src/util.c 100.00% <100.00%> (ø)
tests/parser/base.py 100.00% <100.00%> (ø)
tests/parser/test_keyfile.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc6eb0b...e697ee6. Read the comment docs.

@schopin-pro
Copy link
Contributor

Just to be clear, this solves 2 separate issues, right?

schopin-pro
schopin-pro previously approved these changes Oct 7, 2021
Copy link
Contributor

@schopin-pro schopin-pro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although I'm not sure how I could easily test those changes :)

src/parse-nm.c Outdated Show resolved Hide resolved
src/parse-nm.c Outdated Show resolved Hide resolved
@slyon
Copy link
Collaborator Author

slyon commented Oct 7, 2021

Just to be clear, this solves 2 separate issues, right?

Yes, two small keyfile incompatibilities that sneaked into the 0.103 release.

Thank you for the review! I will do some extra testing by compiling a netplan v0.103+this commit and running the NetworkManager snap (incl. netplan/keyfile patches) test suite against it, which showed the failures before.

src/parse-nm.c Outdated
else {
/* NM assumes a "default" route if this is not defined in the keyfile.
* See nm-settings-ip-config.c -> nm_ip_route_set_next_hop()/canonicalize_ip(). */
route->via = g_strdup("default");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in a private chat, I have doubts regarding this. The concept of "default" to designate a gateway doesn't make sense to me, and we suspect this is actually used to specify local routes

@schopin-pro schopin-pro dismissed their stale review October 7, 2021 12:28

Approval superseded by doutes regarding the default via thing.

@slyon slyon changed the title parse-nm: Handle "default" unicast routes, keep dns-search fallback parse-nm: Handle missing gateway in keyfile routes, keep dns-search fallback Oct 7, 2021
slyon and others added 2 commits October 7, 2021 18:11
…allback

NM assumes a route to use the unspecified address as the gateway
(via = "0.0.0.0"/"::") if none is specified in the keyfile.
E.g. the route is only valid on the local network:
"ip route add NETWORK dev DEVICE [metric METRIC]"

netplan cannot differentiate between ipv4.dns-search and ipv6.dns-search
so keep it in the passthrough/fallback list as an override.
Co-authored-by: Simon Chopin <[email protected]>
@slyon
Copy link
Collaborator Author

slyon commented Oct 7, 2021

@schopin-pro Thank you for the review and discussion around this. WIth my latest changes it now passes the NetworkManager unit tests (using the patched keyfile-netplan plugin): https://paste.ubuntu.com/p/5jDnTDCmdF/

I've updated the commit and PR description with the explanation, and also added some comment inside the code. May I ask for another sanity check on this?

@slyon slyon requested a review from schopin-pro October 7, 2021 16:21
Copy link
Contributor

@schopin-pro schopin-pro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version makes more sense, but I'm not liking the new code much. Using an invalid unicast IP to signal that there are no gateway will cause trouble at some point, and isn't consistent with the rest of the code, and I'm thinking we should update the scope field if we know we're dealing with a local route.

src/parse-nm.c Outdated Show resolved Hide resolved
@slyon
Copy link
Collaborator Author

slyon commented Oct 8, 2021

ACK, fair point!
NetworkManager is kind of special in this regard, as it does not have an understanding of "scope" in it's keyfile format, but it auto-detects a route's scope depending on destination IP and gateway fields. If gateway/via is empty or unspecified (e.g. ""/"0.0.0.0"/"::") it assumes a "link" or "host" scope, while it assumes a "global" scope if the gateway is valid. Up to now netplan has been ignoring this fact and just errored out if the scope was not "global".

But you're absolutely right that this implementation detail of NM should not leak into the netplan YAML schema by specifying an invalid "via" field. But rather we want to keep that local to the NM generator backend. So I've added another commit that reworks the scope logic of the NM backend (and relevant unit- & integration tests) accordingly, to accept all values ("global"/"link"/"host") and write the keyfile accordingly (proper gateway/via field or unspecified address if scope = "link" or "host"), tricking NM to do the right thing.

It still passes NM's test suite and also passes the newly added test_link_route_v4 integration tests on the TestNetworkManager class:

$ time autopkgtest ../netplan.io_0.103-0ubuntu7+test0_amd64.changes -U --test-name=routing -- qemu /data/autopkgtest-impish-amd64.img 
autopkgtest [16:32:51]: starting date: 2021-10-08
autopkgtest [16:32:51]: version 5.16ubuntu1
[...]
autopkgtest [16:39:23]: test routing: python3 tests/integration/run.py --test=routing
autopkgtest [16:39:23]: test routing: [-----------------------
test_link_route_v4 (__main__.TestNetworkManager) ... eth42 .ok
test_per_route_advertised_receive_window (__main__.TestNetworkManager) ... eth42 .ok
test_per_route_congestion_window (__main__.TestNetworkManager) ... eth42 .ok
test_per_route_mtu (__main__.TestNetworkManager) ... eth42 .ok
test_route_from (__main__.TestNetworkManager) ... eth42 .ok
test_route_on_link (__main__.TestNetworkManager) ... eth42 ..ok
test_route_table (__main__.TestNetworkManager) ... eth42 .ok
test_routes_default (__main__.TestNetworkManager) ... eth42 .ok
test_routes_v4 (__main__.TestNetworkManager) ... eth42 .ok
test_routes_v4_with_dhcp (__main__.TestNetworkManager) ... skipped 'fails due to networkd bug setting routes with dhcp'
test_routes_v6 (__main__.TestNetworkManager) ... eth42 ..ok
test_link_route_v4 (__main__.TestNetworkd) ... eth42 ..............ok
test_per_route_advertised_receive_window (__main__.TestNetworkd) ... eth42 ..............ok
test_per_route_congestion_window (__main__.TestNetworkd) ... eth42 .............ok
test_per_route_mtu (__main__.TestNetworkd) ... eth42 .............ok
test_route_from (__main__.TestNetworkd) ... eth42 ..............ok
test_route_on_link (__main__.TestNetworkd) ... eth42 ..............ok
test_route_table (__main__.TestNetworkd) ... eth42 ..............ok
test_route_type_blackhole (__main__.TestNetworkd) ... skipped 'networkd does not handle non-unicast routes correctly yet (Invalid argument)'
test_route_with_policy (__main__.TestNetworkd) ... eth42 .............ok
test_routes_default (__main__.TestNetworkd) ... eth42 .............ok
test_routes_v4 (__main__.TestNetworkd) ... eth42 ..............ok
test_routes_v4_with_dhcp (__main__.TestNetworkd) ... skipped 'fails due to networkd bug setting routes with dhcp'
test_routes_v6 (__main__.TestNetworkd) ... eth42 ..............ok

----------------------------------------------------------------------
Ran 24 tests in 201.437s

OK (skipped=3)
autopkgtest [16:42:46]: test routing: -----------------------]
autopkgtest [16:42:46]: test routing:  - - - - - - - - - - results - - - - - - - - - -
routing              PASS
autopkgtest [16:42:47]: @@@@@@@@@@@@@@@@@@@@ summary
routing              PASS
qemu-system-x86_64: terminating on signal 15 from pid 653970 (/usr/bin/python3)

real	9m57,131s
user	6m37,751s
sys	0m25,471s

@slyon slyon requested a review from schopin-pro October 8, 2021 15:06
NetworkManager automatically detects a route's scope, depending on
destination IP ("to") and gateway ("via"). If no gateway is specified
(e.g. the unspecified address "0.0.0.0"/"::" in keyfile) it will assume
a "link"/"host" scope, otherwise it will assume a "global" scope.
Copy link
Contributor

@schopin-pro schopin-pro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the NM code and comments, I think we can get rid of the whole undefined address part, using a simple empty string instead. I'm not entirely sure '0.0.0.0' would work as intended, but the code clearly supports having an empty string.

src/nm.c Outdated
g_debug("%s: Overriding 'via: %s' as NetworkManager does not support "
"setting a route's scope directly, but will auto-detect them.",
def->id, get_unspecified_address(cur_route->family));
via = get_unspecified_address(cur_route->family);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what I read there https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/main/src/libnm-core-impl/nm-keyfile.c#L506

Wouldn't we want

Suggested change
via = get_unspecified_address(cur_route->family);
via = "";

instead ?

Also avoid the deprecated trailing comma notation at the same time.
@slyon
Copy link
Collaborator Author

slyon commented Oct 11, 2021

Alright. Using the unspecified address does work as intended and is used in some (legacy) NM keyfiles. But you're right that using the empty string is the cleaner and currently suggested approach. BUT: we also need to make sure that we do not end up with a trailing comma (e.g. route1=1.2.3.4/24, if via="" and no metric is given), as the NetworkManager unit-tests complain about this deprecated notation and fail because of it (breaking the NetworkManager snap build, that uses libnetplan as its backend).

So in my most recent commit I've reworked the keyfile generation logic in nm.c to use an empty string for the gateway filed if the route doesn't have a "global" scope, but it does not add an empty gateway if no metric is give, to avoid the trailing comma.

netplan's integration tests and NM's unit-tests still pass.

@schopin-pro schopin-pro merged commit e205f06 into main Oct 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants