Ignore bad NetDefs and files via parser flags #412
Well, that was a fun lunch break :)
I like the overall approach of this. It is a slight change in behavior, but only for the better, as a non-configured network (due to parsing errors) is always a bad thing. It could lead to unexpected network configuration if some files are ignored but others aren't, but IMO that is still better than no network configuration at all.
Thanks for your comments, lads. I basically adopted all of them. I decided not to drop bad netdefs anymore. I think it's actually better to keep them and generate configuration for what was parsed up to the point where it failed. There is a chance the interface will still be brought up and work if the error happened after setting IPs or DHCP, for example.
To make sure (or at least increase the confidence) that these changes will not cause a change in behavior when the flag is enabled and all the configuration is good, I changed our [...]
The new "Configuration fuzzing" CI action is catching a memory leak! As far as I can tell, it's happening inside glib and there is a comment in the code that handles datalists saying it could happen, see FR-5666 for more details. So, this leak is not introduced by these changes.
I prepared a PPA and added it to the suggestions of tests in the PR comments.
Thanks a lot for working through all the remarks. This should be relatively safe now.
The doubling of testing time is a tough sell... OTOH more coverage is better. But at least it should be easy to disable (e.g. through an ENV variable), so we can skip the double-testing for local tests.
I left a bunch of additional inline comments that need additional consideration.
Parser flags are intended to be used to change the parser's behavior. This patch introduces a single flag used to instruct the parser that parsing errors should be ignored and it should continue to parse netdefs. The main goal with this flag is to make Netplan more resilient. Currently, any mistake made in the YAML files will prevent Netplan from generating configuration. So, rebooting the system while one of the YAML files is broken will result in the system having no network connectivity.
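To illustrate the general idea of a behavior-changing flag, here is a minimal Python sketch of a bitmask with a single bit. It is not Netplan's code: the class name ParserFlags, the helper should_ignore_errors and the numeric values are assumptions made up for this example.

```python
from enum import IntFlag


class ParserFlags(IntFlag):
    """Hypothetical flag set; the name and values are illustrative only."""
    NONE = 0
    IGNORE_ERRORS = 1 << 0


def should_ignore_errors(flags: ParserFlags) -> bool:
    # A flag is "on" when its bit is present in the mask.
    return bool(flags & ParserFlags.IGNORE_ERRORS)


flags = ParserFlags.NONE
flags |= ParserFlags.IGNORE_ERRORS       # enable the flag
enabled = should_ignore_errors(flags)
flags &= ~ParserFlags.IGNORE_ERRORS      # disable it again
disabled = not should_ignore_errors(flags)
```

Keeping flags in a bitmask means further flags could be added later without changing function signatures, which seems to be the point of introducing "flags" rather than a single boolean.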
Thanks Lukas, I tried to address all of your comments. I added an error counter to the parser state instead of a boolean; I think that might be more useful. The first commit was split into the definition and implementation of flags as suggested by @schopin-pro in a conversation we had. The config_fuzzer was changed to run the generator with the [...]
Thanks for addressing the remarks! lgtm.
I like that the error handling is integrated with the existing error handling logic, and the .error_count field.
When the ignore_errors flag is enabled, the parser will ignore YAML files that fail to load due to syntax errors. YAML files that load successfully but contain invalid Netplan configuration will not be completely ignored, and neither will network definitions that contain errors: the parser will ignore the error in the bad YAML mapping and continue with the next one. This way, all the good definitions will produce backend configuration and the bad ones will produce *some* backend configuration.
Add a simple counter to the parser struct that stores the number of errors that were ignored. Callers can use it to check whether there were errors when the IGNORE_ERRORS flag is used.
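The behavior described above (keep what was parsed before the failure, count the error, move on to the next definition) can be sketched with a toy model. This is plain Python written for illustration, not the real parser: ToyParser, NetdefError and the single validation rule are all hypothetical.

```python
from dataclasses import dataclass, field


class NetdefError(Exception):
    """Raised by the toy validator when a setting is invalid."""


@dataclass
class ToyParser:
    ignore_errors: bool = False
    error_count: int = 0
    netdefs: dict = field(default_factory=dict)

    def parse(self, config: dict) -> None:
        for name, settings in config.items():
            # Register the netdef first, so whatever was parsed before a
            # failure is kept (the "some backend configuration" case).
            parsed = self.netdefs.setdefault(name, {})
            try:
                for key, value in settings.items():
                    self._validate(key, value)
                    parsed[key] = value
            except NetdefError:
                if not self.ignore_errors:
                    raise
                # Remember the error and continue with the next netdef.
                self.error_count += 1

    @staticmethod
    def _validate(key, value):
        # Single made-up rule: dhcp4 must be a boolean.
        if key == "dhcp4" and not isinstance(value, bool):
            raise NetdefError(f"{key}: expected a boolean, got {value!r}")


config = {
    "eth0": {"dhcp4": True},
    "eth1": {"mtu": 1500, "dhcp4": "maybe"},  # invalid value
    "eth2": {"dhcp4": False},
}
parser = ToyParser(ignore_errors=True)
parser.parse(config)
```

In this sketch eth1 keeps the mtu setting that was read before the bad key, eth2 is still parsed, and a caller can inspect parser.error_count afterwards to find out that something was skipped.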
Add a new flag --ignore-errors. It is used mostly for tests. IGNORE_ERRORS will be set if either --ignore-errors or the environment variable NETPLAN_PARSER_IGNORE_ERRORS is set. These are mostly used by the unit and integration tests. The flag will be set by default if "generate" is called as a systemd generator. The idea is to ignore errors when the machine boots up so the system still has a chance to have a working network configuration if an issue is introduced in any of the YAML files. For the time being, this flag will not be enabled by default when the CLI is used. This keeps the issue evident, so the user will need to fix it.
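The "command-line option or environment variable" rule can be sketched as follows. The real generator is not written in Python, so this only mirrors the decision logic: the function name is hypothetical, and treating the variable as "set" whenever it is present (regardless of its value) is an assumption. The systemd-generator default is not modeled here.

```python
import argparse


def ignore_errors_requested(argv, environ) -> bool:
    """Return True if either the option or the environment variable is set."""
    parser = argparse.ArgumentParser(prog="generate-sketch")
    # -i is the short form that appears in the testing suggestions.
    parser.add_argument("-i", "--ignore-errors", action="store_true")
    args, _unknown = parser.parse_known_args(argv)
    # Assumption: the variable merely being present counts as "set".
    return args.ignore_errors or "NETPLAN_PARSER_IGNORE_ERRORS" in environ
```

For example, ignore_errors_requested(["--ignore-errors"], {}) and ignore_errors_requested([], {"NETPLAN_PARSER_IGNORE_ERRORS": "1"}) both evaluate to True, while ignore_errors_requested([], {}) evaluates to False.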
Add a "flags" property that can be used as a getter and setter for flags. Add the error_count getter to Parser. Add some unit tests using the new properties.
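As a rough picture of what a "flags" getter/setter and a read-only error_count look like from Python, here is a stand-in class. It is not the real binding: ParserSketch, the IGNORE_ERRORS value and the validation in the setter are assumptions made for this example.

```python
class ParserSketch:
    """Stand-in for illustration; not the real netplan bindings."""

    IGNORE_ERRORS = 0x1  # hypothetical value

    def __init__(self):
        self._flags = 0
        self._error_count = 0

    @property
    def flags(self) -> int:
        return self._flags

    @flags.setter
    def flags(self, value: int) -> None:
        if not isinstance(value, int) or value < 0:
            raise ValueError("flags must be a non-negative integer")
        self._flags = value

    @property
    def error_count(self) -> int:
        # Read-only: no setter is defined, so assignment raises AttributeError.
        return self._error_count


p = ParserSketch()
p.flags = ParserSketch.IGNORE_ERRORS
```

A unit test for properties like these would typically set the flags, read them back, and check that error_count starts at zero and cannot be assigned to.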
config_fuzzer: run generate against the whole dataset generated by the config fuzzer and ignore all the errors.
integration/run.py: run integration tests a second time with ignore_errors set via the NETPLAN_PARSER_IGNORE_ERRORS environment variable.
integration/dbus.py: call apply after set. Without that, the netplan state will be considered dirty and the dbus tests will fail when called a second time.
Add a new section to "Explanation" about the generator behavior. Update netplan-generate(8) to mention the new generator behavior.
Description
This is an implementation of what I called "parser flags". The idea is to enable the user to change some parsing decisions.
The main application at the moment is to support ignoring parsing errors. Currently Netplan will not generate any configuration if a little mistake is made in one of its YAML files. It can be really bad if you're operating a remote system and end up rebooting the system with a syntax issue in one of your files. Your system will basically boot without any network configuration.
The IGNORE_ERRORS flag will allow the Netplan generator to ignore bad files and bad netdefs so it will still generate some configuration.
TESTS
Integration tests will be executed twice now, once without the ignore_errors flag and once with it enabled. It's intended to increase the confidence that the parser behaves as expected when the flag is enabled and the configuration is good.
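The "run everything twice" idea can be sketched like this. It is not the contents of integration/run.py: the run_twice helper is hypothetical, the value "1" for the variable is an assumption, and the demo command is just a child Python process that reports whether it sees the variable.

```python
import os
import subprocess
import sys


def run_twice(cmd, base_env=None):
    """Run cmd once without and once with NETPLAN_PARSER_IGNORE_ERRORS set."""
    base_env = dict(os.environ if base_env is None else base_env)
    results = []
    for ignore in (False, True):
        env = dict(base_env)
        # Start from a clean state so the first run is really "flag off".
        env.pop("NETPLAN_PARSER_IGNORE_ERRORS", None)
        if ignore:
            env["NETPLAN_PARSER_IGNORE_ERRORS"] = "1"  # assumed value
        proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
        results.append((ignore, proc.returncode, proc.stdout.strip()))
    return results


# Demo command: prints whether the variable is visible to the child process.
demo = [sys.executable, "-c",
        "import os; print('NETPLAN_PARSER_IGNORE_ERRORS' in os.environ)"]
results = run_twice(demo)
```

With good configuration both passes are expected to give the same test results, so any difference between them points at a behavior change caused by the flag itself.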
I prepared a PPA for Ubuntu Noble with this patch: https://launchpad.net/~danilogondolfo/+archive/ubuntu/netplan-parser-flags
Suggestions of tests:
It can be easily tested in a LXD VM. You can add some broken configuration and reboot the VM, or call /usr/libexec/netplan/generate -i to see the parsing process when it's ignoring errors.
[...] 50-cloud-init.yaml and reboot the VM. For example, the config below should still bring enp5s0 up and with DHCP working.
[...] eth2 should still have a backend configuration file and it should contain Bond=bond0.
[...] br0 interface created:
Checklist
[...] make check successfully.
[...] make check-coverage).