Fix defaulting of legacy ClusterNetwork fields #16897
/retest
LGTM
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danwinship, eparis, knobunc
/test all [submit-queue is verifying that this PR is safe to merge]
Merged by hand because we are literally breaking cluster upgrades.
…tage Automatic merge from submit-queue. Fix defaulting of legacy ClusterNetwork fields Backport of #16897
allErrs = append(allErrs, field.Invalid(field.NewPath("hostsubnetlength"), clusterNet.HostSubnetLength, "hostsubnetlength must be identical to clusterNetworks[0].hostSubnetLength"))
}
} else if clusterNet.Network != "" || clusterNet.HostSubnetLength != 0 {
if clusterNet.Network != clusterNet.ClusterNetworks[0].CIDR || clusterNet.HostSubnetLength != clusterNet.ClusterNetworks[0].HostSubnetLength {
Do we need to enforce this condition for a non-default cluster network? Maybe we shouldn't care.
What would it mean to have Network set but not equal to ClusterNetworks[0]?
(I had suggested a long time back in the original PR that we should keep using Network/HostSubnetLength for the primary network, and only use ClusterNetworks for additional networks. But it's too late to change that now.)
Okay. OpenShift SDN only cares about the default cluster network, but based on what we described in the swagger doc for Network/ClusterNetworks, it now seems reasonable to expect Network to match clusterNetworks[0] for other use cases as part of API validation.
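The rule this thread settles on (legacy fields, when set, must agree with clusterNetworks[0]) can be sketched in isolation. This is a minimal illustration, not the PR's code: the `ClusterNetwork` and `ClusterNetworkEntry` types below are simplified hypothetical stand-ins, `validateLegacyFields` is an invented name, and a plain `error` is returned in place of the `field.ErrorList` used in the real validation.

```go
package main

import (
	"errors"
	"fmt"
)

// ClusterNetworkEntry and ClusterNetwork are simplified, hypothetical
// stand-ins for the real API types; only the fields under discussion exist.
type ClusterNetworkEntry struct {
	CIDR             string
	HostSubnetLength uint32
}

type ClusterNetwork struct {
	Network          string // legacy field
	HostSubnetLength uint32 // legacy field
	ClusterNetworks  []ClusterNetworkEntry
}

// validateLegacyFields accepts unset legacy fields, and otherwise requires
// them to be identical to ClusterNetworks[0].
func validateLegacyFields(cn ClusterNetwork) error {
	if cn.Network == "" && cn.HostSubnetLength == 0 {
		return nil // legacy fields not in use
	}
	if len(cn.ClusterNetworks) == 0 {
		return errors.New("clusterNetworks must contain at least one entry")
	}
	first := cn.ClusterNetworks[0]
	if cn.Network != first.CIDR || cn.HostSubnetLength != first.HostSubnetLength {
		return fmt.Errorf("legacy network %q/%d does not match clusterNetworks[0] %q/%d",
			cn.Network, cn.HostSubnetLength, first.CIDR, first.HostSubnetLength)
	}
	return nil
}

func main() {
	cn := ClusterNetwork{
		Network:          "10.128.0.0/14",
		HostSubnetLength: 9,
		ClusterNetworks:  []ClusterNetworkEntry{{CIDR: "10.128.0.0/14", HostSubnetLength: 9}},
	}
	fmt.Println(validateLegacyFields(cn))
}
```

Note that additional entries beyond index 0 are never compared against the legacy fields, which matches the point that only the primary network has a legacy representation.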
}
master.networkInfo, err = common.ParseNetworkInfo(clusterNetworkEntries, networkConfig.ServiceNetworkCIDR)
if err != nil {
return err
}
if len(clusterNetworkEntries) == 0 {
This condition could be moved before the call to common.ParseNetworkInfo().
}
master.networkInfo, err = common.ParseNetworkInfo(clusterNetworkEntries, networkConfig.ServiceNetworkCIDR)
if err != nil {
return err
}
if len(clusterNetworkEntries) == 0 {
panic("No ClusterNetworks set in networkConfig; should have been defaulted in if not configured")
Do we want to panic here? Returning an error will propagate to the SDN RunController(), which will fail as expected.
It can only happen if someone breaks the config-reading code. It's basically a "can't happen"; I just put the check+panic in to make it slightly clearer what was going on than if we just let them get an index-out-of-range panic when we access clusterNetworkEntries[0] below.
I wasn't suggesting that we allow dereferencing clusterNetworkEntries[0], only that we return a normal error instead of panicking, because it does the same job of stopping the SDN controller. I thought panic could be used when we are too deep in the code and just returning an error will not suffice (like stopping the controller).
> I thought panic could be used when we are too deep in the code and just returning an error will not suffice (like stopping the controller).

Hm... I tend to think of it more as error is for runtime errors ("someone wrote bad values into the config file") and panic is for devel-time errors ("this code assumes that default values will have been filled in, but they weren't"). The Go spec is not especially clear about this, although "Effective Go" notes that panic can be "a way to indicate that something impossible has happened", which is what I was going for here.
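The distinction drawn in this comment can be shown side by side. The sketch below is illustrative only: `parseConfiguredCIDR` and `primaryCIDR` are hypothetical helpers, not functions from the PR. The first returns an `error` for bad user-supplied input; the second panics when an invariant that earlier code was supposed to guarantee does not hold.

```go
package main

import (
	"fmt"
	"net"
)

// parseConfiguredCIDR (hypothetical) handles a runtime error: the value comes
// from a config file, so a bad value is reported to the caller as an error.
func parseConfiguredCIDR(s string) (*net.IPNet, error) {
	_, ipnet, err := net.ParseCIDR(s)
	if err != nil {
		return nil, fmt.Errorf("invalid cluster network CIDR %q: %v", s, err)
	}
	return ipnet, nil
}

// primaryCIDR (hypothetical) guards a development-time invariant: defaulting
// is assumed to have filled in at least one entry before this is called, so
// an empty slice means the code is broken, not the configuration.
func primaryCIDR(entries []string) string {
	if len(entries) == 0 {
		panic("no cluster networks set; defaulting should have filled them in")
	}
	return entries[0]
}

func main() {
	if _, err := parseConfiguredCIDR("not-a-cidr"); err != nil {
		fmt.Println("config error:", err)
	}
	fmt.Println("primary:", primaryCIDR([]string{"10.128.0.0/14"}))
}
```

The explicit check in `primaryCIDR` does not change the outcome compared with indexing an empty slice directly (both end in a panic); it only replaces a generic index-out-of-range message with one that names the broken assumption.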
Core logic looks good and fixes the issue.
3.6 nodes can't currently start up against a 3.7 master (e.g., during upgrade) because the old ClusterNetwork fields aren't set. (Some absent-minded reviewer had even noticed that this was broken and not yet fixed (#14558 (comment)) and then approved the PR anyway.)
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1502866
cc @eparis
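For readers unfamiliar with the problem, here is a rough sketch of what defaulting the legacy fields could look like. It is an assumption-laden illustration rather than the PR's actual change: the types are simplified hypothetical stand-ins, and `defaultLegacyFields` is an invented name. The idea is only that, when the legacy Network/HostSubnetLength fields are unset, they are copied from ClusterNetworks[0] so that older nodes reading those fields still find values.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the real API types.
type ClusterNetworkEntry struct {
	CIDR             string
	HostSubnetLength uint32
}

type ClusterNetwork struct {
	Network          string // legacy field read by older nodes
	HostSubnetLength uint32 // legacy field read by older nodes
	ClusterNetworks  []ClusterNetworkEntry
}

// defaultLegacyFields (hypothetical) copies the primary entry into the legacy
// fields when they are unset. Fields that are already set are left untouched
// so that validation can still detect a mismatch.
func defaultLegacyFields(cn *ClusterNetwork) {
	if len(cn.ClusterNetworks) == 0 {
		return // nothing to default from
	}
	if cn.Network == "" && cn.HostSubnetLength == 0 {
		cn.Network = cn.ClusterNetworks[0].CIDR
		cn.HostSubnetLength = cn.ClusterNetworks[0].HostSubnetLength
	}
}

func main() {
	cn := ClusterNetwork{
		ClusterNetworks: []ClusterNetworkEntry{{CIDR: "10.128.0.0/14", HostSubnetLength: 9}},
	}
	defaultLegacyFields(&cn)
	fmt.Println(cn.Network, cn.HostSubnetLength)
}
```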