Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to specify default tolerations via the buildconfig defaulter #17955

Merged
merged 1 commit into from
Jan 6, 2018
Merged

Ability to specify default tolerations via the buildconfig defaulter #17955

merged 1 commit into from
Jan 6, 2018

Conversation

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 2, 2018
@coreydaley coreydaley changed the title Ability to specify default tolerations via the buildconfig defaulter [WIP] Ability to specify default tolerations via the buildconfig defaulter Jan 2, 2018
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 2, 2018
@coreydaley
Copy link
Member Author

@gabemontero
Copy link
Contributor

ah - just saw the WIP tag @coreydaley - only looked cause the bot assigned me and I was on auto pilot

@gabemontero gabemontero removed their request for review January 2, 2018 14:27
@coreydaley coreydaley changed the title [WIP] Ability to specify default tolerations via the buildconfig defaulter Ability to specify default tolerations via the buildconfig defaulter Jan 2, 2018
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 2, 2018
@coreydaley
Copy link
Member Author

@openshift/sig-developer-experience ptal


// tolerations is a list of default Tolerations that are applied to build pods
// so that they can be placed on build-specific nodes
Tolerations []kapi.Toleration `json:"tolerations,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k should this whole file not have moved into the api repo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k should this whole file not have moved into the api repo?

The admission ones need to move. Found it late last month. Is this still configured via admission? Try to fix that in 3.9, so we can fix the config in 3.10.

@coreydaley
Copy link
Member Author

@openshift/api-review ptal


// tolerations is a list of default Tolerations that are applied to build pods
// so that they can be placed on build-specific nodes
Tolerations []kapiv1.Toleration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal api objects should not be using versioned k8s api types.

@@ -100,6 +100,12 @@ func (b BuildDefaults) applyPodDefaults(pod *v1.Pod) {
}
}
}

if pod.Spec.Tolerations == nil && len(b.config.Tolerations) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if pod.Spec.Tolerations is nil, you should be creating a new array and populating it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(actually I think w/ arrays you can just always append, you don't need to explicitly instantiate the empty array first)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but regardless you should be adding the default tolerations to the array. the defaulting should be done on a key by key basis. Same way the annotations are being done currently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nil and zero length are equivalent here, and you should check zero length


if pod.Spec.Tolerations == nil && len(b.config.Tolerations) != 0 {
for _, toleration := range b.config.Tolerations {
pod.Spec.Tolerations = append(pod.Spec.Tolerations, toleration)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're calling this a defaulter, you should make sure the config.Toleration.Key does not exist in pod.Spec.Tolerations before appending it. (there's no reason to expect it to today, but that could change someday).

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 3, 2018
@coreydaley
Copy link
Member Author

@bparees feedback addressed, ptal

@@ -100,6 +103,14 @@ func (b BuildDefaults) applyPodDefaults(pod *v1.Pod) {
}
}
}

// Apply default Tolerations
if len(pod.Spec.Tolerations) == 0 && len(b.config.Tolerations) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we decided you should append these even if pod.Spec.Tolerations already contains some tolerations?

tol := v1.Toleration{}

if err := kapiv1.Convert_core_Toleration_To_v1_Toleration(&t, &tol, nil); err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can't call panic, that would crash the api server. util.HandleErr to log it, and ultimately you need to bubble this error up to ApplyDefaults so it can be returned from there.

@coreydaley
Copy link
Member Author

@bparees comments addressed

tol := v1.Toleration{}

if err := kapiv1.Convert_core_Toleration_To_v1_Toleration(&t, &tol, nil); err != nil {
panic(err)
utilruntime.HandleError(fmt.Errorf("Unable to convert core.Toleration to v1.Toleration: %v", err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should return the same error you're logging.

@@ -218,18 +226,20 @@ func addDefaultAnnotation(k, v string, annotations map[string]string) {
}
}

func addDefaultToleration(t kapi.Toleration, tolerations *[]v1.Toleration) {
func addDefaultToleration(t kapi.Toleration, tolerations *[]v1.Toleration) error {
tol := v1.Toleration{}

if err := kapiv1.Convert_core_Toleration_To_v1_Toleration(&t, &tol, nil); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k sanity check for the right way to do this.

@coreydaley
Copy link
Member Author

/retest

1 similar comment
@coreydaley
Copy link
Member Author

/retest

@bparees
Copy link
Contributor

bparees commented Jan 4, 2018

@openshift/api-review bump.

@@ -43,6 +43,10 @@ type BuildDefaultsConfig struct {

// resources defines resource requirements to execute the build.
Resources kapi.ResourceRequirements `json:"resources,omitempty"`

// tolerations is a list of default Tolerations that are applied to build pods
// so that they can be placed on build-specific nodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs good doc about what happens in the following cases:

  • no tolerations on the pod, tolerations set here (assume these are all added verbatim)
  • unrelated tolerations on the pod already, tolerations set here (assume these are appended)
  • some of these toleration keys exist on the pod already, but with different values/effects (do these override or preserve what is already set on the pod?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt As of now, the only way that tolerations are set on build pods is through the defaulter, so there would be no other tolerations. If/when there comes a time that users (or whoever) can add tolerations, then all of these things should be considered and addressed, but for now it might be overkill?
@bparees thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now it makes sense to state that these are appended to any existing tolerations on the build pod.

if there are multiple tolerations with the same key but different values, i would expect us to forgo adding the default toleration entry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note that even the key is optional in a Toleration, so it's quite a mess deciding when two tolerations should replace each other vs both be added)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bparees updated in next push

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bparees tolerations seem to be processed as a combination of the key/value/effect exactly equaling the taint, or the key/effect existing, and the value being blank, so it seems like skipping a toleration based on the key matching would not be the best option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bparees updated description based on your feedback

@@ -46,8 +46,7 @@ type BuildDefaultsConfig struct {
// resources defines resource requirements to execute the build.
Resources kapi.ResourceRequirements

// tolerations is a list of default Tolerations that are applied to build pods
// so that they can be placed on build-specific nodes
// tolerations is a list of default Tolerations that are appended to build pods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are appended to the build pod's set of tolerations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in current push

@coreydaley
Copy link
Member Author

flake #17995

@coreydaley
Copy link
Member Author

/retest

2 similar comments
@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

@openshift/api-review ptal at the api changes and let me know if they look ok

@bparees
Copy link
Contributor

bparees commented Jan 5, 2018

It does raise the question, why is the node selector even in the defaults at all?

it made more sense when it actually operated as an admission controller since other things might have put nodeselector values on the pod before the defaulter+overrider saw the pod object.

And given that nodeselector does exist as both an override and a default, that would seem to further the argument that the Toleration annotation could reasonably be placed in either (it could also be placed in both, if you really want it, but it would be pointless).

@liggitt
Copy link
Contributor

liggitt commented Jan 5, 2018

And given that nodeselector does exist as both an override and a default, that would seem to further the argument that the Toleration annotation could reasonably be placed in either (it could also be placed in both, if you really want it, but it would be pointless).

Thinking about how a user would have to configure this to get build isolation, it's just odd to make them configure both defaults (containing tolerations) and overrides (containing selectors).

@bparees
Copy link
Contributor

bparees commented Jan 5, 2018

@openshift/api-review isn't going to approve this w/o moving the Tolerations value into the overrides config (which means changing the behavior such that the override Tolerations array completely replaces the existing Tolerations array on the pod, if any). @coreydaley please rework the PR accordingly so we can land this for devcut.

@coreydaley
Copy link
Member Author

@bparees ptal
migrated the code over to overrides

@coreydaley
Copy link
Member Author

/retest

for _, toleration := range b.config.Tolerations {
t := v1.Toleration{}

if err := kapiv1.Convert_core_Toleration_To_v1_Toleration(&toleration, &t, nil); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k is this an ok way to convert from the internal toleration object to external?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

for i, toleration := range tolerations {
tol := v1.Toleration{}
if err := kapiv1.Convert_core_Toleration_To_v1_Toleration(&toleration, &tol, nil); err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you "can" panic here, it's not going to make it easier to investigate the failure. just fail the test w/ the error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@bparees
Copy link
Contributor

bparees commented Jan 5, 2018

@openshift/api-review location of the tolerations field has been moved. please review.

@liggitt
Copy link
Contributor

liggitt commented Jan 5, 2018

config api change lgtm

@bparees
Copy link
Contributor

bparees commented Jan 5, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 5, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, coreydaley

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 5, 2018
@coreydaley
Copy link
Member Author

/retest

@coreydaley coreydaley removed the request for review from abhgupta January 5, 2018 21:53
@coreydaley
Copy link
Member Author

/retest

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 6, 2018

@coreydaley: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/extended_builds bf3275f link /test extended_builds

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit c8c51ba into openshift:master Jan 6, 2018
mjudeikis pushed a commit to mjudeikis/origin that referenced this pull request Jan 7, 2018
…t_tolerations_via_buildconfig_defaulter

Automatic merge from submit-queue.

Ability to specify default tolerations via the buildconfig defaulter
openshift-merge-robot added a commit to openshift/openshift-ansible that referenced this pull request Jan 11, 2018
…ons_via_buildconfig_defaulter

Automatic merge from submit-queue.

Ability to specify default tolerations via the buildconfig defaulter

Trello: https://trello.com/c/LNxlMjjU/1435-5-ability-to-specify-default-tolerations-via-the-buildconfig-defaulter-builds

Dependent on:
openshift/origin#17955
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. needs-api-review size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants