Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/waitallprocesses config #1394

Merged

Conversation

robbertvdg
Copy link
Contributor

What this PR does / why we need it:
Adds the waitAllProcesses option to the katib-config metricsCollector spec. It is made optional, defaulting to true.

Which issue(s) this PR fixes
See #1383 for discussion

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

Available waitAllProcesses option to katib-config's metrics-collector-sidecar spec.

@aws-kf-ci-bot
Copy link
Contributor

Hi @robbertvdg. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@robbertvdg
Copy link
Contributor Author

@andreyvelich can you take a look at this?

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this @robbertvdg!
I left few comments.
/ok-to-test
/cc @gaocegege @johnugeorge

@@ -107,7 +107,7 @@ var (
metricFilters = flag.String("f", "", "Metric filters")
pollInterval = flag.Duration("p", common.DefaultPollInterval, "Poll interval between running processes check")
timeout = flag.Duration("timeout", common.DefaultTimeout, "Timeout before invoke error during running processes check")
waitAll = flag.Bool("w", common.DefaultWaitAll, "Whether wait for all other main process of container exiting")
waitAllProcesses = flag.String("w", common.DefaultWaitAllProcesses, "Whether wait for all other main process of container exiting")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to keep this flag String, or we can use Bool?
I assume this flag can be only true or false.

Copy link
Contributor Author

@robbertvdg robbertvdg Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had problems with parsing booleans when a user has not specified the field in the katib-config. Do you know of a good way to do it?
I could leave it string in the katib-config and parse it to bool in GetMetricsCollectorConfigData, would that be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbertvdg It seems that empty bool in Go is false. So when user doesn't specify any values in config we send "false" to the metrics collector.
Maybe we can define MetricsCollectorConfig like this:

type MetricsCollectorConfig struct {
	Image            string                      `json:"image"`
	ImagePullPolicy  corev1.PullPolicy           `json:"imagePullPolicy"`
	Resource         corev1.ResourceRequirements `json:"resources"`
	WaitAllProcesses *bool                      `json:"waitAllProcesses"`
}

In that case, you can set -w flag if WaitAllProcesses != nil. Otherwise the default value is set.

In Katib config I was able to define bool value like this:

 "StdOut": {
        "image": "docker.io/andreyvelichkevich/file-metrics-collector",
        "imagePullPolicy": "Always",
        "waitAllProcesses": false
      },

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem lies with the v1.Container.args which is string[]. Therefore every container argument has to be a string, which I think is not compatible with the flag.Bool() argument (I tested it, doesn't work).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbertvdg Got it.
In that case can we keep WaitAllProcesses *bool in MetricsCollectorConfig, convert it to string here:

args = append(args, "-w", metricsCollectorConfigData.WaitAllProcesses)

Then, use the String type like you did in Metrics Collector and convert it to Bool after.

In that case, we can avoid situation when user specify wrong values in Katib config.

What do you think @robbertvdg @gaocegege ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreyvelich those were my thoughts as well, will update PR soon.

@@ -353,10 +353,16 @@ func main() {
go printMetricsFile(*metricsFilePath)
}

waitAll, err := strconv.ParseBool(*waitAllProcesses)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep it Bool, we can skip conversion.

Image string `json:"image"`
ImagePullPolicy corev1.PullPolicy `json:"imagePullPolicy"`
Resource corev1.ResourceRequirements `json:"resources"`
WaitAllProcesses string `json:"waitAllProcesses"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, can we keep it bool ?

Image string `json:"image"`
ImagePullPolicy corev1.PullPolicy `json:"imagePullPolicy"`
Resource corev1.ResourceRequirements `json:"resources"`
Image string `json:"image"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add omitempty for the fieds?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add it on all parameters but Image.

@robbertvdg
Copy link
Contributor Author

@andreyvelich @gaocegege updated PR, can you do a final check?

Image string `json:"image"`
ImagePullPolicy corev1.PullPolicy `json:"imagePullPolicy,omitempty"`
Resource corev1.ResourceRequirements `json:"resources,omitempty"`
WaitAllProcesses *bool `json:"waitAllProcesses,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add omitempty to the SuggestionConfig also please ? It should be in all parameters but Image.

@@ -38,6 +38,7 @@ def parse_options():
logger.addHandler(handler)
logger.propagate = False
opt = parse_options()
wait_all_processes = opt.wait_all_processes.lower() == "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this compare (opt.wait_all_processes.lower() == "true") necessary ?
I assume wait_all_processes can be only "true" or "false" since WaitAllProcesses *bool in Katib config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the python way of converting string to bool ;) should be fine since the string can only have two possibilities

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution @robbertvdg!
/lgtm
/cc @gaocegege

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich, robbertvdg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@andreyvelich
Copy link
Member

andreyvelich commented Nov 24, 2020

/hold
Hold for the @gaocegege review.

@gaocegege
Copy link
Member

/hold cancel
/lgtm

@gaocegege
Copy link
Member

Thanks for your contribution! 🎉 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants