AutoML WG and Kubeflow 1.5 release #2106
From the versioning issue we had, we know we are targeting 0.13 (#2098 (comment)). @kubeflow/wg-automl-leads, let's use this issue for further updates, new tags, progress on issues, etc.
Hi @kubeflow/wg-automl-leads, before the manifest testing on Wednesday, Feb 9th, the release team is planning on cutting another RC to use for the testing. Based on a previous communication, the release team will be using AutoML version 0.13rc0. If the AutoML WG has identified any issues since the feature freeze and would like to update the AutoML version before the manifest testing, let us know before Feb 9th. Thank you!
After syncing in today's AutoML we will keep on using the
Also another note, the @kubeflow/wg-automl-leads will update the kubeflow/katib e2e tests to be using the
I deployed Kubeflow from v1.5-branch, ran this example (https://github.com/kubeflow/katib/blob/master/examples/v1beta1/kubeflow-pipelines/kubeflow-e2e-mnist.ipynb), and found that the metric collector is not injected into the trial pod:
Does anyone have the same issue? I'm not sure if this is the right place to discuss or report this. BTW, the early-stop sample works well, and I do see that the metric collector container was injected:
Thanks for raising this, @yhwang! I also bumped into this when writing the e2e tests. The fix for this should be to use
We also discussed this in this week's AutoML meeting, and we'll expose the full list of annotations/changes users need to keep in mind for the new 1.4 version of the Training Operators.
Thanks @kimwnasptd, I tried
I haven't bumped into this; in my case, with a KinD 1.20 cluster, all the trials got to
Can you open a separate issue in the kubeflow/katib repo so that we can dig deeper into it? I'll also start using Prow for the e2e tests with AWS clusters in the manifests repo, and I'll give a heads up if I bump into this.
I forgot to update you on my latest Katib status. The problem seems to have been a TFJob from a previous run that got stuck in a weird state. After I removed that job, Katib works well. Thanks for the script and the hint.
@andreyvelich @johnugeorge @gaocegege I'm working on finalizing the manifests for the release, as we are getting closer to the release date of March 9th. Regarding the
@kimwnasptd, we will do it this week.
Just saw it's ready. Congrats on the release 🎉
Hey folks, any docs changes required as a result of this work? Please create an issue and mention it on this tracking issue.
This effort has been finalised.
@kubeflow/wg-automl-leads let's use this tracking issue to coordinate the integration of AutoML with the Kubeflow 1.5 release.
First off, a heads up that the feature freeze phase will start on Tuesday (25th January). Before then I'd like to have updated this repo with the manifests of the kubeflow/katib repo, in order to be able to cut the first RC tag in this repo.
So what I'd like to ask as a first step before the feature freeze is:
kubeflow/katib ?