Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Introduce k8s-service-info between katib-controller and katib-db-manager #185

Merged
merged 18 commits into from
Jun 11, 2024

Conversation

orfeas-k
Copy link
Contributor

@orfeas-k orfeas-k commented May 31, 2024

This PR introduces a k8s-service-info between katib-controller and katib-db-manager. This addresses the issue below. Read the summary comment for a quick overview of the issue.

Note that this means that katib-controller will require a relation to katib-db-manager in order to start its pebble service, which we will need to introduce in the latest/edge bundle.

Testing

  1. Deploy AKS cluster following the guide. You can always check the deploy-to-aks workflow for the commands the CI follows. Note that Juju 3.5 should be used but right now is broken thus 3.4 is probably a better option (didn't try this though myself). Another option could be using channel 3/stable which points to 3.5.0 which is not broken (apart from the issue with kubeflow-dashboard which you can workaround by refreshing to upstream image).
  2. Update the latest/edge bundle.yaml as follows
@@ -64,11 +64,12 @@ applications:
     _github_repo_branch: main
   katib-controller:
     charm: katib-controller
-    channel: latest/edge
+    channel: latest/edge/pr-185
     scale: 1
     trust: true
     _github_repo_name: katib-operators
     _github_repo_branch: main
+    base: [email protected]
   katib-db:
     charm: mysql-k8s
     channel: 8.0/edge
@@ -79,11 +80,12 @@ applications:
     _github_dependency_repo_branch: main
   katib-db-manager:
     charm: katib-db-manager
-    channel: latest/edge
+    channel: latest/edge/pr-185
     scale: 1
     trust: true
     _github_repo_name: katib-operators
     _github_repo_branch: main
+    base: [email protected]
   katib-ui:
     charm: katib-ui
     channel: latest/edge
@@ -297,6 +299,7 @@ relations:
   - [istio-pilot:istio-pilot, istio-ingressgateway:istio-pilot]
   - [istio-pilot:ingress, tensorboards-web-app:ingress]
   - [istio-pilot:gateway-info, tensorboard-controller:gateway-info]
+  - [katib-controller, katib-db-manager]
   - [katib-db-manager:relational-db, katib-db:database]
   - [kfp-api:relational-db, kfp-db:database]
   - [kfp-api:kfp-api, kfp-persistence:kfp-api]
  1. Deploy kubeflow using the tox environment tests are using as well
tox -e test_bundle_deployment-latest -- --model kubeflow --keep-models -vv -s

This uses the edited bundle.yaml

  1. Create a notebook and run katib UAT

Closes canonical/bundle-kubeflow#893

@orfeas-k orfeas-k force-pushed the kf-5649-patch-katib-env-variables branch from fe159ba to 205a140 Compare June 4, 2024 12:49
@orfeas-k orfeas-k changed the title fix: Explicitly set KATIB_DB_MANAGER_SERVICE_PORT in katib-controller fix: Introduce k8s-service-info between katib-controller and katib-db-manager Jun 4, 2024
@orfeas-k orfeas-k marked this pull request as ready for review June 5, 2024 10:03
@orfeas-k orfeas-k requested a review from a team as a code owner June 5, 2024 10:03
Copy link
Contributor

@NohaIhab NohaIhab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @orfeas-k, left some comments

this k8s_service_info_requirer_component module looks reusable, I think it can live in chisme, WDYT?

charms/katib-controller/requirements.in Show resolved Hide resolved
charms/katib-controller/src/charm.py Outdated Show resolved Hide resolved
charms/katib-controller/tests/integration/test_charm.py Outdated Show resolved Hide resolved
charms/katib-controller/tests/integration/test_charm.py Outdated Show resolved Hide resolved
@orfeas-k orfeas-k changed the base branch from main to orfeas-k-dev-katib-controller-db-manager-relation June 10, 2024 09:31
@orfeas-k
Copy link
Contributor Author

Replied to all comments. Regarding k8s_service_info_requirer_component, it's definitely a resuable component since I copied that from mlmd/envoy but we have the following limitation canonical/charmed-kubeflow-chisme#94 which we would have to solve first and then see if it makes sense to add this to chisme.

NohaIhab
NohaIhab previously approved these changes Jun 11, 2024
Copy link
Contributor

@NohaIhab NohaIhab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx @orfeas-k !

@orfeas-k orfeas-k changed the base branch from orfeas-k-dev-katib-controller-db-manager-relation to main June 11, 2024 09:14
@orfeas-k orfeas-k dismissed NohaIhab’s stale review June 11, 2024 09:14

The base branch was changed.

@NohaIhab
Copy link
Contributor

On second thought discussing with @orfeas-k , we need to merge this PR to main before sending a follow-up PR to modify the tests to not use the local charm. This is because we need the new version of katib-db-manager to be released to latest/edge before we can do the change in the tests.

@orfeas-k orfeas-k merged commit c1a553e into main Jun 11, 2024
15 checks passed
@orfeas-k orfeas-k deleted the kf-5649-patch-katib-env-variables branch June 11, 2024 09:20
orfeas-k added a commit that referenced this pull request Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ci(aks): Katib UAT fail on AKS
2 participants