Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(knative, kfserving, cert-manager) Build working solution for kfserving integration #212

Merged

Conversation

zijianjoy
Copy link
Collaborator

@zijianjoy zijianjoy commented Apr 18, 2021

Fix: #209
Fix: #222
Fix: #176

Partial: #221

Using the following versions:

KFserving 0.5.1
KNative 0.22.0
Cert-manager 1.3

This is the only combination which is validated to be working for KFServing integration.

KFServing team recommended KNative 0.17.4, but the kubeflow/manifests didn't update the manifest for this version, plus that KNative official doc has already archived this version. I tried to install this version but installation doesn't work for inferenceService deployment.

David has a PR for cert-manager upgrade kubeflow/manifests#1820, but it needs some adjustment for successful deployment.

Tried to adopt KNative installation in kustomize build rule but kustomize complains about existing registered ID for serving-core.yaml file. Currently we use command line for installing kNative.

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Looks good to me!
You can decide when to unhold
/hold

apply-knative:
# Common Knative
# It has error when trying to patch the resource downloaded from knative because it doesn't match kustomize requirement.
# Original release of knative manifest can not be patched by kustomize because of `already registered id`. For now we run kubectl directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create an issue to log this problem with a bit more details?

kubernetes-sigs/kustomize#1251

My understanding is that there are duplicate resources between several kustomize folders. Are we including them correctly? If yes, maybe we should file an upstream issue?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 issues that I came across with regards to the knative manifests. The first is a bug in kustomize causing YAML anchors to not work properly. The second is that the serving-core.yaml contains all of the CRDs that are in the serving-crds.yaml file. As such, you can't use both when using Kustomize, but you might run into a chicken & egg so you need to apply the manifests twice if your not using something that continuously syncs like Argo CD.

I expanded the manifests to remove the YAML anchors and created the following kustomize folder for the Argo CD installation: argoflow/argoflow@ce115f0.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Yuan and David for the suggestion! Indeed if I include both serving-crd.yaml and serving-core.yaml, it will fail with duplicated definition. However, I encountered another issue when using only serving-core.yaml. I created an issue in #217.

For the manifest expand which David has made, do you think it makes sense to upload this expanded serving-core.yaml to kubeflow/manifests repo? Otherwise every distribution needs to perform such change to a knative source file.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe a decision was made to not update KNative (or cert-manager for that matter) in the manifests repo before the 1.3 release. So I believe the route forward is to add these manifests in the gcp-blueprints repo directly for now, and try and get the fixes upstreamed after the release.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for clarifying, let's merge the fix in kubeflow/manifests after the release. We will do the patch for now on gcp-blueprints.

@davidspek
Copy link

davidspek commented Apr 18, 2021

I've also updated the Argo CD installation to knative 0.22.0. The problem with the existing registered ID, is that the knative-core.yaml contains all the entries of the knative-crds.yaml. In the following commit you can see a working kustomize folder for knative 0.22.0. argoflow/argoflow@ce115f0

Regarding cert-manager, I'm not sure what trouble you were experiencing with the installation. But I'm around on Slack to debug this as it worked fine with plane kustomize for me. KFServing apparently requires cert-manager 0.12+ so I think it is important to fix this.

@davidspek
Copy link

@zijianjoy Thought I’d link to my comment about the problem you were having with my cert-manager PR in case you missed it. kubeflow/manifests#1820 (comment)

@connorlwilkes
Copy link

connorlwilkes commented Apr 20, 2021

Just reading this through and was wondering why is there a difference in knative version here vs in manifests repository?

@zijianjoy
Copy link
Collaborator Author

Just reading this through and was wondering why is there a difference in knative version here vs in manifests repository?

@connorlwilkes Upstream manifest has KNative v0.14.3: https://github.com/kubeflow/manifests/blob/master/common/knative/knative-serving-crds/base/crd.yaml#L7. But KFserving requires KNative v0.17.4 or above: https://github.com/kubeflow/kfserving#prerequisites

@zijianjoy
Copy link
Collaborator Author

zijianjoy commented Apr 20, 2021

@zijianjoy Thought I’d link to my comment about the problem you were having with my cert-manager PR in case you missed it. kubeflow/manifests#1820 (comment)

@davidspek Thank you for the link! I make the changes below to the cert-manager manifest. It allows me to successfully deploy cert-manager and kubeflow-issuer:

  1. Move cluster-issuer.yaml to cert-manager/kubeflow-issuer/
  2. Update cert-manager/kubeflow-issuer/kustomization.yaml to add resources cluster-issuer.yaml
  3. Update cert-manager/overlays/self-signed/kustomization.yaml to use resources cert-manager/kubeflow-issue instead of cluster-issuer.yaml.
  4. More detail in this PR commit with this path: kubeflow/common/cert-manager/cert-manager-1-3

When applying these resource, I apply the cert-manager/base first, then apply the cert-manager/kubeflow-issuer. It avoids kustomize to apply the following labels to all cert-manager deployments:

commonLabels:
  kustomize.component: cert-manager
  app.kubernetes.io/component: cert-manager
  app.kubernetes.io/name: cert-manager

Otherwise, cert-manager service on cluster will be mapped to 3 deployments: cert-manager, cert-manager-cainjector, cert-manager-webhook, which is not right. cert-manager service should point to cert-manager deployment only. But overlays/self-signed enforced all these labels on base resource.

@zijianjoy zijianjoy requested a review from Bobgy April 21, 2021 05:54
Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!
/lgtm
/approve

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, zijianjoy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zijianjoy
Copy link
Collaborator Author

Thank you so much Yuan for your help!

/unhold

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants