Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map helm lifecycle hooks to ArgoCD pre/post/sync hooks #355

Closed
jessesuen opened this issue Jul 6, 2018 · 30 comments
Closed

Map helm lifecycle hooks to ArgoCD pre/post/sync hooks #355

jessesuen opened this issue Jul 6, 2018 · 30 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@jessesuen
Copy link
Member

The anchor feature in ksonnet v0.12 is the support of helm registries and chart installations. e.g.:

ks registry add helm-stable https://kubernetes-charts.storage.googleapis.com
ks pkg install helm-stable/redis
ks generate helm-stable-redis hc1

After we upgrade ArgoCD to use ksonnet v0.12, and do a ks show, we will start getting helm hook resources. Since ArgoCD is handling the apply, we will need to build in helm knowledge to skip over hooks during the sync (ksonnet is doing this as well during a ks apply).

Additionally helm supports the following hooks:
https://github.com/kubernetes/helm/blob/master/docs/charts_hooks.md

  • pre-install: Executes after templates are rendered, but before any resources are created in Kubernetes.
  • post-install: Executes after all resources are loaded into Kubernetes
  • pre-delete: Executes on a deletion request before any resources are deleted from Kubernetes.
  • post-delete: Executes on a deletion request after all of the release's resources have been deleted.
  • pre-upgrade: Executes on an upgrade request after templates are rendered, but before any resources are loaded into Kubernetes (e.g. before a Kubernetes apply operation).
  • post-upgrade: Executes on an upgrade after all resources have been upgraded.
  • pre-rollback: Executes on a rollback request after templates are rendered, but before any resources have been rolled back.
  • post-rollback: Executes on a rollback request after all resources have been modified.
  • crd-install: Adds CRD resources before any other checks are run. This is used only on CRD definitions that are used by other manifests in the chart.

As part of an ArgoCD sync, at the minimum, these resources should be ignored. It can be a feature to map the helm hooks as ArgoCD hooks. (e.g. pre-upgrade and post-upgrade could map to PreSync and PostSync).

@jessesuen jessesuen added this to the 0.7.0 milestone Jul 6, 2018
@jessesuen
Copy link
Member Author

Also need to exclude test hooks:
https://github.com/kubernetes/helm/blob/master/docs/chart_tests.md

@jessesuen jessesuen removed this from the 0.7.0 milestone Jul 18, 2018
@jessesuen jessesuen added the P1 label Jul 18, 2018
@jessesuen jessesuen changed the title Ksonnet v0.12 / Helm support Map helm lifecycle hooks to ArgoCD pre/post/sync hooks Jul 25, 2018
@jessesuen jessesuen added P2 and removed P1 labels Aug 16, 2018
@jessesuen jessesuen added P3 and removed P2 labels Sep 21, 2018
@jessesuen
Copy link
Member Author

jessesuen commented Sep 21, 2018

It turns out helm hooks are not very widely used, at least not in the public charts. Here are the charts I found using it:

$ grep -r 'helm.sh/hook' . | grep -v test-success | grep -v hook-weight | grep -v hook-delete-policy | grep -v crd-install
./stable/tensorflow-notebook/templates/secrets.yaml:    "helm.sh/hook": pre-install,pre-upgrade
./stable/etcd-operator/templates/etcd-cluster-crd.yaml:    "helm.sh/hook": "post-install"
./stable/etcd-operator/templates/backup-etcd-crd.yaml:    "helm.sh/hook": "post-install"
./stable/etcd-operator/templates/restore-etcd-crd.yaml:    "helm.sh/hook": "post-install"
./stable/prometheus-operator/templates/prometheus-operator/cleanup-crds.yaml:    "helm.sh/hook": pre-delete
./stable/kong/templates/migrations-on-upgrade.yaml:    helm.sh/hook: "pre-upgrade"
./stable/spark-history-server/templates/cleanup-job.yaml:    "helm.sh/hook": pre-delete
./stable/minio/templates/post-install-create-bucket-job.yaml:    "helm.sh/hook": post-install,post-upgrade
./stable/influxdb/templates/post-install-set-auth.yaml:    "helm.sh/hook": post-install
./stable/spinnaker/templates/hooks/install-using-hal.yaml:    "helm.sh/hook": "post-install,post-upgrade"
./stable/spinnaker/templates/hooks/expose-nodeports.yaml:    "helm.sh/hook": "post-install, post-upgrade"
./stable/spinnaker/templates/hooks/cleanup.yaml:    "helm.sh/hook": "pre-delete"
./stable/dex/templates/job-web-certs.yaml:    "helm.sh/hook": post-install
./stable/dex/templates/job-grpc-certs.yaml:    "helm.sh/hook": post-install
./stable/dex/templates/config-openssl.yaml:    "helm.sh/hook": post-install
./stable/sumologic-fluentd/templates/secrets.yaml:    "helm.sh/hook": pre-install,pre-upgrade
./stable/tensorflow-serving/templates/pvc.yaml:    "helm.sh/hook": pre-install
./stable/ark/templates/hook-deploy.yaml:    "helm.sh/hook": post-install
./stable/ark/templates/hook-delete.yaml:    "helm.sh/hook": pre-delete
./stable/mission-control/templates/hooks/create-user.yaml:    "helm.sh/hook": post-install
./stable/sentry/templates/hooks/db-init.job.yaml:    "helm.sh/hook": post-install
./stable/sentry/templates/hooks/user-create.job.yaml:    "helm.sh/hook": post-install
./stable/sumokube/templates/secrets.yaml:    "helm.sh/hook": pre-install,pre-upgrade
./stable/sumokube/templates/config.yaml:    "helm.sh/hook": pre-install,pre-upgrade
./stable/traefik/templates/storeconfig-job.yaml:    "helm.sh/hook": post-install
./incubator/zookeeper/templates/job-chroots.yaml:    "helm.sh/hook": post-install,post-upgrade
./incubator/sparkoperator/templates/webhook-cleanup-job.yaml:    "helm.sh/hook": pre-delete
./incubator/sparkoperator/templates/webhook-init-job.yaml:    "helm.sh/hook": post-install
./incubator/common/README.md:    "helm.sh/hook": "pre-install"
./incubator/common/README.md:"helm.sh/hook": "pre-install,post-install"
./incubator/common/templates/_metadata_annotations.tpl:"helm.sh/hook": {{printf "%s" . | quote}}

Deprioritizing this.

@alexmt
Copy link
Collaborator

alexmt commented Nov 16, 2018

The 'crd-install' install hook is getting popular. I guess because this is very convenient way to handle CRDs in helm. We should support this one hook at least.

grep -r 'crd-install' .                                                                                                                                                             (docker-for-desktop/default)

./stable/prometheus-operator/templates/prometheus-operator/crd-alertmanager.yaml:    "helm.sh/hook": crd-install
./stable/prometheus-operator/templates/prometheus-operator/crd-servicemonitor.yaml:    "helm.sh/hook": crd-install
./stable/prometheus-operator/templates/prometheus-operator/crd-prometheus.yaml:    "helm.sh/hook": crd-install
./stable/prometheus-operator/templates/prometheus-operator/crd-prometheusrules.yaml:    "helm.sh/hook": crd-install
./stable/cert-manager/templates/clusterissuer-crd.yaml:    "helm.sh/hook": crd-install
./stable/cert-manager/templates/certificate-crd.yaml:    "helm.sh/hook": crd-install
./stable/cert-manager/templates/issuer-crd.yaml:    "helm.sh/hook": crd-install
./stable/jaeger-operator/templates/crd.yaml:    "helm.sh/hook": crd-install

#792

@kuznero
Copy link

kuznero commented Nov 19, 2018

What would be the workaround to this?

@alexmt
Copy link
Collaborator

alexmt commented Nov 19, 2018

The 'crd-install' is solved by 275b9e1 . Only workaround right now is to remove 'crd-install' manually.

@kuznero is any other helm hook important for you ?

@kuznero
Copy link

kuznero commented Nov 19, 2018

@alexmt not that I know of for now.

@asvasyanin
Copy link

pre-install hooks in combination with hook-weight are very useful, for example in case when we need to create pvc and then do something with persistent volume during installation (but not when we upgrading app). Same could be achieved with init containers and scripting, but it's more complicated, especially when hooks must run in certain order

@bstascavage
Copy link

This is also important for the influxdb helm chart. It uses a post-install hook to set the admin user for http auth

@alexec alexec added the 3 label Mar 21, 2019
@alexec alexec modified the milestones: v0.13, v0.14 and beyond Mar 21, 2019
@alexec alexec removed this from the Beyond v1.0 milestone Apr 5, 2019
@jessesuen
Copy link
Member Author

The difficulty is that, none of these hooks (pre-upgrade, post-upgrade, pre-install, post-install, pre-delete, post-delete) actually map cleanly to any concepts in Argo CD. Even pre-upgrade does not map cleanly to PreSync, because pre-upgrade is not executed during the initial deploy.

In order to support helm hooks properly, we would likely need to introduce granular Application phases/state. All of these imperative executions of hooks are somewhat antithetical to gitops style of deployments where desired state is stored in git, which makes the implementation of this a bit involved.

@alexec alexec added the enhancement New feature or request label Apr 11, 2019
@alexec
Copy link
Contributor

alexec commented Apr 11, 2019

Could we document work-arounds?

@jessesuen
Copy link
Member Author

Here's where I stand on this, after studying the use of these hooks in public helm charts.

For a first pass at this, I think the following mapping might solve 70% of peoples use cases:

Helm Annotation Argo CD Annotation
helm.sh/hook-weight argocd.argoproj.io/sync-wave
helm.sh/hook: pre-upgrade argocd.argoproj.io/hook: PreSync
helm.sh/hook: pre-install argocd.argoproj.io/hook: PreSync
helm.sh/hook: post-install argocd.argoproj.io/hook: PostSync
helm.sh/hook: post-upgrade argocd.argoproj.io/hook: PostSync

It's not a 100% clean/accurate mapping, but based on what we observe in public helm charts, it nearly matches the intent of these chart hooks. Even though the mapping is not clean, the key problem it avoids, is introducing application life-cycle state into Argo CD.

@stormmore
Copy link

Why do you have to map then rather than actually implementing them as they were designed? As you have pointed out the mapping isn't clean or accurate and thus isn't really what the ask is.

@alexec
Copy link
Contributor

alexec commented Jul 26, 2019

I've broken this ticket into 5 smaller tickets to make it easier to deliver incremental value. We can close this one when they are 4/5 are complete. #2038 is for some other time.

@jessesuen
Copy link
Member Author

Why do you have to map then rather than actually implementing them as they were designed? As you have pointed out the mapping isn't clean or accurate and thus isn't really what the ask is.

As I mentioned, the only way to support the exact meanings of helm hooks, we would need to introduce lifecycle state into an application. In other words, introduce/maintain a state machine of where the application is in its lifecycle. This state machine could easily become inaccurate (e.g. resources were existing and the argocd app was created to manage those resources after the fact).

We are trying to avoid introducing lifecycle state so the middle ground here is to honor a subset of the common concepts between helm/Argo CD.

@alexec alexec self-assigned this Aug 1, 2019
@alexec alexec closed this as completed in bad2e91 Aug 12, 2019
@jgoeres
Copy link

jgoeres commented Jun 12, 2020

I ran into this when trying to deploy the sparkoperator via ArgoCD. The installation of the webhook cleanup job installed as part of the sparkoperator got stuck and complains that it cannot be installed because it is unable to reach a service account that is also created by the helm chart.
So if I get the whole discussion here correctly, this is not going to be fixed?

@bgvladedivac
Copy link

hi guys, the spark operator exposes a few hooks with "helm.sh/hook": pre-delete annotation. I understand this is not supported by ArgoCD => https://argoproj.github.io/argo-cd/user-guide/helm/#helm-hooks, but the one of those hooks runs as a first resource to be created, essentially failing relying on an already

How does argo treats those annotation, does it change it to a default 'install' now ?

@abdennour
Copy link

abdennour commented Jan 23, 2021

@bgvladedivac Unfortunately, ArgoCD uses Helm just as YAML generator ( e.g. helm template | kubectl -f-) .
It's something similar when you have Car Mercedes Benz, but you are using it like a regular Car.

@pascalwhoop
Copy link

Just to add to the current situation, this means you cannot deploy airflow using helm with the official helm chart as of today
https://github.com/apache/airflow/search?q=helm.sh%2Fhook

Airflow gets stuck waiting for a postgressDB migration which never occurs because that relies on the hooks

@dnskr
Copy link
Contributor

dnskr commented Jun 6, 2021

@pascalwhoop Hi! I found the issue with Airflow helm chart some time ago. Could you please test my solution and may be leave some comments in the PR?

@thesuperzapper
Copy link

@pascalwhoop @dnskr for reference, the user-community airflow chart dose not have issues with argo-cd.

This is because it uses a Deployment rather than a Job to manage db-migrations (with the option of changing back to a post-install Job using the airflow.dbMigrations.runAsJob value).

@yehoshuadimarsky
Copy link

I have the exact same problem, I'm trying to use Argo to deploy the official Airflow Helm chart, and have the same issue where the containers wait forever for the database migration that never happens. Is there any solution?

@yehoshuadimarsky
Copy link

I have the exact same problem, I'm trying to use Argo to deploy the official Airflow Helm chart, and have the same issue where the containers wait forever for the database migration that never happens. Is there any solution?

I found the solutions in these PRs apache/airflow#16291 and apache/airflow#16331, they work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests