Support Argocd #763
Which ArgoCD version are you running? https://argo-cd.readthedocs.io/en/stable/user-guide/resource_hooks/#hook-deletion-policies suggests that you may need to check with ArgoCD, since:

It's not entirely clear if they effectively honor other Helm hooks. We don't place any delete hooks on the init job because it only runs at install, whereas the others spawn on every upgrade. argoproj/argo-cd#355 (comment) suggests that ArgoCD should mimic that behavior even though they use their own hook system. Even if they do ignore that, however, the default behavior described by the docs should handle the problem, since ArgoCD should just delete the old hook job before creating the new one, even if the delete policy doesn't specify it.

@fabry00 can you clarify what you did and why that required modifying other resources? Intuitively, even if the above default functionality is broken, we'd work around it by adding the delete policy annotation.
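For context, the delete hooks being discussed are the standard Helm annotations on the upgrade migration Jobs; a sketch (the Job name is illustrative, not the chart's literal output):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-release-kong-pre-upgrade-migrations   # illustrative name
  annotations:
    # Run this Job as a Helm pre-upgrade hook...
    helm.sh/hook: pre-upgrade
    # ...and delete the previous hook resource before creating a new one,
    # which avoids collisions with the immutable Job left by the last upgrade.
    helm.sh/hook-delete-policy: before-hook-creation
```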
Dear @rainest, I am using ArgoCD 2.6. What I noticed is that there is a slightly different behavior between Argo's hooks and Helm chart hooks. If I add the hooks, then the job runs before everything else (even with HookSucceeded) and then it enters an error state because all of its dependencies cannot be found. In case it helps, I'll try to reproduce it again. Cheers |
Yeah--seeing the actual failure state output from the Kubernetes API will help me understand exactly how it's breaking, so if you can reproduce problems with that annotation in place and post the output, that'll help us determine what changes the chart needs. Aside from that, we do have a hidden toggle you can set. |
Hey @rainest, I have stumbled upon this issue recently when deploying the kong helm chart with the postgres DB enabled via ArgoCD v2.6.6, and I can help with the needed investigation. There are 3 main issues when deploying the kong helm chart via ArgoCD:
Otherwise the serviceAccount is not created in time and the migrations are not able to run successfully (ArgoCD always syncs everything that has hook annotations first and then does the rest of the manifests; the same case as in the 1st point, just for missing resources rather than the order in which they are created). The annotation above fixes the missing service account issue for all of the migration jobs and forces the serviceAccount to be created right on time. As you can see, this fix spills over to all resources used by the migration jobs that have helm hook annotations attached, for example a Secret which the migration jobs need to access the DB. If any resource is used by the migration jobs, it also needs the same annotation.
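To illustrate the spillover: every resource the migration Jobs consume needs the same ArgoCD hook annotation, roughly like this (resource names and the Secret contents are illustrative assumptions):

```yaml
# Sketch: dependencies of the migration Jobs must also be PreSync hooks,
# otherwise ArgoCD creates them only after the hook phase has already run.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-release-kong            # illustrative
  annotations:
    argocd.argoproj.io/hook: PreSync
---
apiVersion: v1
kind: Secret
metadata:
  name: kong-db-credentials        # illustrative; holds the migration DB password
  annotations:
    argocd.argoproj.io/hook: PreSync
type: Opaque
stringData:
  password: example
```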
Which brings us to the issue: there is no install/upgrade distinction in ArgoCD. As you are probably aware, ArgoCD uses `helm template` to render the manifests,
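The exact commands were elided in the original comment, but the comparison was presumably along these lines (a sketch; the release name is illustrative, and `--is-upgrade` is the `helm template` flag that renders with `.Release.IsUpgrade` set instead of `.Release.IsInstall`):

```shell
# Render the chart as an install and as an upgrade, then compare.
helm template my-kong kong/kong -f values.yaml > install.yaml
helm template my-kong kong/kong -f values.yaml --is-upgrade > upgrade.yaml
diff install.yaml upgrade.yaml
```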
where values.yaml contains the necessary settings to enable the postgres DB integration for kong. The diff shows that the second command did not render the init-migrations Job manifest, and therefore it would be removed before upgrading the release and all would work. In ArgoCD this is not the case, because that flag is not passed when rendering. The above is painful, especially if someone maintains their own kong image (based on the official one): whenever we bump the image tag, we end up with a broken sync, because ArgoCD tries modifying the image tag of the init-migrations job, which as stated above is immutable. I wonder how this issue could be addressed, because right now I am using the following workaround to get kong into a healthy state:
What could be done to address this:
I hope I explained everything clearly. In order to troubleshoot this issue, it's best to set up an ArgoCD instance and start deploying the kong helm chart on it. It clearly shows all of the issues you face one after another when trying to get this working. Let me know if you need more insights, and for the time being one can use the above steps to make it work. |
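The manual workaround described above amounts to something like the following (all resource and app names are illustrative):

```shell
# Delete the leftover immutable init Job so the next sync can apply cleanly.
kubectl delete job my-release-kong-init-migrations -n kong
# Then trigger the sync again, e.g. via the argocd CLI.
argocd app sync my-kong-app
```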
@LeszekBlazewski Do you know of any reason you ever wouldn't want PreSync and BeforeHookCreation on the init job? If the issue is that it won't work via Job annotations because adding it there would then interfere with the existing Helm hook annotations, let me know.

What remains unclear is the original report, where apparently adding the delete annotation ostensibly did not work to clear the collision issue. @fabry00 I'm still unsure if that's indeed what you observed--did you maybe add it to a different resource?

Short of ArgoCD adding some better support for lifecycle to its Helm module, @LeszekBlazewski's listed workaround (disabling jobs and setting Argo annotations manually) is essentially what we recommend. We can document that more clearly, since it seems more or less vetted across this and other issue reports.

The reason the init Job does not have a hook is the built-in Postgres chart: Postgres can't spawn during pre-install, so we have to spawn the init Job during regular install for it not to fail. I would actually like to remove the Postgres subchart, as it's more trouble than it's worth (it imposes weird constraints like the above on chart development), but that requires approval beyond me and wouldn't happen any time soon even if approved. We originally added it as a quick way to deploy simple self-contained demo installs, but recommend managing Postgres separately for any production instance. Now that we have DB-less support for most features, that's a more attractive demo install path, but removing the subchart would be a breaking change, so it's not something we can do quickly.

The migrations commands are indeed all idempotent: if there's nothing for them to do, they simply exit successfully with a message saying they did nothing. It's fine if they get deployed unnecessarily. |
@rainest my main problem with the job annotations is that it will force me to add the hooks for all the other dependencies as well. So to be able to have the job running with hooks, I have to set the hooks for basically almost everything that Kong needs as a dependency to start:

```yaml
annotations:
  ....
  argocd.argoproj.io/sync-wave: "-5"
```

Example of my values.yaml:

```yaml
admin_gui_url:
  valueFrom:
    configMapKeyRef:
      name: myconfigmap
      key: manager_ui_url
admin_api_uri:
  valueFrom:
    configMapKeyRef:
      name: myconfigmap
      key: admin_api_uri
portal_api_url:
  valueFrom:
    configMapKeyRef:
      name: myconfigmap
      key: portal_api_url
portal_gui_host:
  valueFrom:
    configMapKeyRef:
      name: myconfigmap
      key: portal_gui_host
portal_session_conf:
  valueFrom:
    secretKeyRef:
      name: mysecret
      key: portal_session_conf
password:
  valueFrom:
    secretKeyRef:
      name: mysecret
      key: password
pg_user:
  valueFrom:
    secretKeyRef:
      name: mysecret
      key: user
pg_password:
  valueFrom:
    secretKeyRef:
      name: mysecret
      key: password
pg_database:
  valueFrom:
    secretKeyRef:
      name: mysecret
      key: database
extraConfigMaps:
  - name: root-ca-cert
    mountPath: /etc/ca/
  - name: ocp-ca-cert
    mountPath: /etc/ocp/service-ca/
enterprise:
  enabled: true
  license_secret: mysecret
.....
```

myconfigmap and mysecret are ConfigMaps and Secrets created by ExternalSecrets. My helm chart is something like this:

```yaml
apiVersion: v2
appVersion: 1.0.0
name: my-kong-controlplane
dependencies:
  - name: postgres
    version: 1.0.0
    repository: file:./charts/postgres
  - name: pgpool
    version: 1.0.0
    repository: file:./charts/pgpool
  - name: controlplane
    version: 1.0.0
    repository: file:./charts/controlplane-dependencies
  - name: kong
    version: 2.16.5
    repository: https://artifactory...../helm-virtual
description: Kong Control Plane
engine: gotpl
version: 1.0.0
```

So all the charts (postgres, pgpool, and controlplane-dependencies: external secrets, routes, configmaps...) have to be ready before the migration job starts. |
Not much we can help with there, I'm afraid--we can't really account for the surrounding environment, and the expectation of our chart is that the prerequisite resources are deployed separately. Unless Helm were to add some dependency management system (and Argo were to add support for it), the basic install order provided by hooks doesn't offer enough control to manage deploying all of those when they're needed. My best recommendation is to split those charts, so that you can deploy the environment prerequisite components first and the components using them after. |
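One way to sketch that split is two separate ArgoCD Applications ordered with sync-waves in an app-of-apps layout (all names, repo URLs, and paths here are assumptions, not part of the chart):

```yaml
# Wave 0: environment prerequisites (secrets, configmaps, DB) sync first.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kong-prerequisites
  annotations:
    argocd.argoproj.io/sync-wave: "0"
spec:
  project: default
  source:
    repoURL: https://example.com/deploy.git   # illustrative
    targetRevision: HEAD
    path: prerequisites
  destination:
    server: https://kubernetes.default.svc
    namespace: kong
---
# Wave 1: Kong itself syncs only after wave 0 is healthy.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kong
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  project: default
  source:
    repoURL: https://example.com/deploy.git   # illustrative
    targetRevision: HEAD
    path: kong
  destination:
    server: https://kubernetes.default.svc
    namespace: kong
```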
I can't come up with a reason why I would not want the PreSync annotation on the init job, assuming it runs only once, during the first initial kong deployment. Let me run some tests with the PreSync annotation added to the init job and I will get back to you with the results. |
@LeszekBlazewski checking in now that I'm back from vacation--did you see any issues or things that weren't addressed by those annotations? Moving this into our scheduled work, since I think the impact is valuable enough that we should just go ahead and try it, as well as start testing on ArgoCD internally if possible. |
Hi @rainest, I am really really sorry for the late reply. Been super busy recently. So we did a fair amount of testing (both clean new deployments and updates to existing environments) with the kong helm chart (version 2.14.0), and it turned out that successful kong provisioning via ArgoCD requires the following:
The above allows syncing the whole kong helm chart successfully and executes the init db scripts.
and keep it that way at all times. I have run different deployments and the above ensures that all of the migration jobs run properly in the proper argocd sync stages when performing the sync operation. After all that testing, it looks like kong works fine with ArgoCD; it's just the 2-step deployment process that is required due to the nature of how the job works. I think it could be worth improving the docs with a specific ArgoCD section, or a gitOps section overall (which would include the deployment steps described above), plus maybe documenting the hidden toggle mentioned earlier. Do you want me to open a PR for the doc changes described above? |
Having to first set that toggle is a problem for us. In my organization, Argo is set up to automatically create new apps (based on a git repo), so it's impractical to have to make that change manually. Unfortunately, that's what we end up doing, and every new application requires a manual step. I understand the reasons for the current design, though. |
So I tested the following configuration:
The argo hooks do correctly delete the migration jobs (after the sync is completed), so it at least allows the job spec to change between syncs. @rainest, I'm assuming there is not a whole lot that can be done to allow the pre-upgrade job to complete successfully when the init job has never been executed in the first place? Given the constraints that argo introduces, of course, where we cannot really differentiate install and upgrade syncs. |
We now have tooling to create Argo-managed test environments easily, so we can replicate issues and test fixes locally. 2.26 contains the results of our testing, which indicate that the easiest way to handle the Jobs is to run both in the Sync phase, with a delete before creation. The init Job was already effectively running during Sync, and with the delete-before-creation policy it will no longer block syncs due to its immutability. It'll be clutter after the initial install, but won't cause issues; disabling it via values.yaml after install will avoid the clutter.

We'd ideally like to run the upgrade job before patching the upgraded Deployment revision, but starting it at the same time doesn't cause any obvious issues in testing. The upgraded Pods will block initially until migrations complete, but will eventually come online. Sync waves would let us control the order, but I'm unsure if they're compatible with hooks (we still need to delete the Jobs), and they have the same limitation as the PreUpgrade hook, where we need to manage them on all prerequisite resources.

Since making upgrade a Sync hook appears viable, we want to try that in the wild first. If we get further issue reports, we'll revisit more complex options. Please file a new issue if you're still seeing issues with upgrades, or other non-upgrade Argo issues, after upgrading to chart 2.26. |
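The 2.26 approach described above corresponds to annotations along these lines on both migration Jobs (a sketch of the intent, not the chart's literal template output):

```yaml
metadata:
  annotations:
    # Run during the Sync phase rather than PreSync, so prerequisite
    # resources (ServiceAccount, Secrets, Postgres) exist by then.
    argocd.argoproj.io/hook: Sync
    # Delete the previous (immutable) Job before creating the new one,
    # avoiding the "field is immutable" sync failure.
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
```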
We are deploying Kong using ArgoCD.
It's working ok but the only problem is with the migration jobs.
Since they do not get automatically deleted, the next time I upgrade the chart, ArgoCD fails because the job fields are immutable.
I already tried configuring the ArgoCD hooks argocd.argoproj.io/hook: PreSync and argocd.argoproj.io/hook-delete-policy: HookSucceeded, but I had to revert it because I had to add them for all the dependencies as well as on my custom resources deployed alongside the Kong helm chart.
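In manifest form, the annotations tried above look like this on the migration Job metadata:

```yaml
metadata:
  annotations:
    # Run the Job before the main sync...
    argocd.argoproj.io/hook: PreSync
    # ...and delete it once it succeeds, so the next sync can recreate it.
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
```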
So far I prefer to keep the helm chart as it is and delete the job manually before upgrading it.
It would be great if somehow we could reshape the helm chart in a way that solves this issue.
Here is an example of the logs (I upgraded the chart from 2.14 to 2.16):

```
Failed sync attempt to dad5bdcdd039d6e8585fe55297453aad4fd650d8: one or more objects failed to apply, reason: Job.batch "controlplane-dev-kong-init-migrations" is invalid: spec.template: Invalid value: ...: field is immutable (retried 5 times).
```