Excessive requests to git repository on commit to monorepo w/ multi-source apps #14725
Comments
I've created a public repo that also exhibits this issue: https://bitbucket.org/sam-lalonde/argocd-14725. I created the app of apps defined in
getting manifests cache: 83
From my monitoring I can see:
That's quite a few git requests for just 10 apps (and I'm only updating 1); I'm hoping to run 500+ apps. I don't exactly know how the code behaves, but it seems like when Argo receives the webhook from Bitbucket it should:
Currently when the webhook is received all apps are simultaneously clobbering Bitbucket. |
Thanks @de-slalonde for the detailed explanation and an example. One thing I noticed is that we are using the same repository with same TargetRevision ( The point to consider here - argocd should have used the cache, once updated by any app, because the Repo and the Revision( Still investigating 🤔 🔍 🔬 |
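To illustrate the expectation above (one lookup per repository and revision, shared by every app), here is a minimal sketch of a per-key locked cache. It is not Argo CD's implementation: `RevisionCache`, `fake_ls_remote`, and the commit value `0123abc` are hypothetical names invented for this example, and cache expiry is deliberately left out.

```python
import threading
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (repo URL, target revision)


class RevisionCache:
    """Resolve each (repo URL, target revision) pair at most once."""

    def __init__(self, ls_remote: Callable[[str, str], str]) -> None:
        # `ls_remote` is a hypothetical stand-in for a real `git ls-remote` call.
        self._ls_remote = ls_remote
        self._resolved: Dict[Key, str] = {}
        self._key_locks: Dict[Key, threading.Lock] = {}
        self._guard = threading.Lock()

    def resolve(self, repo_url: str, target_revision: str) -> str:
        key = (repo_url, target_revision)
        # Short critical section: only used to find or create the per-key lock.
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        # Callers asking for the same key queue here; only the first one
        # reaches the remote, the rest read the cached result.
        with lock:
            if key not in self._resolved:
                self._resolved[key] = self._ls_remote(repo_url, target_revision)
            return self._resolved[key]


def demo(app_count: int = 10) -> int:
    """Simulate `app_count` apps refreshing at once; return the remote call count."""
    calls: List[Key] = []

    def fake_ls_remote(repo_url: str, target_revision: str) -> str:
        calls.append((repo_url, target_revision))
        return "0123abc"  # made-up commit SHA for the example

    cache = RevisionCache(fake_ls_remote)
    threads = [
        threading.Thread(target=cache.resolve, args=("https://REPO_URL", "HEAD"))
        for _ in range(app_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls)
```

With ten simulated apps pointing at the same repo and `HEAD`, `demo(10)` reports a single remote call. A real cache would also need an expiry or a webhook-driven invalidation so that new commits are picked up.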
I'd be very surprised if the issue were the app-of-apps. I think our investigation showed that most of the attempts to hit the cache are from the |
@crenshaw-dev Once the repo has been pulled and the cache has been updated, does Argo then have the information that wasn't available in the Bitbucket webhook payload to help it know what has actually changed, so that unaffected applications can just be "nudged to the next revision" (or whatever that was you mentioned in the meeting)? |
Even though it probably has access to the necessary info, the repo-server doesn't do git diff analysis to determine whether it should rebuild manifests. If there's a new commit, it rebuilds. There is an open issue to change this. In other words, we should expect plenty of |
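As a rough picture of what such diff analysis could look like, the sketch below maps a list of changed files to the apps whose source paths contain them. It assumes the changed files come from something like `git diff --name-only OLD NEW`; the function name `affected_apps` and the app-to-paths mapping are hypothetical, and nothing here reflects how the repo-server actually works.

```python
from typing import Dict, Iterable, List


def affected_apps(changed_files: Iterable[str], app_paths: Dict[str, List[str]]) -> List[str]:
    """Return the names of apps with at least one changed file under their paths."""
    changed = list(changed_files)
    hit = []
    for app, paths in app_paths.items():
        for path in paths:
            base = path.rstrip("/")
            # Match either the exact file or anything below the directory.
            # The trailing slash keeps "app-a" from matching "app-ab/...".
            if any(f == base or f.startswith(base + "/") for f in changed):
                hit.append(app)
                break
    return sorted(hit)


# Hypothetical monorepo layout used only for illustration.
APPS = {
    "app-a": ["app-a/"],
    "app-ab": ["app-ab"],
    "app-b": ["app-b/", "shared/"],
}
```

Under this scheme a commit touching only `app-a/values.yaml` would flag `app-a` alone, and the remaining apps could keep their cached manifests.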
Are you using webhooks, by the way, and path-based computation to decide whether the cache should be used? The last time I tested it in Bitbucket (2 months ago), it wasn't working at all. |
@de-slalonde @ishitasequeira @crenshaw-dev I recently encountered a similar issue, using a monorepo w/multi-source apps generated by AppSet. The repo-server logs showed mostly cache miss, something like:
Looking at the cache miss logs, most have sources:
  - directory:
      exclude: '*' # necessary so that argocd doesn't consume and try to manage the argocd-able files in the path
    path: my-app/
    repoURL: https://REPO_URL
    targetRevision: HEAD
    ref: values
  - chart: CHART_NAME
    repoURL: CHART_URL
    targetRevision: v0.0.1
    helm:
      version: v3
      releaseName: RELEASE_NAME
      valueFiles:
        - $values/my-app/values.yaml
After doing this, our repo-server now shows 100% cache hit:
Our repo-server then shows an almost-total decrease of the |
I'm sitting at my desk chuckling at this workaround. That's an incredible find. Makes me think there must be something wrong with how we populate the cache when the source is a values-only source (i.e. doesn't contribute any manifests to the Application). Adding a no-manifests manifest source must force it to be cached properly. |
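For anyone wanting to apply the same workaround across many Applications, a small helper along these lines could patch values-only sources in place. This is only a sketch: it assumes a values-only source is one that has `ref` but neither `path` nor `chart`, the function names are hypothetical, and it works on plain dictionaries rather than on any Argo CD API.

```python
import copy
from typing import Any, Dict, List

Source = Dict[str, Any]


def is_values_only(source: Source) -> bool:
    """Assumed definition: has `ref`, but no `path` and no `chart`."""
    return "ref" in source and "path" not in source and "chart" not in source


def apply_workaround(sources: List[Source], path: str) -> List[Source]:
    """Return a copy of `sources` with a path and an exclude-all directory
    added to every values-only source, mirroring the workaround above."""
    patched = copy.deepcopy(sources)
    for source in patched:
        if is_values_only(source):
            source["path"] = path
            source["directory"] = {"exclude": "*"}
    return patched


# Hypothetical "before" state; the placeholders match the ones used in the thread.
BEFORE = [
    {"repoURL": "https://REPO_URL", "targetRevision": "HEAD", "ref": "values"},
    {"chart": "CHART_NAME", "repoURL": "CHART_URL", "targetRevision": "v0.0.1"},
]
AFTER = apply_workaround(BEFORE, "my-app/")
```

The chart source is left untouched and the input list is not mutated, so the result can be diffed against the original before anything is committed.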
We also have encountered this issue and the workaround provided here worked for us. |
+1 to this issue. In my case we have a mono git repo and multiple apps/charts with a dependency chart. The above workaround did not fix the issue for us. We are still facing ~12k git requests per minute. Also, in our case the requests were "Cache Hit" and still caused rate limiting; details are here |
@tribu Are you still seeing both |
In my case I was able to use the concepts from that workaround to cut down on my |
I feel there is something similar in regards to the behavior of ls-remote. I have effectively 5 mono repos. Looking at the ls-remote metrics for each repo we see (per hour)
All of these mono repos, except repo4, have the same number of applications defined to pull values files from these repos. repo4 has 4x the number of apps defined. In the case of repo1, the file is a single static file across all apps, e.g.
In the case of repo2 and repo3, they use a variable based on the cluster name, e.g.
In the case of repo4, they use a variable like repo2/3, but on 4 of the clusters there is an additional variable
Finally, in the case of repo5, each of the 4 apps (per cluster) references a different static file. This one I find the most interesting, as I would simply expect this to produce 4x the volume of ls-remotes as repo1, not 60x, e.g.
app2
The other thing that may be of interest is that on the apps for repo4/5 which have the huge volume of requests, we utilize ignoreDifferences, which we do not do on any other applications. |
Same here. Multi-source apps with an OCI registry as a source for Helm charts and a single Git repository as a source for configuration values. In my case Argo is short polling every 3 minutes; no webhooks configured. I can see an excessive number of git requests not just on a new commit, but every 3 minutes. |
Can confirm we are also experiencing the same behaviour deploying from a mono-repo hosting lots of values files separate from the git repo we deploy the helm chart from. We added all the files we care about to a section like below and can confirm we are still getting dozens of cache misses a second. argo version: 2.8.4+c279299
|
…unManifestGenAsync not using cache (Issue #14725) (#16410) * fix(repo-server): excess git requests part 1, resolveReferencedSources and runManifestGenAsync Signed-off-by: nromriell <[email protected]> * fix: remove unnecessary settings instantiation Signed-off-by: nromriell <[email protected]> --------- Signed-off-by: nromriell <[email protected]>
…unManifestGenAsync not using cache (Issue #14725) (#16410) (#16494) * fix(repo-server): excess git requests part 1, resolveReferencedSources and runManifestGenAsync * fix: remove unnecessary settings instantiation --------- Signed-off-by: nromriell <[email protected]> Co-authored-by: Nathan Romriell <[email protected]>
…ions (Issue #14725) (#17109) * fix(repo-server): excess git requests, cache lock on revisions Signed-off-by: nromriell <[email protected]> * fix: pr feedback, simplify, add configurable variable Signed-off-by: nromriell <[email protected]> * fix: codegen, lint Signed-off-by: nromriell <[email protected]> * fix: test print, no opts set, var type nit Signed-off-by: nromriell <[email protected]> * chore: add additional logging for unexpected cache error Signed-off-by: nromriell <[email protected]> --------- Signed-off-by: nromriell <[email protected]> Co-authored-by: Ishita Sequeira <[email protected]>
Checklist:
argocd version
Describe the bug
With 71 multi-source applications in a monorepo (in Bitbucket), a single commit triggers 271 requests to the git repository (11 fetch, 260 ls-remote). Logs show 333 "getting manifest cache" entries, with 15 cache hits and 318 cache misses. Example application sources:
To Reproduce
Make any commit to the repository.
Expected behavior
Far fewer requests to the git repository and more cache hits.
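For reference, the figures reported under "Describe the bug" work out as follows. The snippet only restates numbers from the report; the variable names are arbitrary.

```python
# Numbers taken directly from the bug description.
apps = 71
fetch, ls_remote = 11, 260
hits, misses = 15, 318

total_requests = fetch + ls_remote      # git requests per commit
cache_lookups = hits + misses           # "getting manifest cache" log entries
hit_rate = hits / cache_lookups         # fraction of lookups served from cache
ls_remote_per_app = ls_remote / apps    # average ls-remote calls per application

print(f"{total_requests} requests, {cache_lookups} lookups")
print(f"hit rate: {hit_rate:.1%}, ls-remote per app: {ls_remote_per_app:.2f}")
```

That is a cache hit rate of roughly 4.5% and about 3.7 `ls-remote` calls per application for a single commit.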
Screenshots
N/A
Version
Logs
In depth discussion with @crenshaw-dev on Slack with more detail. https://cloud-native.slack.com/archives/C01TSERG0KZ/p1690317861587929
Discussed at Argo SIG Scalability meeting on July and @ishitasequeira volunteered to take a look.