This repository has been archived by the owner on Dec 13, 2023. It is now read-only.
Optimizing traffic to datastore when sweeping workflows from the decider queue #3841
wildMythicWest started this conversation in General
Replies: 0 comments
We recently ran into problems with exceeding the network bandwidth to our datastore (Redis hosted on AWS). This was unexpected, so we started investigating what was causing so much traffic. One thing we discovered was that Redis bandwidth IN was much lower than Redis bandwidth OUT, which means we were reading far more data than we were writing. While investigating the WorkflowSweeper, we found that each sweep called the datastore twice to fetch the workflow.
In our production environment we sometimes pass 100k workflows through Conductor in a short period (10 minutes).
All of the workflows are executed as standalone (they are not persisted in Conductor for multiple executions via reference).
For each workflow execution, it appears that every sweep unnecessarily fetches the workflow from the datastore a second time.
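To illustrate the pattern I mean, here is a minimal sketch rather than the actual Conductor code. `CountingStore`, `evaluate`, `reschedule`, and both `sweep...` methods are hypothetical names I made up for the example; the only point is to count datastore reads per sweep when the workflow is fetched twice versus fetched once and reused.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class SweepFetchSketch {

    /** Hypothetical in-memory stand-in for the datastore; it only counts reads. */
    static class CountingStore {
        private final Map<String, String> workflows = new HashMap<>();
        final AtomicInteger reads = new AtomicInteger();

        void put(String workflowId, String payload) {
            workflows.put(workflowId, payload);
        }

        String getWorkflow(String workflowId) {
            reads.incrementAndGet(); // every call is one round trip to the datastore
            return workflows.get(workflowId);
        }
    }

    // Hypothetical placeholders for whatever work a sweep does with the workflow.
    static void evaluate(String workflow) { /* no-op in this sketch */ }

    static void reschedule(String workflow) { /* no-op in this sketch */ }

    /** Pattern described in the post: each step fetches the workflow on its own. */
    static void sweepFetchingTwice(CountingStore store, String workflowId) {
        evaluate(store.getWorkflow(workflowId));
        reschedule(store.getWorkflow(workflowId));
    }

    /** Alternative: fetch once and pass the same object to both steps. */
    static void sweepFetchingOnce(CountingStore store, String workflowId) {
        String workflow = store.getWorkflow(workflowId);
        evaluate(workflow);
        reschedule(workflow);
    }

    public static void main(String[] args) {
        CountingStore twice = new CountingStore();
        CountingStore once = new CountingStore();
        for (int i = 0; i < 3; i++) {
            String id = "wf-" + i;
            twice.put(id, "payload");
            once.put(id, "payload");
            sweepFetchingTwice(twice, id);
            sweepFetchingOnce(once, id);
        }
        System.out.println("reads when fetching twice: " + twice.reads.get());
        System.out.println("reads when fetching once:  " + once.reads.get());
    }
}
```

In this sketch the read count scales linearly with the number of sweeps, so reusing the fetched workflow halves the reads. Whether the second fetch is needed for correctness in the real sweeper (for example, to observe state changed by the first step) is exactly what I am asking about.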
One thing we did to reduce the load on Redis was to configure the number of sweeper threads.
I am still curious: is there a reason to call the datastore twice on each sweep?
I have opened a PR to address this - #3816
Any feedback is much appreciated!
Thanks!