Optimizing traffic to datastore when sweeping workflows from the decider queue #3841

wildMythicWest · 2023-11-09T15:35:50Z

We recently ran into problems with exceeding the network bandwidth to our datastore (Redis hosted on AWS). This was unexpected, so we started investigating what was causing so much traffic. One thing we discovered was that Redis bandwidth in was far lower than Redis bandwidth out, suggesting the traffic was dominated by reads. While investigating the WorkflowSweeper, we found that on each sweep we were calling the datastore to fetch the workflow twice.
In our production environment we sometimes pass 100k workflows through Conductor in a short period (10 minutes).
All of the workflows execute as standalone (the workflows are not persisted in Conductor for multiple executions via reference).
For each workflow execution, it appears that we unnecessarily fetch it from the datastore a second time on every sweep.
One thing we did to reduce the load on Redis was to tune the number of sweeper threads.
I am still curious whether there is a reason to call the datastore twice on each sweep.
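
To make the pattern concrete, here is a minimal sketch of the difference between fetching twice and fetching once per sweep. It is not Conductor's actual code and not the change in the PR: `Workflow`, `CountingStore`, `decide`, and `requeue` are all hypothetical stand-ins, and the sketch assumes the second step only needs the state that the first step already returned.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class SweepOnceSketch {

    /** Hypothetical workflow record; not Conductor's model class. */
    static final class Workflow {
        final String id;
        final boolean terminal;

        Workflow(String id, boolean terminal) {
            this.id = id;
            this.terminal = terminal;
        }
    }

    /** Hypothetical in-memory datastore that counts how often it is read. */
    static final class CountingStore {
        private final Map<String, Workflow> data = new HashMap<>();
        final AtomicInteger reads = new AtomicInteger();

        void put(Workflow workflow) {
            data.put(workflow.id, workflow);
        }

        Workflow get(String id) {
            reads.incrementAndGet();
            return data.get(id);
        }
    }

    /** Stand-in for the decide step: returns the (possibly updated) workflow. */
    static Workflow decide(Workflow workflow) {
        return workflow;
    }

    /** Stand-in for the follow-up step: true if the workflow should be swept again. */
    static boolean requeue(Workflow workflow) {
        return workflow != null && !workflow.terminal;
    }

    /** The pattern described above: each step loads the workflow itself. */
    static boolean sweepFetchingTwice(CountingStore store, String id) {
        decide(store.get(id));
        return requeue(store.get(id)); // second read of the same workflow
    }

    /** Alternative: load once and pass the decided workflow to the next step. */
    static boolean sweepFetchingOnce(CountingStore store, String id) {
        Workflow decided = decide(store.get(id));
        return requeue(decided);
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        store.put(new Workflow("wf-1", false));

        sweepFetchingTwice(store, "wf-1");
        int readsForTwice = store.reads.get();

        sweepFetchingOnce(store, "wf-1");
        int readsForOnce = store.reads.get() - readsForTwice;

        System.out.println("reads, fetching twice: " + readsForTwice);
        System.out.println("reads, fetching once:  " + readsForOnce);
    }
}
```

With many workflows being swept repeatedly, halving the reads per sweep would roughly halve the read traffic that the sweeper generates, provided nothing else needs the fresh copy between the two steps.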

I have opened a PR to address this - #3816

Any feedback is much appreciated!
Thanks!

Replies: 0 comments