queue-manager: reorganize into strategies #2

vsoch · 2024-07-28T06:52:17Z

I realized that we need flexibility in defining queue strategies, not just in how the worker is designed, but also in how the queue strategy handles the schedule function. This is an overhaul (not quite done yet) that does that. I still need to plug the final query back in to move provisional pods to the worker queue. Also note that it looks like we have priority, pending, and other insert params to play with. And since things get lost in Slack, here is a visual of the design:

The queue strategy I'm starting with is FCFS with backfill, which is (sort of) what Kubernetes can do, assuming it would schedule groups without clogging (allowing smaller groups that can be scheduled to fill in). This work is almost done - I need to finish the query to select the provisional pods that have groups at quorum, and then add them to the worker queues. I've already tested this step - once a group hits the worker queue, at least for this strategy, that is where we call "AskFlux" to do an allocation. It's FCFS with backfill because that allocation request can be denied if resources aren't ready. When that happens, the job goes back into the queue and the next group is tried.
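To make the FCFS-with-backfill behavior concrete, here is a minimal sketch in Go. It is not the code in this branch: the `Group` type, the `Allocator` interface (standing in for the "AskFlux" call), the `fixedCluster` toy allocator, and `schedulePass` are all hypothetical names chosen for illustration, and the real allocation goes through the worker queue rather than a plain loop.

```go
package main

import "fmt"

// Group is a hypothetical pod group waiting in the worker queue.
type Group struct {
	Name  string
	Nodes int // nodes the whole group needs
}

// Allocator stands in for the "AskFlux" call (an assumption for this sketch):
// it either grants the request or denies it because resources are not ready.
type Allocator interface {
	Allocate(g Group) bool
}

// fixedCluster is a toy allocator with a fixed number of free nodes.
type fixedCluster struct{ free int }

func (c *fixedCluster) Allocate(g Group) bool {
	if g.Nodes > c.free {
		return false // denied: not enough free nodes right now
	}
	c.free -= g.Nodes
	return true
}

// schedulePass walks the queue in arrival order (FCFS). A denied group goes
// back into the queue, keeping its relative order, and later groups are still
// tried, which is the backfill part.
func schedulePass(queue []Group, a Allocator) (scheduled, requeued []Group) {
	for _, g := range queue {
		if a.Allocate(g) {
			scheduled = append(scheduled, g)
		} else {
			requeued = append(requeued, g)
		}
	}
	return scheduled, requeued
}

func main() {
	queue := []Group{{"big", 8}, {"small-a", 2}, {"small-b", 3}}
	scheduled, requeued := schedulePass(queue, &fixedCluster{free: 4})
	fmt.Println("scheduled:", scheduled)
	fmt.Println("requeued:", requeued)
}
```

In this toy run the first group asks for more nodes than are free, so it is requeued, and a smaller group behind it is allowed to fill in instead of waiting behind the blocked one.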

The events (subscriptions) are also working, and updating the job args with the node assignment is how we will send the signal back to the scheduler and then call the binding. I haven't yet removed the original fluence in-tree design, but that is happening slowly, and when the functionality is fully working here, I will remove it entirely in favor of the new design. I will need to think about how to properly handle current in-tree plugins, because two different scheduling strategies don't make sense. My hope is that I can move the functionality of current (essential) in-tree plugins to work in our new framework, whatever that might look like. 👀

Note that this branch goes into another branch that doesn't have a PR open yet.

Needs before merge here:

- Strategy to send node list and job id back to subscribers (scheduler)
- Update queryReady query to select only provisional pods for which groups are fully assembled
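The second item amounts to a "group at quorum" filter over the provisional pods. As a rough illustration only, the sketch below does that filtering in memory in Go. The `ProvisionalPod` type, its fields, and `readyGroups` are hypothetical and do not reflect the actual table schema or the queryReady SQL.

```go
package main

import (
	"fmt"
	"sort"
)

// ProvisionalPod is a hypothetical row from the provisional table.
type ProvisionalPod struct {
	Name      string
	Group     string
	GroupSize int // number of pods the group needs before it is fully assembled
}

// readyGroups returns only the groups whose observed pod count has reached
// the declared group size, keyed by group name.
func readyGroups(pods []ProvisionalPod) map[string][]ProvisionalPod {
	byGroup := map[string][]ProvisionalPod{}
	for _, p := range pods {
		byGroup[p.Group] = append(byGroup[p.Group], p)
	}
	ready := map[string][]ProvisionalPod{}
	for name, members := range byGroup {
		// Assumes every pod in a group reports the same GroupSize.
		if len(members) >= members[0].GroupSize {
			ready[name] = members
		}
	}
	return ready
}

func main() {
	pods := []ProvisionalPod{
		{"a-0", "a", 2}, {"a-1", "a", 2}, // group a is complete
		{"b-0", "b", 3}, // group b is still waiting for two pods
	}
	ready := readyGroups(pods)
	names := make([]string, 0, len(ready))
	for name := range ready {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, so sort for stable output
	fmt.Println("groups at quorum:", names)
}
```

In SQL the same filter is typically written as a GROUP BY on the group name with a HAVING clause comparing the row count to the group size. The table and column names used in this branch are not shown here, so that part is left out.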

I realized that we need flexibility in defining queue strategies, not just in how the worker is designed, but also how the queue strategy handles the schedule function. This is an overhaul (not quite done yet) that does that. I still need to plug the final query back in to move provisional to the worker queue. Also note that it looks like we have priority, pending, and other insert params to play with.

Signed-off-by: vsoch <[email protected]>

This changeset includes a query that will update Args (node) from within a worker job so we can send them back to the scheduler. Lastly, I am working on the command so that the initial query will move provisional pods (and groups) from the provisional table to the worker queue.

Signed-off-by: vsoch <[email protected]>

vsoch force-pushed the reorganize-queue-manager branch from 98e2c08 to 69e7624 July 28, 2024 08:54

vsoch mentioned this pull request Jul 28, 2024

best strategy to update job args (or other metadata) to send to subscription? riverqueue/river#474

Closed

vsoch force-pushed the reorganize-queue-manager branch from 0ca0ce8 to 4ef00c2 July 29, 2024 06:52

vsoch force-pushed the reorganize-queue-manager branch from 4ef00c2 to 73d587d July 29, 2024 06:55

vsoch merged commit 7add491 into add-queue-and-gut-out Jul 29, 2024

vsoch deleted the reorganize-queue-manager branch July 29, 2024 07:06

vsoch restored the reorganize-queue-manager branch July 29, 2024 07:06

vsoch mentioned this pull request Jul 29, 2024

Add queue and gut out #5

Merged
