Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add event driven workflows runner #102

Merged
merged 1 commit into from
Aug 20, 2024

Conversation

agrare
Copy link
Member

@agrare agrare commented Jul 1, 2024

Add a Workflows::Runner class which keeps a persistent thread waiting for events from Floe. Once an event is raised we try to link the event up to a workflow and we queue a step action if we find one.

If a workflow isn't found for example if it is already completed and removed from the list then we simply continue without queueing anything.

It is common to get multiple events for a single container run so this isn't a perfect "queue step when ready" and can be more efficient but currently this is erring on the side of caution (check step_ready on every event).

Also if the worker goes down and is restarted for any reason, the typical timer-based step is still on the queue and the running workflow will be added to the runner on the next timed check.

Depends on:

Fixes #109

@miq-bot miq-bot added the wip label Jul 1, 2024
@agrare agrare force-pushed the add_event_driven_workflows_runner branch 12 times, most recently from 904527f to 36f4e3d Compare August 15, 2024 16:01
@agrare agrare changed the title [WIP] Add event driven workflows runner Add event driven workflows runner Aug 15, 2024
@miq-bot miq-bot removed the wip label Aug 15, 2024
Comment on lines +8 to +10
def runner
@runner ||= new.tap(&:start)
end
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE this ensures that this will work from the AutomationWorker as well as in development using simulate_queue_worker.

@agrare agrare force-pushed the add_event_driven_workflows_runner branch 3 times, most recently from d02ea7d to 720e212 Compare August 15, 2024 18:05
Comment on lines 38 to 40
return if workflows.key?(workflow.id)

workflows[workflow.id] = [workflow, queue_args]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does workflows[] ||= do the same thing with concurrent hashes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ||= works I can switch to that

@agrare agrare force-pushed the add_event_driven_workflows_runner branch from 720e212 to 7a0e8d8 Compare August 16, 2024 14:38
@miq-bot
Copy link
Member

miq-bot commented Aug 16, 2024

Checked commit agrare@7a0e8d8 with ruby 3.1.5, rubocop 1.56.3, haml-lint 0.51.0, and yamllint
4 files checked, 0 offenses detected
Everything looks fine. 🏆

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting this file to live on the floe side (or at least everything expect the runner method itself - that part I expected to be passed in).

That is, I expected some sort of generic queueing / pulling from a queue / checking strategy in floe, and then the specifics about how to queue something would come from manageiq-providers-workflows via callbacks or lambdas or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So aside from the boilerplate thread management, almost all of what is here is dealing with ConfigurationScript records and MiqQueue. We could move more into Floe but we'd have to have callbacks for almost everything in the docker_wait method (currently it is basically just docker_runner.wait, find the ConfigurationScript record, miq_queue.put).

if wf.end?
ManageIQ::Providers::Workflows::Runner.runner.delete_workflow(self)
else
deliver_on = wf.wait_until || 1.minute.from_now.utc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay

@kbrock
Copy link
Member

kbrock commented Aug 19, 2024

The whole multiple events and keying off of workflow has me a little.

I feel the purpose of this is to react to an event from Docker
So every time you throw in a task, you should be waiting for the event to end and you will trigger your own event accordingly. It may be "put this on the Queue" or "put this on the "MiqQueue". And at that time, you delete the entry in the local reaction workflows cache, since we already reacted and have no need to react further.

If you run 3 tasks for a workflow, then you get 3 "I am done" events out and kick the same workflow accordingly.

I feel the entry should go into this cache when we kick off a task. Maybe this is where my idea of keying off of a container_ref breaks down. The container ref and the fact that we're calling a Task will be very well known to the caller. Feels like we're loosing some encapsulation.

@kbrock
Copy link
Member

kbrock commented Aug 20, 2024

offline: agrare doesn't want to store container_ref in the cache, it is too volatile and would prefer to store a workflow.id or something that is more stable

@kbrock kbrock merged commit 572047c into ManageIQ:master Aug 20, 2024
2 of 4 checks passed
@agrare agrare deleted the add_event_driven_workflows_runner branch August 20, 2024 14:28
@Fryguy
Copy link
Member

Fryguy commented Aug 22, 2024

Backported to radjabov in commit ad9bc0b.

commit ad9bc0b2a3b19bb6c8ad9d5352efc3fe772a5235
Author: Keenan Brock <[email protected]>
Date:   Tue Aug 20 10:27:18 2024 -0400

    Merge pull request #102 from agrare/add_event_driven_workflows_runner
    
    Add event driven workflows runner
    
    (cherry picked from commit 572047c0e74ce12269dffbac0520134804c8f1c4)

Fryguy pushed a commit that referenced this pull request Aug 22, 2024
Add event driven workflows runner

(cherry picked from commit 572047c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use Floe wait for event driven updates to workflows
4 participants