
Concurrency control for all queued jobs #366

Closed
antulik opened this issue Sep 8, 2021 · 5 comments
antulik commented Sep 8, 2021

In #317 there was a discussion, and enqueue_limit was changed to exclude performing jobs.

However, the new approach still does not cover all use cases. There are legitimate cases for both behaviours, and I think we should support both.

For example, one case where we DO want to include executing jobs in enqueue_limit is cron. Instead of manually configuring a single cron process, that could be managed with a concurrency limit.

Given: job with enqueue_limit: 1, perform_limit: 1
When: 3 processes with cron are running
Currently: 3 jobs get queued, but because of perform_limit, two jobs fail with retry.
Expected: 1 job gets queued/executed
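
A minimal pure-Ruby sketch of the counting logic (illustrative only, not GoodJob's actual implementation; all names are made up) shows why an enqueue_limit that excludes performing jobs admits three jobs:

```ruby
# Simulates three cron processes enqueuing the same job under an
# enqueue_limit that counts only *enqueued* (not performing) jobs.
ENQUEUE_LIMIT = 1

def try_enqueue(jobs, limit)
  # Only jobs still waiting in the queue count toward the limit.
  enqueued = jobs.count { |j| j[:state] == :enqueued }
  return false if enqueued >= limit

  jobs << { state: :enqueued }
  true
end

jobs = []

# Process 1 enqueues, and a worker immediately picks the job up.
try_enqueue(jobs, ENQUEUE_LIMIT)
jobs.first[:state] = :performing

# Processes 2 and 3 fire the same cron entry moments later. The
# performing job is invisible to the limit, so both succeed.
try_enqueue(jobs, ENQUEUE_LIMIT) # => true
jobs.last[:state] = :performing  # another worker grabs it too
try_enqueue(jobs, ENQUEUE_LIMIT) # => true

puts jobs.size # => 3 jobs exist despite enqueue_limit: 1
```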

It's surprising to see 3 jobs queued with enqueue_limit: 1. It's even more surprising because the job count is larger than the total limit of 2 (enqueue + perform). (Is that technically a race condition bug?)

While the cron logic could be improved to handle race conditions, there are cases outside cron too. E.g. https://github.com/mhenrixon/sidekiq-unique-jobs handles both cases.

Proposal
Add a new good_job_control_concurrency_with config option that lets the user configure the behaviour per job.

E.g. enqueue_includes_perform: true

@bensheldon (Owner)

@antulik thanks for opening this issue. I think you're right that GoodJob should have a concurrency configuration that is inclusive of both enqueue and perform.

What do you think of simplifying the interface like this:

# allow only a single unfinished job
good_job_control_concurrency_with(limit: 1)

# separately limit enqueued jobs and performing jobs
good_job_control_concurrency_with(enqueue_limit: 1, perform_limit: 1)

I can make that change.
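
The simplified limit: interface above boils down to counting both enqueued and performing jobs against one ceiling. A plain-Ruby sketch of that counting (illustrative only, not GoodJob's implementation) for contrast with the enqueue-only behaviour:

```ruby
# Simulates a single total limit: both enqueued and performing jobs
# count toward it, so a running job blocks a new enqueue attempt.
TOTAL_LIMIT = 1

def try_enqueue_total(jobs, limit)
  unfinished = jobs.count { |j| %i[enqueued performing].include?(j[:state]) }
  return false if unfinished >= limit

  jobs << { state: :enqueued }
  true
end

jobs = []
try_enqueue_total(jobs, TOTAL_LIMIT) # first cron process wins
jobs.first[:state] = :performing     # a worker picks the job up
try_enqueue_total(jobs, TOTAL_LIMIT) # => false: the performing job counts
puts jobs.size # => 1
```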

Also, I've been thinking a lot about cron too, and wonder if it needs its own different locking strategy. Cron is challenging if the underlying job is not idempotent; it seems like it needs a lock on both (cron_key, cron_scheduled_at) that is somewhat permanent (a uniqueness validation in the database rather than just an advisory lock).
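
The idea can be sketched in plain Ruby, with an in-memory Set standing in for a database unique index on (cron_key, cron_scheduled_at). Class and method names here are illustrative, not a proposed API:

```ruby
require 'set'

class CronLedger
  def initialize
    @claimed = Set.new
  end

  # Returns true for the first process to claim this cron tick,
  # false for everyone else -- analogous to an INSERT being rejected
  # by a unique constraint on (cron_key, cron_scheduled_at).
  def claim(cron_key, cron_scheduled_at)
    !@claimed.add?([cron_key, cron_scheduled_at]).nil?
  end
end

ledger = CronLedger.new
tick = Time.utc(2021, 9, 9, 12, 0, 0)

# Three processes fire the same cron entry; only one wins.
results = 3.times.map { ledger.claim('cleanup', tick) }
puts results.inspect # => [true, false, false]
```

Because the claim never expires, it behaves like the "somewhat permanent" lock described above, rather than an advisory lock that releases when the job finishes.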


antulik commented Sep 9, 2021

# allow only a single unfinished job
good_job_control_concurrency_with(limit: 1)

I love it. That's the best I've seen so far, and better than my proposal or the sidekiq-unique-jobs approach. Consider total_limit: it's more explicit and in line with the other option names.

Also, I've been thinking a lot about cron too, and wonder if it needs its own different locking strategy. Cron is challenging if the underlying job is not idempotent; it seems like it needs a lock on both (cron_key, cron_scheduled_at) that is somewhat permanent (a uniqueness validation in the database rather than just an advisory lock).

Yes, I think ideally cron needs to be more reliable. However, the new concurrency control gives a workaround in the short term. I was thinking the same: have a lock on (cron_key, cron_scheduled_at). There is an edge case, though: if the job's execution is instant and one of the processes is delayed, it might queue the cron job again.

The preserve_job_records feature solves that edge case and provides a 100% guarantee when needed.
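
A small simulation of that edge case (illustrative only; the Set stands in for the jobs table, and the helper name is made up): when the finished record is discarded, a delayed process re-enqueues the same tick, whereas a preserved record keeps blocking it.

```ruby
require 'set'

# Contrasts two guards for the same cron tick.
# preserve: false -- the record disappears when the job finishes,
#                    so it only blocks while a job record exists.
# preserve: true  -- the finished record stays, so the key remains
#                    claimed (what preserve_job_records effectively gives).
def enqueue_if_absent(records, key, preserve:)
  return false if records.include?(key)

  records.add(key)
  # Instant execution: without preservation, the record is gone
  # as soon as the job finishes.
  records.delete(key) unless preserve
  true
end

tick = ['cleanup', '2021-09-09T12:00']

without_preserve = Set.new
puts enqueue_if_absent(without_preserve, tick, preserve: false) # => true
# A delayed process fires the same tick later; the record is gone,
# so the duplicate slips through.
puts enqueue_if_absent(without_preserve, tick, preserve: false) # => true

with_preserve = Set.new
puts enqueue_if_absent(with_preserve, tick, preserve: true) # => true
puts enqueue_if_absent(with_preserve, tick, preserve: true) # => false
```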

It's surprising to see 3 jobs queued with enqueue_limit: 1. It's even more surprising because the job count is larger than the total limit of 2 (enqueue + perform). (is that technically a race condition bug?)

I suspect that's a bug; would you like me to raise it as a separate ticket?

@bensheldon (Owner)

@antulik I just pushed up a PR (#369) to add total_limit: as an option. Thank you for the naming suggestion 🙌

If the job's execution is instant and one of the processes is delayed, it might queue cron again.

Excellent point. Maybe it needs to store the timestamp in the database and have a validation on the value. I'll think about that more.

I suspect that's a bug, would you like me to raise it as a separate ticket?

Let's see if the new config option fixes the problems you're seeing. I'm a bit stumped to be honest.

@bensheldon (Owner)

I released the total_limit: option in v2.1.0.

@antulik changed the title from "Add config for concurrency control mode" to "Concurrency control for both all queued jobs" on Sep 15, 2021

antulik commented Sep 15, 2021

Let's see if the new config option fixes the problems you're seeing. I'm a bit stumped to be honest.

I raised a separate issue for that. #378

As for this issue, I think it's done. Thank you 🙇‍♂️

@antulik closed this as completed on Sep 15, 2021
@antulik changed the title from "Concurrency control for both all queued jobs" to "Concurrency control for all queued jobs" on Sep 15, 2021