Materialized CTE performance bottleneck #720
@mdavidn Thank you so much for opening this issue 🙏🏻 I've known GoodJob has less-than-great performance characteristics at large numbers of jobs and was waiting to see whether anyone was actually pushing those limits. I really appreciate you digging into the solutions too. I'm thinking there are some quick improvements here:
I like the configuration with a moderately high default. There are better places to tune the maximum number of workers in large deployments, like in Terraform.
@mdavidn I think this has been addressed by those two PRs (#726, #727) from @mitchellhenke 🎉
I wanted to document a performance bottleneck I encountered when enqueuing approximately 150,000 small, low-priority jobs at once with 20 workers. The scope `GoodJob::Lockable.advisory_lock` materializes a CTE that sorts and returns a list of all candidate jobs. Under these conditions, this query takes about three seconds to lock each job and pegs the database CPU. The performance of this query can be dramatically improved, to about 0.1 ms, by making two changes:
```sql
USING btree (priority DESC NULLS LAST, created_at ASC) WHERE finished_at IS NULL
```
The number of workers can change over time but is generally stable. That count could be cached in each process, perhaps refreshing at some interval after the query returns no available jobs.
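A sketch of what the two suggested changes could look like together, assuming GoodJob's `good_jobs` table and an illustrative index name (this is not an actual migration from the project):

```sql
-- Partial index matching the candidate query's sort order;
-- the index name is chosen for illustration.
CREATE INDEX index_good_jobs_on_priority_and_created_at
  ON good_jobs
  USING btree (priority DESC NULLS LAST, created_at ASC)
  WHERE finished_at IS NULL;

-- Candidate-selection shape the index can serve, limited to the
-- (cached) number of worker processes instead of every unfinished job.
SELECT id
  FROM good_jobs
 WHERE finished_at IS NULL
 ORDER BY priority DESC NULLS LAST, created_at ASC
 LIMIT 20;  -- e.g. the number of workers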
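The caching idea could be sketched roughly like this. `CachedProcessCount` is a hypothetical helper, not GoodJob's actual API; the block passed to `value` stands in for however the real worker count is queried (e.g. a process-registry count):

```ruby
# Hypothetical sketch: cache the worker-process count used to LIMIT the
# candidate-job query, refreshing it at most once per TTL interval.
class CachedProcessCount
  def initialize(ttl: 60)
    @ttl = ttl           # seconds before the cached value goes stale
    @value = nil
    @refreshed_at = nil
  end

  # Returns the cached count, invoking the block to fetch a fresh
  # value only when the cache is empty or older than the TTL.
  def value(&fetch)
    if @refreshed_at.nil? || Time.now - @refreshed_at > @ttl
      @value = fetch.call
      @refreshed_at = Time.now
    end
    @value
  end

  # Call when the locking query returns no available jobs, so the
  # next lookup re-reads the true process count immediately.
  def expire
    @refreshed_at = nil
  end
end
```

Usage: each process would hold one instance and call `cache.value { count_processes }` when building the `LIMIT`, calling `cache.expire` after an empty poll.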