Improvements for Queue#processStalledJobs #311
@@ -47,7 +47,7 @@ var LOCK_RENEW_TIME = 5000; // 5 seconds is the renew time.
 var CLIENT_CLOSE_TIMEOUT_MS = 5000;
 var POLLING_INTERVAL = 5000;

-var Queue = function Queue(name, redisPort, redisHost, redisOptions){
+var Queue = function Queue(name, redisPort, redisHost, redisOptions, queueOptions){
needs readme update?
Bull's documentation is currently outdated in many regards. I'm addressing this in #309 by documenting everything in jsdoc format: if documentation lives closer to the code, it's much more likely to stay up to date. There are many helpers that generate nice-looking documentation; I particularly liked this one: https://camo.githubusercontent.com/724b9224844b6b4f2cd19b3bce8d25015fa54cfa/687474703a2f2f7075752e73682f674f794e652f363663336164636239372e706e67
In any case, yes, this needs a readme update.
Thanks for the PR. I am still not convinced that adding this extra complexity is needed.
The processStalledJobs functionality is opinionated: it assumes that restarting a job is fine in every use case, which is certainly not what you want in several of them, especially those where the job handler has side effects that you don't want to accidentally repeat. The limit option is useful when you do not want to clog your application by processing stalled jobs: if there are hundreds of thousands of jobs to retry, this means that a single call to
@chuym OK, but if you disable the processing of stalled jobs, the active queue will grow over time and you will not get any notification or explanation of why these jobs have not completed.
If stalled jobs cannot be re-executed, I think it makes more sense to make them fail with the reason "job stalled".
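To make the suggestion above concrete, here is a minimal in-memory sketch of failing stalled jobs instead of re-running them. It is not Bull's implementation: the `failStalledJobs` function, the `lockExpiresAt` field, and the `failedReason` field are all hypothetical names chosen for illustration, and a job is assumed to be stalled once its lock has expired.

```javascript
// Hypothetical sketch, not Bull's code: fail stalled jobs rather than re-run them.
// `active` is an assumed array of { id, lockExpiresAt } records, `now` a timestamp in ms.
function failStalledJobs(active, now) {
  var stillActive = [];
  var failed = [];
  active.forEach(function (job) {
    if (job.lockExpiresAt <= now) {
      // Lock expired: treat the job as stalled and record why it failed.
      failed.push({ id: job.id, failedReason: 'job stalled' });
    } else {
      stillActive.push(job);
    }
  });
  return { active: stillActive, failed: failed };
}
```

With this shape the active list no longer grows silently, and each stalled job carries an explanation that a caller could surface.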
@@ -423,7 +430,10 @@ Queue.prototype.run = function(concurrency){
 var promises = [];
 var _this = this;

-return this.processStalledJobs().then(function(){
+// In case of connection loss, running `processStalledJobs` will repick jobs here.
+var start = this.opts.processStalledJobs ? this.processStalledJobs() : Promise.resolve();
Does this still need to run when the queue starts, instead of just letting it start after _this.LOCK_RENEW_TIME?
The expectation of Bull is that stalled jobs are processed immediately on reconnection. Not calling it at startup results in many test failures.
Good point, yet, we do not have a periodic An unbounded In fact, I think this can be further improved by both moving this |
Having an optimized getNextStalledJob should be better than an extra limit parameter that requires fine-tuning by the user. I can work on that.
My only reservation is the potential flakiness of I think
Item 1 above is not addressable: there isn't a safe way to tell if a handler effectively stalled. This is a limitation of JavaScript, and I don't think I understand how useful
1d471ec to e3051f8
Hey @manast, so what is your take on this? Should this be dropped in favor of improving and optimizing
I would like to discuss this feature a bit more so that we are in sync on what the best approach is. There are a few things I would like to decide:
|
By the way, moving stalled/stuck jobs back to the wait queue as part of the retry mechanism would be not only an elegant approach but also better performance-wise when you have a lot of elements in the active list in a large setup.
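As an illustration of the idea above, the following sketch moves stalled jobs from an active list back to a wait list in a single pass. It is an in-memory model only and makes no claim about how Bull or #359 does this: `requeueStalledJobs`, the `state` object, and the `lockExpiresAt` field are hypothetical names, and a real implementation would have to do the move atomically in Redis.

```javascript
// Hypothetical sketch, not Bull's code: move stalled jobs back to the wait list.
// `state` is an assumed object { wait: [ids], active: [{ id, lockExpiresAt }] }.
function requeueStalledJobs(state, now) {
  var remaining = [];
  var moved = 0;
  state.active.forEach(function (job) {
    if (job.lockExpiresAt <= now) {
      // Lock expired: hand the job back to the normal wait list so any
      // worker can pick it up through the regular processing path.
      state.wait.push(job.id);
      moved += 1;
    } else {
      remaining.push(job);
    }
  });
  state.active = remaining;
  return moved; // number of jobs put back on the wait list
}
```

Because requeued jobs re-enter the ordinary wait list, no separate pass that re-runs them is needed, which is where the performance benefit for a large active list would come from.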
Implemented in #359
This PR makes `Queue#processStalledJobs` more flexible by implementing the following 2 items:
1. `Queue` now takes a `queueOptions` hash that, currently, has only one recognized option: `processStalledJobs`. When set to `false`, the queue won't automatically process stalled jobs.
2. `Queue#processStalledJobs` itself accepts an optional parameter, `limit`, which will only process that many jobs at a time.
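The two items in the description can be sketched with a small in-memory stand-in. This is not Bull's `Queue`: `SketchQueue`, its `stalled` and `processed` arrays, and the numeric return values are assumptions made for illustration, and the real constructor also takes the Redis arguments shown in the diff.

```javascript
// Hypothetical in-memory stand-in for a queue; NOT Bull's implementation.
function SketchQueue(name, queueOptions) {
  this.name = name;
  // Assumed default: stalled jobs are processed unless the option is set to false.
  this.opts = Object.assign({ processStalledJobs: true }, queueOptions);
  this.stalled = [];   // jobs considered stalled (assumed representation)
  this.processed = []; // jobs re-processed by processStalledJobs
}

// Re-process at most `limit` stalled jobs; with no limit, process all of them.
SketchQueue.prototype.processStalledJobs = function (limit) {
  var count = typeof limit === 'number' ? limit : this.stalled.length;
  var batch = this.stalled.splice(0, count);
  Array.prototype.push.apply(this.processed, batch);
  return batch.length;
};

// Mirrors the guarded call in the diff: skip the initial pass when disabled.
SketchQueue.prototype.run = function () {
  return this.opts.processStalledJobs ? this.processStalledJobs() : 0;
};
```

A caller that disables the automatic pass could then drain stalled jobs in bounded batches on its own schedule, for example `queue.processStalledJobs(100)`.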