Skip to content

Commit

Permalink
fixes design flaw of job dispatching; this requires a change in datab…
Browse files Browse the repository at this point in the history
…ase migration
  • Loading branch information
vtt committed Mar 17, 2021
1 parent ea1afab commit 73a1d0e
Show file tree
Hide file tree
Showing 7 changed files with 60 additions and 29 deletions.
51 changes: 39 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,16 @@

Skiplock is a background job queuing system that improves the performance and reliability of the job executions while providing the same ACID guarantees as the rest of your data. It is designed for Active Jobs with Ruby on Rails using PostgreSQL database adapter, but it can be modified to work with other frameworks easily.

It only uses the **LISTEN/NOTIFY/SKIP LOCKED** features provided natively on PostgreSQL 9.2+ to efficiently and reliably dispatch jobs to worker processes and threads ensuring that a job can be completed successfully only once. No other polling or timer is needed.
It only uses the `LISTEN/NOTIFY/SKIP LOCKED` features provided natively on PostgreSQL 9.5+ to efficiently and reliably dispatch jobs to worker processes and threads ensuring that each job can be completed successfully **only once**. No other polling or timer is needed.

The library is quite small compared to other PostgreSQL job queues (eg. *delay_job*, *queue_classic*, *que*, *good_job*) with less than 400 lines of codes; and it still provides similar set of features and more...

#### Compatibility:

- MRI Ruby 2.5+
- PostgreSQL 9.5+
- Rails 5.2+

## Installation

1. Add `skiplock` to your application's Gemfile:
Expand Down Expand Up @@ -52,12 +58,12 @@ The library is quite small compared to other PostgreSQL job queues (eg. *delay_j
workers: 0
```
Available configuration options are:
- **logging** (*enumeration*): sets the logging capability to **true** or **false**; setting to **timestamp** will enable logging with timestamps
- **logging** (*enumeration*): sets the logging capability to **true** or **false**; setting to **timestamp** will enable logging with timestamps. The log files are: log/skiplock.log and log/skiplock.error.log
- **min_threads** (*integer*): sets minimum number of threads staying idle
- **max_threads** (*integer*): sets the maximum number of threads allowed to run jobs
- **max_retries** (*integer*): sets the maximum attempt a job will be retrying before it is marked expired (see Retry System for more details)
- **purge_completion** (*boolean*): when set to **true** will delete jobs after they were completed successfully; if set to **false** then the completed jobs should be purged periodically to maximize performance (eg. clean up old jobs after 3 months)
- **workers** (*integer*) sets the maximum number of processes when running in standalone mode using the **skiplock** executable; setting this to 0 will enable async mode
- **workers** (*integer*) sets the maximum number of processes when running in standalone mode using the `skiplock` executable; setting this to 0 will enable **async mode**

#### Async mode
When **workers** is set to **0** then the jobs will be performed in the web server process using separate threads. If using multi-worker cluster web server like Puma, then it should be configured as below:
Expand Down Expand Up @@ -89,36 +95,57 @@ The library is quite small compared to other PostgreSQL job queues (eg. *delay_j
```ruby
MyJob.set(wait: 5.minutes, priority: 10).perform_later(1,2,3)
```
- Outside of Rails application, queue the jobs by inserting the Job records directly to the database table eg:
- Outside of Rails application, queue the jobs by inserting the job records directly to the database table eg:
```sql
INSERT INTO skiplock.jobs(job_class) VALUES ('MyJob');
```
- Or with scheduling, priority and arguments:
```sql
INSERT INTO skiplock.jobs(job_class, priority, scheduled_at, data) VALUES ('MyJob', 10, NOW() + INTERVAL '5 min', '{"arguments":[1,2,3]}');
INSERT INTO skiplock.jobs(job_class,priority,scheduled_at,data) VALUES ('MyJob',10,NOW()+INTERVAL '5 min','{"arguments":[1,2,3]}');
```
## Cron system
Skiplock supports cron jobs for running tasks periodically. It fully supports the cron syntax to specify the frequency of the jobs. To setup a job with cron capability, simply assign a valid cron expression to constant CRON for the Job Class.
- setup MyJob to run as cron job every hour at 30 minutes past
Skiplock provides the capability to setup cron jobs for running tasks periodically. It fully supports the cron syntax to specify the frequency of the jobs. To setup a cron job, simply assign a valid cron schedule to the constant `CRON` for the Job Class.
- setup `MyJob` to run as cron job every hour at 30 minutes past

```ruby
class MyJob < ActiveJob::Base
CRON = "30 * * * *"
# ...
end
```
- setup MyJob to run at midnight every Wednesdays
- setup `CleanupJob` to run at midnight every Wednesdays
```ruby
class CleanupJob < ApplicationJob
CRON = "0 0 * * 3"
# ...
end
```
- to remove the cron schedule from the job, simply comment out the constant definition or delete the line then re-deploy the application; at startup, the cron jobs that were undefined will be removed automatically
- to remove the cron schedule from the job, simply comment out the constant definition or delete the line then re-deploy the application. At startup, the cron jobs that were undefined will be removed automatically

## Retry system
...
## Notification system
## Retry system
Skiplock fully supports ActiveJob built-in retry system. It also has its own retry system for fallback. To use ActiveJob retry system, define the rescue blocks per ActiveJob's documentation.
- configures `MyJob` to retry at maximum 20 attempts on StandardError with fixed delay of 5 seconds
```ruby
class MyJob < ActiveJob::Base
retry_on StandardError, wait: 5, attempts: 20
# ...
end
```
- configures `MyJob` to retry at maximum 10 attempts on StandardError with exponential delay
```ruby
class MyJob < ActiveJob::Base
retry_on StandardError, wait: :exponentially_longer, attempts: 10
# ...
end
```
Once the retry attempt limit configured in ActiveJob has been reached, the control will be passed back to `skiplock` to be marked as an expired job.
If the rescue blocks are not defined, then the built-in retry system of `skiplock` will kick in automatically. The retrying schedule is using an exponential formula (5 + 2**attempt). The `skiplock` configuration `max_retries` determines the the limit of attempts before the failing job is marked as expired. The maximum retry limit can be set as high as 20; this allows up to 12 days of retrying before the job is marked as expired.
## Notification system
...
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/vtt/skiplock.
Expand Down
5 changes: 3 additions & 2 deletions lib/generators/skiplock/templates/migration.rb.erb
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ class CreateSkiplockSchema < ActiveRecord::Migration<%= "[#{ActiveRecord::VERSIO
t.integer :executions
t.jsonb :exception_executions
t.jsonb :data
t.boolean :running, null: false, default: false
t.timestamp :expired_at
t.timestamp :finished_at
t.timestamp :scheduled_at
Expand All @@ -26,8 +27,8 @@ class CreateSkiplockSchema < ActiveRecord::Migration<%= "[#{ActiveRecord::VERSIO
$$ LANGUAGE plpgsql
)
execute "CREATE TRIGGER notify_job AFTER INSERT OR UPDATE ON skiplock.jobs FOR EACH ROW EXECUTE PROCEDURE skiplock.notify()"
execute "CREATE INDEX jobs_index ON skiplock.jobs(scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC) WHERE expired_at IS NULL AND finished_at IS NULL"
execute "CREATE INDEX jobs_retry_index ON skiplock.jobs(scheduled_at) WHERE executions IS NOT NULL AND expired_at IS NULL AND finished_at IS NULL"
execute "CREATE INDEX jobs_index ON skiplock.jobs(scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC) WHERE running = 'f' AND expired_at IS NULL AND finished_at IS NULL"
execute "CREATE INDEX jobs_retry_index ON skiplock.jobs(scheduled_at) WHERE running = 'f' AND executions IS NOT NULL AND expired_at IS NULL AND finished_at IS NULL"
execute "CREATE INDEX jobs_cron_index ON skiplock.jobs(scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC) WHERE cron IS NOT NULL AND finished_at IS NULL"
execute "CREATE UNIQUE INDEX jobs_unique_cron_index ON skiplock.jobs (job_class) WHERE cron IS NOT NULL"
end
Expand Down
1 change: 1 addition & 0 deletions lib/skiplock/cron.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ def self.setup
time = self.next_schedule_at(cron)
if time
job.cron = cron
job.running = false
job.scheduled_at = Time.at(time)
job.save!
cronjobs << j.name
Expand Down
4 changes: 1 addition & 3 deletions lib/skiplock/dispatcher.rb
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,12 @@ def initialize(master: true)
@executor = Concurrent::ThreadPoolExecutor.new(min_threads: Settings['min_threads'], max_threads: Settings['max_threads'], max_queue: Settings['max_threads'], idletime: 60, auto_terminate: true, fallback_policy: :discard)
@master = master
@next_schedule_at = Time.now.to_f
@running = false
@running = true
end

def run
Thread.new do
Rails.application.reloader.wrap do
@running = true
sleep(0.1) while @running && !Rails.application.initialized?
ActiveRecord::Base.connection_pool.with_connection do |connection|
connection.exec_query('LISTEN skiplock')
Expand Down Expand Up @@ -85,7 +84,6 @@ def do_work
break
end
rescue Exception => e
puts e.inspect
# TODO: Report exception
ensure
ActiveRecord::Base.connection_pool.checkin(connection)
Expand Down
21 changes: 12 additions & 9 deletions lib/skiplock/job.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,11 @@ class Job < ActiveRecord::Base
# Return: Attributes hash of the Job if it was executed; otherwise returns the next Job's schedule time in FLOAT
def self.dispatch(connection: ActiveRecord::Base.connection)
connection.exec_query('BEGIN')
job = connection.exec_query("SELECT * FROM #{self.table_name} WHERE expired_at IS NULL AND finished_at IS NULL ORDER BY scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC FOR UPDATE SKIP LOCKED LIMIT 1").first
if job && job['scheduled_at'].to_f <= Time.now.to_f
job = connection.exec_query("SELECT * FROM #{self.table_name} WHERE running = 'f' AND expired_at IS NULL AND finished_at IS NULL ORDER BY scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC FOR UPDATE SKIP LOCKED LIMIT 1").first
if job && job['scheduled_at'].to_f <= Time.now.to_f # found job ready to perform
# update the job to mark it in progress in case database server goes down during job execution
connection.exec_query("UPDATE #{self.table_name} SET running = 't' WHERE id = '#{job['id']}'")
connection.exec_query('END') # close the transaction commit the state of job in progress
executions = (job['executions'] || 0) + 1
exceptions = job['exception_executions'] ? JSON.parse(job['exception_executions']) : {}
data = job['data'] ? JSON.parse(job['data']) : {}
Expand All @@ -25,31 +28,31 @@ def self.dispatch(connection: ActiveRecord::Base.connection)
# TODO: report exception
exceptions["[#{ex.class.name}]"] = (exceptions["[#{ex.class.name}]"] || 0) + 1 unless exceptions.key?('activejob_retry')
if executions >= Settings['max_retries'] || exceptions.key?('activejob_retry')
connection.exec_query("UPDATE #{self.table_name} SET executions = #{executions}, exception_executions = '#{connection.quote_string(exceptions.to_json.to_s)}', expired_at = NOW(), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
connection.exec_query("UPDATE #{self.table_name} SET running = 'f', executions = #{executions}, exception_executions = '#{connection.quote_string(exceptions.to_json.to_s)}', expired_at = NOW(), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
else
timestamp = Time.now + (5 * 2**executions)
connection.exec_query("UPDATE #{self.table_name} SET executions = #{executions}, exception_executions = '#{connection.quote_string(exceptions.to_json.to_s)}', scheduled_at = TO_TIMESTAMP(#{timestamp.to_f}), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
connection.exec_query("UPDATE #{self.table_name} SET running = 'f', executions = #{executions}, exception_executions = '#{connection.quote_string(exceptions.to_json.to_s)}', scheduled_at = TO_TIMESTAMP(#{timestamp.to_f}), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
end
elsif exceptions.key?('activejob_retry')
connection.exec_query("UPDATE #{self.table_name} SET executions = #{job_data['executions']}, exception_executions = '#{connection.quote_string(job_data['exception_executions'].to_json.to_s)}', scheduled_at = TO_TIMESTAMP(#{job_data['scheduled_at'].to_f}), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
connection.exec_query("UPDATE #{self.table_name} SET running = 'f', executions = #{job_data['executions']}, exception_executions = '#{connection.quote_string(job_data['exception_executions'].to_json.to_s)}', scheduled_at = TO_TIMESTAMP(#{job_data['scheduled_at'].to_f}), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
elsif job['cron']
data['last_cron_run'] = Time.now.utc.to_s
next_cron_at = Cron.next_schedule_at(job['cron'])
if next_cron_at
connection.exec_query("UPDATE #{self.table_name} SET scheduled_at = TO_TIMESTAMP(#{next_cron_at}), executions = 1, exception_executions = NULL, data = '#{connection.quote_string(data.to_json.to_s)}', updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
connection.exec_query("UPDATE #{self.table_name} SET running = 'f', scheduled_at = TO_TIMESTAMP(#{next_cron_at}), executions = 1, exception_executions = NULL, data = '#{connection.quote_string(data.to_json.to_s)}', updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
else
connection.exec_query("DELETE FROM #{self.table_name} WHERE id = '#{job['id']}' RETURNING *").first
end
elsif Settings['purge_completion']
connection.exec_query("DELETE FROM #{self.table_name} WHERE id = '#{job['id']}' RETURNING *").first
else
connection.exec_query("UPDATE #{self.table_name} SET executions = #{executions}, exception_executions = NULL, finished_at = NOW(), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
connection.exec_query("UPDATE #{self.table_name} SET running = 'f', executions = #{executions}, exception_executions = NULL, finished_at = NOW(), updated_at = NOW() WHERE id = '#{job['id']}' RETURNING *").first
end
else
connection.exec_query('END')
job ? job['scheduled_at'].to_f : Float::INFINITY
end
ensure
connection.exec_query('END')
Thread.current[:skiplock_dispatch_data] = nil
end

Expand All @@ -59,11 +62,11 @@ def self.enqueue_at(job, timestamp)
Thread.current[:skiplock_dispatch_data]['executions'] = job.executions
Thread.current[:skiplock_dispatch_data]['exception_executions'] = job.exception_executions
Thread.current[:skiplock_dispatch_data]['scheduled_at'] = Time.at(timestamp)
self.new(Thread.current[:skiplock_dispatch_data].slice(*self.column_names).merge(id: job.job_id))
else
timestamp = Time.at(timestamp) if timestamp
Job.create!(id: job.job_id, job_class: job.class.name, queue_name: job.queue_name, locale: job.locale, timezone: job.timezone, priority: job.priority, executions: job.executions, data: { 'arguments' => job.arguments }, scheduled_at: timestamp)
end
false
end
end
end
5 changes: 3 additions & 2 deletions lib/skiplock/manager.rb
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,6 @@ def self.standalone
$stdout = Demux.new(logfile, STDOUT, timestamp: log_timestamp)
errfile = File.open('log/skiplock.error.log', 'a')
errfile.sync = true
Rails.logger.level = 0
$stderr = Demux.new(errfile, STDERR, timestamp: log_timestamp)
logger = ActiveSupport::Logger.new($stdout)
logger.level = Rails.logger.level
Expand All @@ -66,8 +65,9 @@ def self.standalone
shutdown = false
Signal.trap("INT") { shutdown = true }
Signal.trap("TERM") { shutdown = true }
Settings['workers'].times do
Settings['workers'].times do |w|
fork do
Process.setproctitle("skiplock-worker[#{w+1}]")
dispatcher = Dispatcher.new(master: false)
thread = dispatcher.run
loop do
Expand All @@ -80,6 +80,7 @@ def self.standalone
end
end
sleep 0.1
Process.setproctitle("skiplock-master")
dispatcher = Dispatcher.new
thread = dispatcher.run
loop do
Expand Down
2 changes: 1 addition & 1 deletion lib/skiplock/version.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
module Skiplock
VERSION = Version = '1.0.2'
VERSION = Version = '1.0.3'
end

0 comments on commit 73a1d0e

Please sign in to comment.