Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about locking internals #212

Closed
reczy opened this issue Jan 29, 2021 · 2 comments
Closed

Question about locking internals #212

reczy opened this issue Jan 29, 2021 · 2 comments

Comments

@reczy
Copy link
Contributor

reczy commented Jan 29, 2021

Hi @bensheldon,

While digging into the internals of Lockable, I came across your todo here:

def self.perform_with_advisory_lock
good_job = nil
result = nil
error = nil
unfinished.priority_ordered.only_scheduled.limit(1).with_advisory_lock do |good_jobs|
good_job = good_jobs.first
# TODO: Determine why some records are fetched without an advisory lock at all
break unless good_job&.executable?
result, error = good_job.perform
end
[good_job, result, error] if good_job
end

Can you provide a little more context behind that? Were you seeing actual good_job records trying to be performed without being locked? I'm trying to help out with this but I haven't been able to replicate. The advisory_lock scope should only return records that were able to acquire the lock (which is called immediately before invoking yield):

def with_advisory_lock
raise ArgumentError, "Must provide a block" unless block_given?
records = advisory_lock.to_a
begin
yield(records)
ensure
records.each(&:advisory_unlock)
end
end

Of course, the good_job variable is nil in GoodJob::Job.perform_with_advisory_lock whenever the advisory_lock scope returns 0 records (good_job = good_jobs.first # => nil but that doesn't seem to be what you meant in your todo.

As sort of a related question, does the advisory_lock scope ever return more than 1 record at a time in practice? The limit is hardcoded to 1 in GoodJob::Job.perform_with_advisory_lock, and I only see mentions of other limits in comments and tests.

@bensheldon
Copy link
Owner

Can you provide a little more context behind that? Were you seeing actual good_job records trying to be performed without being locked?

Yes, under heavy load I was seeing the Lockable's query that attempts to fetch and lock a record return records that had not been locked: #99 (comment)

I think this is caused by how Postgres materializes the query that select-and-locks, but that's been a bit deeper in Postgres than I'm familiar with:

it 'generates appropriate SQL' do
query = model_class.where(priority: 99).order(priority: :desc).limit(2).advisory_lock
expect(normalize_sql(query.to_sql)).to eq normalize_sql(<<~SQL.squish)
SELECT "good_jobs".*
FROM "good_jobs"
WHERE "good_jobs"."id" IN (
WITH "rows" AS #{'MATERIALIZED' if model_class.supports_cte_materialization_specifiers?} (
SELECT "good_jobs"."id"
FROM "good_jobs"
WHERE "good_jobs"."priority" = 99
ORDER BY "good_jobs"."priority" DESC
)
SELECT "rows"."id"
FROM "rows"
WHERE pg_try_advisory_lock(('x' || substr(md5('good_jobs' || "rows"."id"::text), 1, 16))::bit(64)::bigint)
LIMIT 2
)
ORDER BY "good_jobs"."priority" DESC
SQL
end

The temporary solution was to make an additional query to double-check that there is an advisory lock on the resulting record:

# Tests whether this job is safe to be executed by this thread.
# @return [Boolean]
def executable?
self.class.unscoped.unfinished.owns_advisory_locked.exists?(id: id)
end

Ideally the initial select-and-lock query would be reliable all of the time and make the 2nd double-checking query unnecessary. I don't think it has a big impact on performance to make 2 queries, but it would be nice to trust the first query.

As sort of a related question, does the advisory_lock scope ever return more than 1 record at a time in practice?

No. For the purpose of GoodJob it only ever select-and-locks 1 record at a time. The Lockable code was extracted from another project and I had the intention of subsequently extracting it from GoodJob into its own gem eventually.

@reczy
Copy link
Contributor Author

reczy commented Feb 1, 2021

Thanks @bensheldon - sorry I missed the previous discussion on this. It does indeed appear to be a rather complex issue underlying the missing locks.

In any case, I suppose this might be a moot issue if ActiveJob ends up with its own concurrency api as @morgoth referenced in #206 (comment) (thanks!).

Really looking forward to seeing the progression of that PR and happy to see that you're interested in supporting the new interface (assuming it does make it to rails) - thank you.

Please feel free to close this issue if you feel appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants