Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds the ability to delete jobs on the dashboard; add cleanup_discarded_jobs option to retain discarded jobs during cleanup #597

Merged
merged 6 commits into from
May 18, 2022

Conversation

TAGraves
Copy link
Contributor

Right now the only way to delete a job is to use the cleanup command. We have workflows that involve deleting jobs on an ad-hoc basis. For example, if a job fails for a reason that means that it can never be successfully retried, we want to delete that job so that it does not clutter up our discarded jobs page. The first commit in this PR adds the ability to delete any job in the dashboard (by calling destroy on the job).

Additionally, we do not want discarded jobs to ever be deleted automatically. In some cases we may have a failed job from a week ago that we wish to preserve so that we can easily retry it after we release a fix for the issue. In other words, we always want to manually handle discarded jobs, either by retrying them or by deleting them -- we never want them cleaned up automatically. So the second commit adds a configuration option to only delete finished but undiscarded jobs via the cleanup command. This option is set to delete discarded jobs by default for now, so that this does not become a breaking change, but I would recommend flipping this default with the 3.0 release.

I believe this takes care of #594.

Copy link
Owner

@bensheldon bensheldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TAGraves This is fantastic! Thank you for adding this feature and extending the dashboard 🙌

Two things:

  • ActiveJobJob#destroy needs to be re-defined on the ActiveJobJob model because the native Rails #destroy method won't destroy the entire set of executions that make up the job (it will only destroy the HEAD execution, which then will make the HEAD~1 execution become the new HEAD of the Job). I've been kicking that problem down the road myself. There should be a #destroy_job method that locks the job and then destroys all of the executions that make up the job.
  • I think the naming should be consistent with Rails conventions and it should be "destroy job" rather than "delete".

I really appreciate the contribution (this is like 95% there) and I can help make those other changes to push it over the finish line.

@TAGraves
Copy link
Contributor Author

@bensheldon Regarding the first point, #destroy actually does seem to destroy all of the executions. You can add this spec to active_job_job_spec to prove it:

  describe '#destroy' do
    it 'destroys all executions for the job' do
      job.destroy

      expect { head_execution.reload }.to raise_error(ActiveRecord::RecordNotFound)
      expect { tail_execution.reload }.to raise_error(ActiveRecord::RecordNotFound)
    end
  end

Adding some logging, this is the SQL that is execution when running destroy:

D, [2022-05-13T11:53:18.660249 #23292] DEBUG -- :   GoodJob::ActiveJobJob Destroy (0.5ms)  DELETE FROM "good_jobs" WHERE "good_jobs"."active_job_id" = $1  [["active_job_id", "36508487-93f3-49ac-b19d-7b299bc18c53"]]

I think this is because the primary key is defined as active_job_id.

I think it's still fine to manually define it though to keep it consistent with the other operations in active_job_job.rb.

I'm more than happy to make the changes you mentioned to push it over the finish line!

@bensheldon
Copy link
Owner

@TAGraves Ooh, that's nice that Rails handles destroy with the explicit primary_key. TIL.

I do think there should be an explicit #destroy_job that advisory locks the job then calls #destroy.

Thought: are there limitations on what job states should be destroyable? Eg we shouldn't destroy a running job. Thinking aloud: should a job have to be discarded before it is destroyable?

@TAGraves
Copy link
Contributor Author

I think it's pretty reasonable to make only discarded jobs destroyable. I'll make these changes.

@TAGraves
Copy link
Contributor Author

@bensheldon I've added the #destroy_job method, renamed delete to destroy in a bunch of places (including ones that already existed), and limited destroy_job just to finished and discarded jobs.

@bensheldon bensheldon changed the title Adds the ability to delete jobs on the dashboard Adds the ability to delete jobs on the dashboard; add cleanup_discarded_jobs option to retain discarded jobs during cleanup May 17, 2022
@bensheldon bensheldon changed the title Adds the ability to delete jobs on the dashboard; add cleanup_discarded_jobs option to retain discarded jobs during cleanup Adds the ability to delete jobs on the dashboard; add cleanup_discarded_jobs option to retain discarded jobs during cleanup May 17, 2022
@bensheldon
Copy link
Owner

@TAGraves thank you! This looks great 🎉

In terms of release orchestration, I'm going to merge in #575 first, then I will fix conflicts and get this PR merged.

I also intend to remove the ability to delete an individual execution. Now that whole jobs can be deleted, I don't think it makes sense to allow removal individual executions. I'll do that as a separate PR just to expedite getting this one released.

@rgaufman
Copy link

Quick question, where can I find documentation or an example for destroy_job and finding scheduled jobs? - what I'm trying to do is to find a specific scheduled job and cancel it (from the command line).

@bensheldon
Copy link
Owner

bensheldon commented May 30, 2022

@rgaufman GoodJob::ActiveJobJob is a queryable ActiveRecord model:

jobs = GoodJob::ActiveJobJob.where(scheduled_at: 30.minutes.ago...Time.current)
jobs.each(&:destroy_job)

Edit: you can find existing scopes here:

# Get Jobs with given class name
# @!method job_class
# @!scope class
# @param string [String] Execution class name
# @return [ActiveRecord::Relation]
scope :job_class, ->(job_class) { where("serialized_params->>'job_class' = ?", job_class) }
# Get Jobs finished before the given timestamp.
# @!method finished_before(timestamp)
# @!scope class
# @param timestamp (DateTime, Time)
# @return [ActiveRecord::Relation]
scope :finished_before, ->(timestamp) { where(arel_table['finished_at'].lteq(timestamp)) }
# First execution will run in the future
scope :scheduled, -> { where(finished_at: nil).where('COALESCE(scheduled_at, created_at) > ?', DateTime.current).where("(serialized_params->>'executions')::integer < 2") }
# Execution errored, will run in the future
scope :retried, -> { where(finished_at: nil).where('COALESCE(scheduled_at, created_at) > ?', DateTime.current).where("(serialized_params->>'executions')::integer > 1") }
# Immediate/Scheduled time to run has passed, waiting for an available thread run
scope :queued, -> { where(finished_at: nil).where('COALESCE(scheduled_at, created_at) <= ?', DateTime.current).joins_advisory_locks.where(pg_locks: { locktype: nil }) }
# Advisory locked and executing
scope :running, -> { where(finished_at: nil).joins_advisory_locks.where.not(pg_locks: { locktype: nil }) }
# Completed executing successfully
scope :finished, -> { not_discarded.where.not(finished_at: nil) }
# Errored but will not be retried
scope :discarded, -> { where.not(finished_at: nil).where.not(error: nil) }
# Not errored
scope :not_discarded, -> { where(error: nil) }

@TAGraves TAGraves deleted the tg-add-delete branch January 30, 2023 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants