Updating good_job breaks my Rails 7 alpha 2 local development #462

Closed
leehericks opened this issue Nov 30, 2021 · 24 comments

@leehericks

bundle outdated
good_job 2.7.0 2.7.2 ~> 2.4, >= 2.4.2 default

After running bundle update, good_job output gets mixed in with the Puma initialization messages, and then I can't get rails server to serve pages in local development. It just hangs.

On the other hand, at good_job 2.7.0 I'm also getting
[GoodJob] Notifier unsubscribed with UNLISTEN
[GoodJob] Notifier errored: no connection to the server
but I can access the site in the browser...

@leehericks
Author

This is the hanging on 2.7.2

Screen Shot 2021-11-30 at 11 42 49

@bensheldon
Owner

@leehericks thanks for reporting this. Sounds like a deadlock. I'll try to reproduce.

Could you share how and where you're setting up :good_job as the queue_adapter? Thank you!

@leehericks
Author

Screen Shot 2021-11-30 at 12 15 51

Screen Shot 2021-11-30 at 12 16 02

@bensheldon
Owner

@leehericks nuts, I'm having trouble reproducing it. Here are instructions for how you can try to diagnose a deadlock yourself: #236 (comment)

Also, if you wanted to find some time on my calendar, I'd love to Zoom on this: https://calendly.com/bensheldon/out-of-office-hours

@leehericks
Author

@bensheldon Despite running ENV=RBTRACE bin/rails server --binding=0.0.0.0 I do not get output with the commands to run...

@leehericks
Author

*** attached to process 6908

puts output = ActionDispatch::DebugLocks.new(nil).send(:render_details, nil); output
=> [200, {"Content-Type"=>"text/plain", "Content-Length"=>0}, [""]]
*** detached from process 6908

@leehericks
Author

Screen Shot 2021-11-30 at 15 27 33

@leehericks
Author

Does any of this apply to your project?

https://weblog.rubyonrails.org/2021/9/3/autoloading-in-rails-7-get-ready/

@bensheldon
Owner

@leehericks probably not? GoodJob has used Zeitwerk internally for more than a year. Rails Alpha 2 also has been in the test matrix for a few months.

Unfortunately these kinds of issues are tricky to trace down. A few more questions:

  • Are you able to ctrl-c to exit, or is the terminal hanging?
  • Does it hang if you change your WEB_CONCURRENCY to zero, i.e. running Puma in single process mode?
  • Could you try GoodJob v2.7.1 and tell me if you experience the same behavior? That would help me narrow down some of the changes.

Thank you again for the help here 🙏
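
For context on the second question: Puma runs in single mode when its worker count is 0 and in cluster mode (forking worker processes) when the count is greater than 0. Below is a minimal sketch of how that decision follows from WEB_CONCURRENCY. The helper name `puma_mode` and the default of "0" are assumptions for illustration only and are not part of Puma's API; a real config/puma.rb would simply pass the integer to `workers`.

```ruby
# Hypothetical helper: classify how Puma would run for a given environment.
# Puma itself only sees the integer passed to `workers`; 0 means single mode.
def puma_mode(env = ENV)
  worker_count = Integer(env.fetch("WEB_CONCURRENCY", "0"))
  worker_count.zero? ? :single : :cluster
end

puts puma_mode({ "WEB_CONCURRENCY" => "0" }) # single mode, nothing is forked
puts puma_mode({ "WEB_CONCURRENCY" => "2" }) # cluster mode, two forked workers
```

If the hang only appears in cluster mode, that points at something that does not survive the fork, rather than at request handling itself.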

@leehericks
Author

@bensheldon

  1. Ctrl-C hangs on ^C[11022] - Gracefully shutting down workers... but does finally finish with [11022] === puma shutdown: 2021-12-01 10:21:31 +0900 === [11022] - Goodbye! Exiting [GoodJob] Notifier unsubscribed with UNLISTEN

  2. Setting Puma web concurrency to 1 booted up but errored on the first web request:

Screen Shot 2021-12-01 at 10 04 28

  3. Setting Puma web concurrency to 0 booted up, and the first web request took a second but went through; after that everything worked speedily.

Screen Shot 2021-12-01 at 10 07 34

  4. You don't have a version 2.7.1 in Ruby Gems

@bensheldon
Owner

Just a quick reply re: 4. There should be a v2.7.1 in Ruby Gems:

https://rubygems.org/gems/good_job/versions/2.7.1

That's weird if it's not fetchable.

@bensheldon
Owner

This is progress that web concurrency affects the behavior. Thank you for debugging this with me; I appreciate it!

Could you try adding something like this to your puma config? It explicitly shuts down and restarts GoodJob's thread pools when Puma forks worker processes:

https://github.com/bensheldon/good_job/blob/fe8b2c32f4f7b7883832e9d3a12966ca9450821a/spec/test_app/config/puma.rb#L37-52
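
As background on why such hooks can matter: on MRI Ruby, only the thread that calls fork survives into the child process, so background threads started before a fork are dead in each forked worker. The sketch below demonstrates this with the standard library only. It does not involve GoodJob or Puma and is not the linked config; `thread_alive_across_fork` is an illustrative name, and `Process.fork` is not available on Windows or JRuby.

```ruby
# Starts a background thread, forks, and reports whether that thread is
# alive in the parent and in the child. Returns [parent_alive, child_alive].
def thread_alive_across_fork
  worker = Thread.new { sleep } # stands in for a long-lived background thread

  reader, writer = IO.pipe
  pid = Process.fork do
    reader.close
    # In the child, only the forking thread is still running.
    writer.write(worker.alive? ? "alive" : "dead")
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  child_report = reader.read
  reader.close
  Process.wait(pid)

  parent_alive = worker.alive?
  worker.kill
  [parent_alive, child_report == "alive"]
end

parent_alive, child_alive = thread_alive_across_fork
puts "alive in parent: #{parent_alive}, alive in child: #{child_alive}"
```

The thread should be reported as alive in the parent and dead in the child, which is the situation that shutting thread pools down before the fork and restarting them in each worker is meant to avoid.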

@leehericks
Author

> Just a quick reply re: 4. There should be a v2.7.1 in Ruby Gems:
>
> https://rubygems.org/gems/good_job/versions/2.7.1
>
> That's weird if it's not fetchable.

Ok, now it worked with a bundle install. Before, it actually showed all versions other than 2.7.1, saying it wasn't available. 🤷🏼‍♂️

@leehericks
Author

2.7.1 lets the HTTP requests run, but good_job must not be loading, because trying to access the job dashboard yields

Showing /Users/leehericks/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/good_job-2.7.1/engine/app/views/good_job/executions/_table.erb where line #31 raised:

No route matches {:action=>"show", :controller=>"good_job/jobs", :id=>nil}, possible unmatched constraints: [:id]

@bensheldon
Owner

Oh! I don't think this is the root cause, but one thing at a time: it looks like you have good_jobs records with null active_job_id columns. That became a required attribute as part of the 0.x -> 1.x upgrade process. Run this in your Rails console to update those records:

GoodJob::Execution.where(active_job_id: nil).update_all("active_job_id = (serialized_params->>'job_id')::uuid")
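
To make the effect of that update concrete, here is a plain-Ruby sketch of the same backfill on in-memory data. The `Row` struct, the `backfill_active_job_ids` helper, and the sample job class name are hypothetical stand-ins for illustration; the real statement runs as SQL inside PostgreSQL, where `->>` reads a key from the JSON column as text.

```ruby
require "json"
require "securerandom"

# Hypothetical in-memory stand-in for a row of the good_jobs table.
Row = Struct.new(:active_job_id, :serialized_params, keyword_init: true)

# Mirrors the SQL: for rows whose active_job_id is NULL, copy the "job_id"
# key out of the serialized_params JSON document. Other rows are untouched.
def backfill_active_job_ids(rows)
  rows.each do |row|
    next unless row.active_job_id.nil?

    row.active_job_id = JSON.parse(row.serialized_params)["job_id"]
  end
end

job_id = SecureRandom.uuid
rows = [
  Row.new(active_job_id: nil,
          serialized_params: JSON.generate({ "job_id" => job_id, "job_class" => "ExampleJob" })),
  Row.new(active_job_id: "already-set",
          serialized_params: JSON.generate({ "job_id" => SecureRandom.uuid }))
]

backfill_active_job_ids(rows)
puts rows.map(&:active_job_id).inspect
```

Only the first row changes: it receives the job_id stored in its serialized parameters, while the row that already had a value keeps it.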

@leehericks
Author

> Oh! I don't think this is the root cause, but one thing at a time: it looks like you have good_jobs records with null active_job_id columns. That became a required attribute as part of the 0.x -> 1.x upgrade process. Run this in your Rails console to update those records:
>
> GoodJob::Execution.where(active_job_id: nil).update_all("active_job_id = (serialized_params->>'job_id')::uuid")

That's interesting. I've got the migrations in my db folder...
Ran that, still have issues with "no connection to the server".

I'm going to set up an example app from scratch and see if I can reproduce this.

@leehericks
Author

@bensheldon Ok, it completely comes down to the puma.rb web concurrency.

Screen Shot 2021-12-01 at 14 23 26

If both of these lines are commented out then the deadlock will not occur.

In Rails 6.1.4.1, it works with web concurrency while still getting this after the first request:

[GoodJob] Notifier unsubscribed with UNLISTEN
[GoodJob] Notifier errored: no connection to the server

[GoodJob] Notifier subscribed with LISTEN

As a note, I upgraded good_job from 2.4.2 to 2.7.3. Both versions work with puma web concurrency enabled in development on Rails 6.1.4.1.

So this problem shows up specifically in Rails 7.0.0-alpha2, unless that "Notifier errored" message is also abnormal.

@bensheldon
Owner

bensheldon commented Dec 1, 2021

The "notifier errored" message is not normal.

You could add GoodJob.on_thread_error = ->(e) { puts e } [fixed] to your initializer to see if it outputs a better error message.
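
If the bare message is not enough, a handler that also prints the exception class and the top of the backtrace may help narrow things down. This is a sketch only: the constant name, the "[thread error]" prefix, and the five-frame limit are arbitrary choices for illustration, and the assignment to `GoodJob.on_thread_error` is commented out so the snippet runs without the gem.

```ruby
# Hypothetical, more verbose handler: reports class, message, and the first
# few backtrace frames instead of only the message. Returns the report text.
VERBOSE_THREAD_ERROR_HANDLER = lambda do |error|
  lines = ["[thread error] #{error.class}: #{error.message}"]
  lines.concat(Array(error.backtrace).first(5).map { |frame| "  #{frame}" })
  report = lines.join("\n")
  warn report # writes to stderr, which shows up in the server log
  report
end

# In a Rails initializer (requires the good_job gem, so commented out here):
# GoodJob.on_thread_error = VERBOSE_THREAD_ERROR_HANDLER

# Standalone demonstration with a raised error so a backtrace is present.
begin
  raise IOError, "no connection to the server"
rescue IOError => e
  VERBOSE_THREAD_ERROR_HANDLER.call(e)
end
```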

I'll pick this up tomorrow (Pacific time).

@leehericks
Author

leehericks commented Dec 1, 2021

I couldn't get any output after adding that to my initializer.
On a good note, for single mode the "Notifier errored" message has disappeared miraculously.

Thank you for your support!

@bensheldon
Owner

oops that may be because there is a typo in the code: put -> puts.

@bensheldon
Owner

@leehericks I just set up a new Rails 7.0.0.alpha2 with GoodJob and still can't duplicate the behavior you're seeing 😭 I added you as a collaborator on the repo if you want to pull it down and see if it gives you a different experience, or add your configuration and push it back up for me to try:

https://github.com/bensheldon/good_job_rails7/commits/main

Screen Shot 2021-12-01 at 7 41 55 AM

@leehericks
Author

> oops that may be because there is a typo in the code: put -> puts.

No, I had found that one. ☺️

@leehericks
Author

After comparing and disabling gem by gem I think that good_job is not the cause of the hang. 😅
Seems like I need to check in with shioyama/mobility

@leehericks
Author

@bensheldon I'm sorry to use so much of your time pinpointing the culprit.
