Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make health probe server more general purpose #1079

Merged
merged 30 commits into from
Jan 23, 2024

Conversation

jklina
Copy link
Contributor

@jklina jklina commented Sep 17, 2023

Howdy and thanks for the great library! 👋 I was looking into a way of adding an endpoint for monitoring metrics on job servers and saw that there was already an existing server used for health checks that uses the Rack interface. I did a quick spike that makes the health check server more general purpose and allows users to configure their own Rack apps for running on instances. Just wanted to get some input before pursuing it further and cleaning things up.


The health check and catchall logic are moved into simple Rack middleware that can be composed by users however they like and be used to preserve existing health check behavior while transitioning to a more general purpose utility server.

Additionally, this allows users to use WEBrick for the probe server in order to have a fully Rack compliant option.

All and all this pattern will allow users to add whatever functionality they like to GoodJob's web server by composing Rack apps and using GoodJob's configuration to pass in users' Rack apps. IE:

config.good_job.middleware = Rack::Builder.app do
  use GoodJob::Middleware::MyCustomMiddleware
  use GoodJob::Middleware::PrometheusExporter
  use GoodJob::Middleware::Healthcheck
  run GoodJob::Middleware::CatchAll
end
config.good_job.middleware_port = 7001

This could help resolve:

This removes the health check logic from the ProbeServer and renames the
ProbeServer to UtilityServer that accepts any Rack based app.

The health check and catchall logic are moved into simple Rack middleware
that can be composed by users however they like and be used to preserve
existing health check behavior while transitioning to a more general
purpose utility server.

All and all this pattern will allow users to add whatever functionality
they like to GoodJob's web server by composing Rack apps and using
GoodJob's configuration to pass in users' Rack apps. IE:

```
config.good_job.middleware = Rack::Builder.app do
  use GoodJob::Middleware::MyCustomMiddleware
  use GoodJob::Middleware::PrometheusExporter
  use GoodJob::Middleware::Healthcheck
  run GoodJob::Middleware::CatchAll
end
config.good_job.middleware_port = 7001
```

This could help resolve:

* bensheldon#750
* bensheldon#532
Copy link
Owner

@bensheldon bensheldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings:

  • I'm worried that folks will want more and more out of the probe server (it's a single thread / blocking) and that's not where I want to spend me development time. The extreme reaction (which I realize you're not proposing but is my fear) is that if someone needs a webserver, they should be running GoodJob in Puma, not recreating Puma in GoodJob.
  • I'm not sure how defensive to be if folks are running arbitrary Rack Apps. Like should GoodJob be including ActionDispatch::Executor middleware in case someone adds customize middleware that touches Rails autoloaded objects or Active Record even though GoodJob doesn't need it itself. Because otherwise there might be deadlocks or leaked database connections which I imagine will be a support burden.
  • I recognize that being able to add prometheus metrics or other custom healthchecks would be nice without having to spin up yet-another-process (though maybe if you're already running kube and prometheus that shouldn't be a dealbreaker). But I get it, that would be helpful for advanced systems.

So maybe my objections here could be overcome fully by defensive naming:

  • Leaving the name of it as ProbeServer implying the the purpose is for probing the process (I think "Utility" is too vague).
  • "Middleware" is a bit overloaded in the background job space, as it's also used for describing job customization (though not in GoodJob). We should keep it namespaced as ProbeServer::Middleware::....,
  • Let's call the configuration for the Rack app: config.good_job.probe_server_app =

Otherwise I like the PR 😄 I think it cleaves the behaviors in the right place and allows flexibility without necessitating it.

@shouichi
Copy link
Contributor

Is Good::HttpServer meant to be rack compliant? The current implementation doesn't seem so. Rack specifies various mandatory fields and some rack apps assume those fields to exist. Thus some apps won't work.

@jklina
Copy link
Contributor Author

jklina commented Sep 25, 2023

@shouichi probably not. The more I think about this the more I'm worried it'll open a can of worms. I'm thinking of just closing this for now and if there's more demand down the line, I'd be happy to revisit and see it through making its limitations very visible.

@bensheldon
Copy link
Owner

oh, the Rack spec is kind of a lot: https://github.com/rack/rack/blob/main/SPEC.rdoc

I do like the idea of allowing it to be extensible, but agree that I don't want to set the expectation that anything more than a trivial rack middleware will work.

@shouichi
Copy link
Contributor

I tried a "trivial" example from https://www.rubydoc.info/gems/rack/Rack/Builder but didn't work.

app = Rack::Builder.new do
  use Rack::CommonLogger
  map "/ok" do
    run lambda { |env| [200, {'content-type' => 'text/plain'}, ['OK']] }
  end
end

run app

The spec is a lot, implementing it in GoodJob might be a rabbit hole. I guess it's better to use an existing server (e.g., webrick).

Other than the spec, error handling is not trivial work. I recently opened #1083 but there are potentially more corner cases. Search Errno in webrick/puma, you'll see a lot of them.

@bensheldon
Copy link
Owner

I think I do like the idea of pluggable middleware; it would be nice if any Rack middleware would work. I think that has me lean towards reintroducing webrick.

Would it be possible to keep the array of middleware and add an additional option that was like (ugh, I don't like the name):

config.good_job.probe_server_server = :good_job # or webrick

And then folks could choose and we'd just have a warning in the readme of like "the default :good_job is very barebones and likely won't work if you get complicated.

@jklina
Copy link
Contributor Author

jklina commented Dec 4, 2023

I'm wondering, if Webrick is going to become a dependency of GoodJob anyway, do we keep the current homespun server or nix it completely? Removing the homespun server would be one less configuration option to worry about (I feel like there might be some confusion about selecting servers) and anything that'd work with the homespun server should work with Webrick.

If that's the case maybe we break this into two parts:

@bensheldon
Copy link
Owner

if Webrick is going to become a dependency of GoodJob anyway

Because it would be optional, I wouldn't add Webrick to the gemspec, and instead simply require it, if configured, and rescue the LoadError if it hasn't been added.

I hear you about complexity though 😰

@jklina jklina force-pushed the make-web-server-general-purpose branch from dac7ded to ac2885c Compare January 2, 2024 19:36
Since the probe server has the option to use WEBrick as a server
handler, but this library doesn't have WEBrick as a dependency,
we want to throw a warning when WEBrick is configured, but not in the
load path. This will also gracefully fallback to the built in HTTP
server.
As opposed to manipulating the load path.
@jklina jklina marked this pull request as ready for review January 5, 2024 21:05
@jklina
Copy link
Contributor Author

jklina commented Jan 5, 2024

I went ahead and took a stab at implementing the suggestions. I'll keep my eyes open for any further suggestions during review!

@bensheldon
Copy link
Owner

I like this! I just pushed up a few tweaks and I will accept this.

@bensheldon bensheldon added the enhancement New feature or request label Jan 23, 2024
@bensheldon bensheldon merged commit eec1f4b into bensheldon:main Jan 23, 2024
20 checks passed
legendarydeveloper919 added a commit to legendarydeveloper919/good_job that referenced this pull request Mar 15, 2024
* Make health probe server more general purpose

This removes the health check logic from the ProbeServer and renames the
ProbeServer to UtilityServer that accepts any Rack based app.

The health check and catchall logic are moved into simple Rack middleware
that can be composed by users however they like and be used to preserve
existing health check behavior while transitioning to a more general
purpose utility server.

All and all this pattern will allow users to add whatever functionality
they like to GoodJob's web server by composing Rack apps and using
GoodJob's configuration to pass in users' Rack apps. IE:

```
config.good_job.middleware = Rack::Builder.app do
  use GoodJob::Middleware::MyCustomMiddleware
  use GoodJob::Middleware::PrometheusExporter
  use GoodJob::Middleware::Healthcheck
  run GoodJob::Middleware::CatchAll
end
config.good_job.middleware_port = 7001
```

This could help resolve:

* bensheldon/good_job#750
* bensheldon/good_job#532

* Use new API

* Revert server name change

We decided to leave the original ProbeServer name better sets
expectations. See:

bensheldon/good_job#1079 (review)

This also splits out middleware testing into separate specs.

* Restore original naming

This also helps ensure that the existing behavior and API remain intact.

* Appease linters

* Add required message for mock

* Make test description relevant

* Allow for handler to be injected into ProbeServer

* Add WEBrick WEBrick handler

* Add WEBrick as a development dependency

* Add WEBrick tests and configuration

* Add idle_timeout method to mock

* Namespace server handlers

* Warn and fallback when WEBrick isn't loadable

Since the probe server has the option to use WEBrick as a server
handler, but this library doesn't have WEBrick as a dependency,
we want to throw a warning when WEBrick is configured, but not in the
load path. This will also gracefully fallback to the built in HTTP
server.

* inspect load path

* Account for multiple webrick entries in $LOAD_PATH

* try removing load path test

* For error on require to initiate test

As opposed to manipulating the load path.

* Handle explicit nils in intialization

* Allow probe_handler to be set in configuration

* Add documentation for probe server customization

* Appease linter

* retrigger CI

* Rename `probe_server_app` to `probe_app`; make handler name a symbol; rename Rack middleware/app for clarity

* Update readme to have relevant app example

* Fix readme grammar

---------

Co-authored-by: Ben Sheldon [he/him] <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

3 participants