Replies: 5 comments 2 replies
-
Not the exact same River config, but yes: sharing the same scheduled_job across both the workers and the insert-only client is what we are doing. I ran into a similar problem here; you can see a response: #336 (comment)
-
Hi @magaldima, there's a lot in here and it will probably be fairly open-ended, so I converted it to a discussion for now. There may still be some fixable issues to come out of the conversation though, so don't consider it a dismissal 🙂
This is definitely an area where we could use better documentation, particularly including some architectural diagrams.
River provides some isolation between queues of different names operating in the same schema and database, but only really as far as which jobs get inserted & worked on the named queues. Higher level
While you will get some isolation with different services enqueueing and running their jobs on different named queues, it's not full isolation, and really you should have e.g. periodic jobs configured the same across all started clients in a single Postgres
We have also been working on full support for running totally isolated River instances across different Postgres schemas/search paths within a single database. I'm not certain we're 100% of the way there yet, but we are at least pretty close as of the v4 migration changes. One of the ways we intend to confirm we're there is to migrate River's test suite from handing out independent databases per test to using schemas instead.
With a schema-per-service setup, you would be able to have independent leaders per service, as well as different periodic job configs per service. If you decide to experiment with this, please let us know what you find, as it might already work well! In theory it should "just work" by setting the
We have some plans for a more robust periodic job implementation that allows for perfect gap-free scheduling even as leaders come & go. I'm hoping it's something we'll start making progress on during the next few months, but it's hard to be sure; we're working on shipping a lot of new stuff (most notably a UI).
Thank you, this is great to hear 😄 ❤️
-
Hi @bgentry, and thanks for the additional information. While the schema-per-service setup may work, I'm not sure how viable the option is given the overhead of managing n schemas. As for the plans mentioned around more robust periodic job scheduling, is there any additional info or resources that you can share? Will that solution also address the "problem" of periodic job state living in the River config? Thanks!
-
@magaldima Can you share anything broadly about the use case you have that requires periodic jobs to be so dynamic? River is definitely constructed on the assumption that you have most of the information about the periodic jobs you want at compile time. You can add more dynamically, but the set is still more static than dynamic, which seems to me a sound model for most cases of periodic jobs I've worked with in my career. You might change some, but doing so with a new code deploy is dynamic enough to be practical. It's possible that you're deviating far enough off the mainline happy path of periodic jobs that a better solution for you might be to write your own periodic scheduling module that's more aware of, and smarter about, the specific domain you're working with. That might sound like a large task, but it's not that bad: River's internal one is < 500 LOCs (and a lot of that is fluff for testability and such).
-
Hey @brandur and @bgentry, are there any updates that you can share here? We're now managing 4 DB schemas where we have deployed separate River instances. This is manageable for now but it's starting to add technical debt to our code and prevents us from being able to expose a single view of all jobs running across our set of services. We would love to be able to consolidate the queue while keeping scheduled job logic separated per service within our cluster (ideally we would partition service jobs via the
-
While developing with River (v0.4.1), I've observed the (expected) behavior that periodic jobs are only enqueued by the elected leader, which I now understand to be one of the leader's maintenance service responsibilities.
The "problem" that I'm experiencing is that I'm attempting to use River (and initialize River clients) across a few different services that all have different workers and different River periodic job configurations, and that are mostly isolated by different queues. So if I have service A with a periodic job that executes every hour and service B with a different periodic job, and service B is the elected leader for more than 1 hour, service A's periodic job will not be enqueued per its configured schedule.
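To make the failure mode concrete, here is a toy model in plain Go. It does not use River's API: `Client`, `PeriodicJob`, and `enqueuedBy` are hypothetical names, and the only behavior modeled is the one described above, namely that whichever client holds leadership enqueues only the periodic jobs in its own config.

```go
package main

import "fmt"

// PeriodicJob is a hypothetical stand-in for a periodic job definition.
type PeriodicJob struct {
	Name  string
	Queue string
}

// Client is a hypothetical stand-in for one service's client and its config.
type Client struct {
	Service      string
	PeriodicJobs []PeriodicJob
}

// enqueuedBy returns the names of the periodic jobs that get enqueued on one
// scheduling tick when the given client is the elected leader. Only the
// leader's own config is consulted.
func enqueuedBy(leader Client) []string {
	names := make([]string, 0, len(leader.PeriodicJobs))
	for _, job := range leader.PeriodicJobs {
		names = append(names, job.Name)
	}
	return names
}

func main() {
	hourly := PeriodicJob{Name: "a_hourly_report", Queue: "service_a"}
	cleanup := PeriodicJob{Name: "b_cleanup", Queue: "service_b"}

	serviceA := Client{Service: "A", PeriodicJobs: []PeriodicJob{hourly}}
	serviceB := Client{Service: "B", PeriodicJobs: []PeriodicJob{cleanup}}

	// While B is leader, nothing from A's config is ever considered.
	fmt.Println("leader A enqueues:", enqueuedBy(serviceA))
	fmt.Println("leader B enqueues:", enqueuedBy(serviceB))
}
```

In this model, giving both clients the same `PeriodicJobs` slice makes the result independent of which service wins the election.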
This is less of an issue and more of a question around how River was designed and if this use case was considered. Presumably, several services sharing a background job queue is a common pattern and we'd like for each service to contain the logic of processing jobs specific to that service. Does this necessitate that scheduled jobs should be declared and managed by a single service or that all the different applications that create a River client need to use the same River config (with the same set of periodic jobs including those that are not in scope for the specific service)?
The other workaround currently under discussion is creating separate DB schemas per service and instantiating the River tables in each schema so that each service has its own elected leader. On the surface this seems like something River should be able to handle, and I'm worried about the increased overhead of maintaining separate DB schemas for each application.
I was hoping you would be able to share some insight into River's design: whether this use case was considered at any point, how you thought about it, and whether it is on your radar/roadmap. I'm curious what your recommendation is for a solution: consolidating job definitions vs. separate DB schemas (or something else entirely)?
One last idea that I had was to create a new DB table in River, river_periodic_job, where the definition/configuration for periodic jobs could live and be shared across all services, so that the River client configuration state is pushed into the DB layer. In this way, the elected leader would have visibility into all periodic jobs and be able to enqueue jobs based on the periodic jobs defined in that DB table. I'm curious about your thoughts on this idea as well, and happy to help put forth a proposal and contribute time/work if you think this feature would be doable.
Thanks for taking the time to read that through! I've been using River for a few months now and have loved it. If anything, this "problem" has arisen out of my exuberance to extend River's capabilities to many of our backend services.