Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ORM to support offloading read-only queries to another database #11352

Open
jakxnz opened this issue Aug 27, 2024 · 0 comments
Open

Comments

@jakxnz
Copy link
Contributor

jakxnz commented Aug 27, 2024

We want to introduce support for application environments that use multiple databases for performance objectives, by developing support for the ORM to offload read-only queries to an alternate database to allow developers or vendors with platforms/infrastructure for Silverstripe CMS to alleviate load on a primary write database and improve scalability and performance of the overall database system

Related issues

Notes

  • Take a look at how this is typically implemented in other ORMs and how it's used. There are likely unknown-unknowns about how this needs to function in order to be truly useful
  • We need to decide whether to allow explicitly using the primary DB for read operations for specific scenarios even when there's a read-only db available (e.g. if you immediately need to fetch the value you just wrote to the DB)
  • Other ORMs that allow read-only replicas actually allow an arbitrary number of replicas, not just the one. For example see Doctrine's replicas config and their old master/slave setup. We should probably aim to do the same, for maximum flexibility and scalability.
  • Will probably look at raw SQL and if it's a SELECT always use a read-replica, otherwise use the primary. However this means that any unsynced data probably won't be there. One solution could be to have a DB::withPrimary($callback) to ensure the primary database is used.
  • Laravel has a sticky option that means use the write database for all queries going forward in the current request cycle after a write operation has been performed. This solves the issue of needing to read something after writing straight away. It may make sense for us to do something similar, possibly just have it permanently on and not allow configuring it off as there's probably a million ways something could get out of sync and break. There may however be some scenarios where the database is being written to on every request e.g. custom logging which would disable using the replicas

Acceptance criteria

  • At least one additional database can be optionally configured via environment variables for read-only operations
  • When a read-only database is available, the primary database is only used for write operations (Steve: I'm not sure about this one, it may make sense to do something like the /admin section and cli always uses the primary, while the website frontend should use a replica though switches to the primary when needed e.g. userform submission)
  • The implications of when data is available are fully understood - i.e. if writing to one db and then reading from another, should developers expect the new data to be available immediately or not?
  • Documentation clearly outlines both how to set this up, and any gotchas that may arise.
  • This will probably involve an API break and require it be in CMS 6, though ideally we can implement this in CMS 5 so that the real world benefits are realised sooner

Kitchen sink CI

PRs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants