Allow views and tables to self-update to the latest config #101
Comments
Candidate approach: Each view and table row should have a new meta field, There will be two strategies for updating tables and views:
Finally, any regenerations required should go to a specific queue and be done in the background so we can effectively throttle the rate of self-update.
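A minimal sketch of the throttled background queue described above. It assumes an in-memory queue as a stand-in for whatever queueing system would really be used; `RegenerationQueue`, `enqueue` and `drain` are hypothetical names, not part of Tripod.

```python
from collections import deque


class RegenerationQueue:
    """Hypothetical in-memory stand-in for a dedicated regeneration queue."""

    def __init__(self):
        self._pending = deque()
        self._queued = set()  # ids already waiting, so nothing is queued twice

    def enqueue(self, item_id):
        """Record that a table row or view needs regenerating."""
        if item_id not in self._queued:
            self._queued.add(item_id)
            self._pending.append(item_id)

    def drain(self, regenerate, max_items):
        """Regenerate at most max_items entries; the cap acts as the throttle."""
        processed = []
        while self._pending and len(processed) < max_items:
            item_id = self._pending.popleft()
            self._queued.discard(item_id)
            regenerate(item_id)
            processed.append(item_id)
        return processed

    def __len__(self):
        return len(self._pending)
```

A background worker could call `drain` on a timer with a small `max_items`, so that a spec change touching many rows is worked through gradually rather than all at once.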
One thing we need to consider is how to come from a large data set with no
Theoretically, in the absence of the hash, Tripod could generate a hash and it should be the same.
Do we need to use hashing? An alternative might be to persist versions of the table spec, and treat it as immutable, then each view/tablerow could contain a reference to the specific version of the tablespec used to generate it. It might be more traceable if we followed that approach, and would permit easier rollbacks. A just-in-time migration would be possible, and would, I think, avoid the problem of "no
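To make the immutable-version alternative concrete, here is a rough sketch under stated assumptions: an append-only, in-memory store in which a published spec is never modified and each view or table row would keep only the version number it was generated with. `SpecVersionStore` and its methods are hypothetical and say nothing about how such versions would actually be persisted.

```python
import copy


class SpecVersionStore:
    """Hypothetical append-only store: a published spec version is never modified."""

    def __init__(self):
        self._versions = []   # version number is the list index + 1
        self.current = None   # version that new rows/views should be generated with

    def publish(self, spec):
        """Store a private copy of the spec and make it the current version."""
        self._versions.append(copy.deepcopy(spec))
        self.current = len(self._versions)
        return self.current

    def get(self, version):
        """Return a copy of the spec exactly as it was published."""
        return copy.deepcopy(self._versions[version - 1])

    def rollback(self, version):
        """Point 'current' back at an earlier version without deleting anything."""
        if not 1 <= version <= len(self._versions):
            raise ValueError("unknown spec version: %r" % (version,))
        self.current = version
```

Because old versions are retained, a row that references version 1 can still be traced to the exact spec that produced it, and a rollback is only a change of the `current` pointer.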
Recalculation need not be happening at this point, as the hash/version can be considered equal (i.e. When a tablespec is updated, we will add the version/hash. Now, as rows/views are fetched, we can tell that they are out of date: the tablespec's non-null hash/version will not equal the row/view's null hash/version.
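The null-versus-non-null comparison above can be sketched as follows. This is only an illustration: rows and views are modelled as plain dicts, and `spec_version` is a hypothetical field name.

```python
def needs_regeneration(item, current_spec_version):
    """Return True when the item was generated with a different spec version.

    `item` is a dict for a table row or view; "spec_version" is a hypothetical
    field name. A missing field is read as None, which matches a spec that has
    no hash/version yet.
    """
    return item.get("spec_version") != current_spec_version


legacy_row = {"_id": "row1"}                        # generated before versioning
fresh_row = {"_id": "row2", "spec_version": "v2"}   # generated with spec "v2"

# Before the spec carries a version, legacy data compares equal (None == None).
before_update = [needs_regeneration(legacy_row, None)]

# Once the spec has a version, only the legacy row is flagged as out of date.
after_update = [
    needs_regeneration(legacy_row, "v2"),
    needs_regeneration(fresh_row, "v2"),
]
```

Under this scheme nothing is recalculated until a spec first receives a hash/version, which is the just-in-time behaviour the comment describes.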
I've conflated two issues. We could use hashes and immutable versions - it's just the version identifier. I just thought the (small) expense of hashing seemed unnecessary if a simple version numbering scheme would work. Is there a strong case for hashing as opposed to atomic counters? I suppose atomic counters come with consistency problems, whereas hashes do not...
Historically, the reason we haven't stored specs in the database is that you need to read all the specs from the database before you can actually do anything. Local caching isn't good enough - especially if we are hashing versions - because it causes chaos in the stale cache window. Consider a piece of frequently read data: one node has a stale spec cache and regenerates the data to an old spec version, whilst another has a fresh spec and puts it back again. The nodes all play tennis until they are operating on the same spec. So the alternatives would be either to always read the specs on each request (expensive) or to look at some kind of coordinated distributed config service such as ZooKeeper (extra complexity). My gut tells me that although something designed for this problem, like ZooKeeper, is ultimately the right way to solve it, it's not something we should introduce lightly, or at the same time as a major feature change like this. You also complicate the release process - how will you co-ordinate the updates to the spec in the DB with the rollout of new code (that relies on it) to N nodes? So, in summary, my reasons for keeping the specs on disk with the app code they relate to, at the moment, are: (a) it's fast to calculate, something like an md5 of the spec on disk is quick and reliable enough to then compare with what is in the returned data
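As a rough illustration of point (a), the following sketch hashes a spec file on disk and compares the result with a hash stored alongside returned data. It assumes the spec lives in a single file and that items are plain dicts; `spec_hash`, `is_current` and the `spec_hash` field name are hypothetical.

```python
import hashlib
from pathlib import Path


def spec_hash(spec_path):
    """Return the MD5 hex digest of the spec file's raw bytes."""
    return hashlib.md5(Path(spec_path).read_bytes()).hexdigest()


def is_current(item, spec_path):
    """Compare the hash stored with the returned data against the spec on disk.

    "spec_hash" is a hypothetical field name; an item without it never matches.
    """
    return item.get("spec_hash") == spec_hash(spec_path)
```

Hashing the raw bytes means that even a whitespace-only edit to the spec file changes the hash and would mark existing data as out of date. Whether that is acceptable, or whether the spec should be normalised before hashing, is a design choice this sketch leaves open.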
On hashing and atomic counters - I prefer hashing. Three reasons -
I agree with most of the above. I have a question, though: Is there only one instance of the code running Consider, for example, you have two machines running the code, and due to In this case, order provides a natural resolution to the problem, where However, if it's a possibility we should consider it, at least so we know
Absolutely true. I had not considered that scenario.
In progress => #102
At the moment, if a table or view specification changes, it is a manual process to re-run generation and upgrade data to the latest spec.
Instead, allow Tripod to automatically upgrade data when it encounters data that was not generated with the latest spec.
It is acceptable that this is an eventually consistent feature, seeing as views and tables are generally considered eventually consistent.