Enforce zero merge conflicts #172

joshuakarp · 2021-05-28T01:50:44Z

I want to explore the idea of making Polykey NEVER have a merge conflict.

How do we do this? Well we must limit the control surface of pushes/pulls and changes so that this can never occur.

One idea is this. As soon as a vault has been pushed to a keynode. If that keynode changes that vault. It doesn't mutate that vault. Instead what happens is that the vault is forked, and represents a new vault.

This idea is actually similar to persistent data structures. That is if every vault is a chain of states/changes. Then that makes it a linked list. Any subsequent change doesn't mean the old version of the data structure is gone.

In Haskell and in js-resource-counter we have persistent data structures. Sharing a linked list between 2 functions that change the list doesn't affect each other.

This also means you can never rewrite history. You can only go ahead. Vaults are there append-only hash chains (linked lists).

This also means visible vaults are like garbage collected "handles". Or views of the data structures. You can view at any point.

Then pushing a vault from source keynode to target keynode is about appending new changes to a particular vault view.

All the vaults can then form a tree. But any specific vault forms a linear chain back to the beginning.

This should prevent any possibility of merge conflicts.

Other ideas include CRDTs.

This may impact how we identify vaults. Instead of randomly generated vaultids. They are some deterministically generated identifier of a particular position in the tree of commits.

Resolved in https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205

joshuakarp · 2021-05-28T01:51:53Z

@CMCDragonkai : This means mutating a vault that isn't "owned" by you should fork a new branched vault and you never merge back. This behaviour needs to be verified @DrFacepalm @scottmmorris

CMCDragonkai · 2021-07-13T00:35:48Z

Consider that the GitHub wiki has the same problem if there are concurrent editors. I wonder how GitHub wiki enforces zero merge conflicts, since there is no interface to fix merge conflicts if 2 people edit the same wiki page. Good idea to investigate the GitHub wiki and see what we should do our vaults if this occurs.

CMCDragonkai · 2021-07-13T00:37:42Z

While reviewing the Vault class right now in client-refactoring, I can see that it's quite high level doesn't expose the underlying fs interface much at all. If we later want to expose the vault has a mountable filesystem, this will need to be changed later.

Actually I might have to change how fs operations translate to git commits quite significantly...

CMCDragonkai · 2021-07-14T09:31:33Z

Designing this requires completing these MRs:

https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205 - vaults review/refactoring
https://gitlab.com/MatrixAI/Engineering/Polykey/js-encryptedfs/-/merge_requests/46 - encryptedfs review/refactoring

CMCDragonkai · 2021-07-14T09:34:26Z

Copying this comment which is relevant to this design:

Vaults are an abstraction on top of EFS.

Right now we are creating an explicit set of operations for a vault.

But there is a more flexible way to do this.

Instead expose the entire EFS, so it can be used like a normal FS.

The main difference is that all FS operations must be wrapped together to form a "transaction".

That transaction is then committed via the iso git integration.

Here's an example:

vault.commit((fs) => {
  await promisify(fs.readFile).bind(fs)();
  await promisify(fs.writeFile).bind(fs)();
  // ...
});

Right now there's no promise interface that EFS nor VFS exposes. So for now you have to use promisify or callback chaining.

The commit method automatically takes all changes there and turns it into a Git commit.

This requires a few things:

the fs object passed in has to be a sort of proxy on top of EFS
the proxy has to allow us to "observe" what the changes are, this is what allows to make an "automatic git commit message"
the fs interface has to be limited right now, in particular we should not provide access to the underlying .git directory
the translation of high level commands like pk secrets put vault:secretfile would then have to be done at the highest level, like in client service

Given the above interface is a "callback" style, I don't think this is supported in GRPC yet, so it's not possible to have the CLI/GUI pass a "callback" to the server to act on. The closest thing would be to create a client to server stream, where the client streams commands to the server, and the server translates these into a callback. But this is overkill atm. (Alternatively the client constructs a serialized command language, and sends it over to the server to execute (like having a SQL thing)).

So the final effect of this is that the Vault methods are still really general, and that actual translation of high level commands from CLI/GUI to fs manipulation happens in clientService. This opens up the possibility of a future fs manipulation language if needed, and potentially for the agent to mount a vault on the Unix/Windows filesystem similar to how keybase does things.

This means for now the fs object can be limited in its type interface, but retain the full object capability of an efs instance.

In terms of limiting access to .git..., this would have to be figured out later, our proxies can be tracing the calls and preventing access to it, maybe we can put the .git somewhere else, but not sure what is a good method for this.

Creating a proxy to EFS can be done either by extending the EFS class instance, or by applying an ES6 proxy, in such a case, we are mostly recording what operations are being performed.

Alternatively we don't even use the proxy, but allow any changes to occur, and then use isogit to just capture the changes, and isogit to tell us what files changed and construct the commit message from that.

scottmmorris · 2021-07-23T08:05:29Z

'isomorphic-git` does not allow merge conflicts either, so a simple try and catch statement when pulling has been added.

As far as I can see the only case where merge conflicts occur is when updateSecret is used on ANY file in the vault has been changed and the content is different to the latest commit on the remote.

CMCDragonkai · 2021-07-23T08:08:37Z

What about when you try to pull from a vault that is out of date. Wouldn't that cause a merge conflict. Right now when you have pulled a vault, if you change that vault, then it will be a conflict if pull in from upstream. So we are meant to prevent changes to vaults that are pulled. But we can consider these vaults to be "immutable". Otherwise when changes are applied, a new vault is created with the same contents, and these vaults are mutable. Oh and of course it's not just merge conflicts, we don't want merge commits at all. The entire history should be linearized.

…

On 7/23/21 6:05 PM, scottmmorris wrote: 'isomorphic-git` does not allow merge conflicts either, so a simple try and catch statement when pulling has been added. As far as I can see the only case where merge conflicts occur is when |updateSecret| is used on ANY file in the vault has been changed and the content is different to the latest commit on the remote. — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#172 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAE4OHN2A5KVC6OC23AEMR3TZEPFJANCNFSM45VP2A5Q>.

scottmmorris · 2021-07-23T08:15:05Z

Right, that case is just for pulling from the original repository. I'll make more edits once I tests with vaults that are not the original repository

CMCDragonkai · 2021-09-02T04:25:58Z

The challenge of concurrent edits to the vault needs to be considered.

If there are multiple CLI calls or GUI calls or GRPC calls to the same vault, and they are editing different files, the commits should be tracked against the files being edited and we also need to consider globbing here. Then the commits can be serialised one at a time.

We need to ensure that isogit actually properly locks when it commits. If it doesn't, we can use an async-mutex for each vault to ensure that commits are serialised without inconsistency.

This means that each command such as in MatrixAI/Polykey-CLI#32 must keep track of all files and directories edited, and those are the ones added into the commit.

What happens if 2 commands are editing the same file? How is this sorted in GitHub/GitLab wiki? It turns out that each edit to the wiki is a complete update to the file, and there are no merged to concurrent editing. See: https://gitlab.com/gitlab-org/gitlab/-/issues/200002 and https://gitlab.com/gitlab-org/gitlab-foss/-/issues/1827.

echo 'abc abc' | pk secrets write vault1:/abc &
echo 'blah blah' | pk secrets write vault2:/abc &

In the case of PK how is this sorted out? At this point this probably depends on the concurrency of the actual EFS, how the changes are applied there. There may be some clobbering here. But then how does one commit if the file changes are already clobbered and committed in the first case?

One quick way of resolving this is just prevent parallel edits to the same vault. If we add a global lock to the vault entirely and synchronise edits this way. But this seems heavy weight. Let's start with a whole vault lock first, and then let's see what we can do subsequently.

CMCDragonkai · 2021-10-25T13:09:22Z

This was done by merging of vaultsrefactoring https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205.

Note that making pulled vaults remote should still be tested for. Docs will be MatrixAI/Polykey-Docs#3. @joshuakarp please make sure to indicate any missing tests with regards to this issue.

CMCDragonkai assigned CMCDragonkai and scottmmorris Jul 3, 2021

CMCDragonkai added this to the Polykey CLI 1.0.0 Release milestone Jul 3, 2021

CMCDragonkai removed their assignment Jul 3, 2021

CMCDragonkai added design Requires design development Standard development and removed development Standard development labels Jul 3, 2021

CMCDragonkai mentioned this issue Jul 14, 2021

Timeouts in Testing involving Worker Manager and Async Operations #178

Closed

CMCDragonkai self-assigned this Jul 14, 2021

CMCDragonkai removed their assignment Jul 26, 2021

This was referenced Aug 25, 2021

Third Party Integration - HTTP API, OAuth2 Provider API, Plugin API, Library vs Framework, Smart Tokens #166

Open

Vault merge conflicts and vault pull/clone edge cases #162

Closed

CMCDragonkai mentioned this issue Sep 2, 2021

Incorporate Unix Commands to Secrets Subcommand - pk secrets * MatrixAI/Polykey-CLI#32

Closed

CMCDragonkai closed this as completed Oct 25, 2021

CMCDragonkai added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Jul 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enforce zero merge conflicts #172

Enforce zero merge conflicts #172

joshuakarp commented May 28, 2021 •

edited by CMCDragonkai

Loading

joshuakarp commented May 28, 2021

CMCDragonkai commented Jul 13, 2021

CMCDragonkai commented Jul 13, 2021

CMCDragonkai commented Jul 14, 2021

CMCDragonkai commented Jul 14, 2021

scottmmorris commented Jul 23, 2021

CMCDragonkai commented Jul 23, 2021 via email

scottmmorris commented Jul 23, 2021

CMCDragonkai commented Sep 2, 2021

CMCDragonkai commented Oct 25, 2021

Enforce zero merge conflicts #172

Enforce zero merge conflicts #172

Comments

joshuakarp commented May 28, 2021 • edited by CMCDragonkai Loading

joshuakarp commented May 28, 2021

CMCDragonkai commented Jul 13, 2021

CMCDragonkai commented Jul 13, 2021

CMCDragonkai commented Jul 14, 2021

CMCDragonkai commented Jul 14, 2021

scottmmorris commented Jul 23, 2021

CMCDragonkai commented Jul 23, 2021 via email

scottmmorris commented Jul 23, 2021

CMCDragonkai commented Sep 2, 2021

CMCDragonkai commented Oct 25, 2021

joshuakarp commented May 28, 2021 •

edited by CMCDragonkai

Loading