Document data versioning and updates on IPFS #27

Open
flyingzumwalt opened this issue Feb 9, 2017 · 2 comments
Labels
developers status/ready Ready to be worked topic/docs Topic docs

Comments

@flyingzumwalt

flyingzumwalt commented Feb 9, 2017

Document the possible patterns for tracking Versions, their tradeoffs, and their relevance for browsers

prerequisite for #10

Reference Notes

Scattered notes from a conversation between @flyingzumwalt and @jbenet

Versioning is surprisingly tricky, mainly because you need different versioning models to suit different uses.

Why we delayed this work: We were waiting until we got IPLD transformations right (related to the Solidifying IPLD Sprint)

Factors to consider

  • fast access to pieces
  • retrievability of registries
  • consistency
  • security & authenticity

A normal key-value system just requires that I be able to get the info to you somehow, but the challenge is doing that over a distributed network (see pubsub, etc). If you want high consistency on these names (key-value pairs), you need a secure communication channel for announcements. The most extreme version of that is ethereum. The least reliable (by design) is a gossip protocol.

  • SLEEP uses one versioning model that suits some use cases. It's git-style, but not a straight implementation of the git model
    • current implementation is not actually secure -- security guarantees for doing things like registries are not sound
  • git versioning model is another viable model
  • CRDTs work is more interesting for a lot of use cases, especially anything involving dynamic concurrent updates to a dataset/"database"
  • One way to provide high consistency guarantee: ipns name updater on ethereum (or general name aggregator chain)
    • another way to think of this: using ethereum as a libp2p record store -- this warrants clarification
    • provides high consistency guarantee & retrievable registries
    • would address concerns about relying on IPNS exclusively over DHT, which allows for situations where you won't be able to find the records/registry depending on condition of the network.
    • Question: If you're writing the registries to ethereum, why use IPNS at all?
      • Answer: IPNS gives you a consistent way to do naming in ipfs-land, ethereum/dns/etc gives you different consistency guarantees
@ghost

ghost commented Apr 18, 2017

Moving here from #10:

@Gozala

Some of the ideas we (me and Patryk) have been exploring seem to assume that there is a way to see every version of the IPFS content. In other words, it would be nice to have a changelog for IPNS updates, so it's not just "here is what the current version is" but also "here are all the previous versions that existed". I remember mention of commit objects in the white paper, so I assume there is a way to update an IPNS pointer with a commit object, but I can't really figure out how, or whether I'm getting it right. I think something along the lines of http://docs.datproject.org/sleep is what I'm looking for.

@jbenet:

Yes, we have given this a lot of thought, and are returning to it this and next quarter. It's not easy to get this right because what we choose can block many applications. Meaning that "one fully-contained versioning strategy" works for at most 20% of the use cases we've looked at. One clear example is that data-center applications that expect to mutate names on the order of <1ms will want something that works a bit differently than apps that require much stronger security (eg censorship resistance that requires timestamping to the bitcoin blockchain, DNS, or some equivalent level of security) but can tolerate only changing names on the order of <100s (like most DNS names).

This actually decomposes to two different problems:

  • How apps want to do versioning (security, consistency, and dev UX implications):

    • Commit graphs (like git)
    • Commutable patches (like darcs)
    • CRDTs (riak, orbit, google docs, google internals, the future)
    • consensus (blockchains, etc).
  • How apps want to do naming (security, consistency, dev UX, and ownership implications):

    • slow public key (ipns, sfs) with consensus (strong consistency, >10s updates, available only in some consensus model)
    • fast public key (ipns, sfs) without consensus (weaker consistency, >1us updates, available disconnected networks -- dhts, pubsub, etc)
    • DNS naming (strong consistency, >60s updates)
    • blockchain naming (ENS, blockstack, etc).

In our research, lots of apps DO NOT want to manage their versions manually, want convergent replication, and should be using CRDTs and things like orbit-db. Some subset DO want direct control over versioning and want commit graphs (like git, dat, etc), so for those we will expose direct versioning logs that can be indexed in a couple of good ways. (eg binomial heaps, etc)
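
To make "convergent replication" concrete, here is a minimal sketch of a state-based CRDT (a grow-only counter) in Python. It is an illustration only, not how orbit-db or any IPFS component implements CRDTs; the class name `GCounter` and the replica ids are made up for this example. The property it shows is that replicas can update independently and still agree after merging in any order.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merge takes the per-slot max."""

    # Maps a (hypothetical) replica id to the count that replica has contributed.
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id, amount=1):
        """Record a local update made by `replica_id`."""
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self):
        """The counter's value is the sum of all replica slots."""
        return sum(self.counts.values())

    def merge(self, other):
        """Return a new counter holding the per-replica maximum of both states.

        Taking the max makes merge commutative, associative, and idempotent,
        which is what lets replicas converge without coordination.
        """
        merged = dict(self.counts)
        for replica_id, count in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), count)
        return GCounter(merged)


# Two replicas update independently, then exchange state.
a = GCounter()
a.increment("replica-a")
a.increment("replica-a")
b = GCounter()
b.increment("replica-b", 3)
print(a.merge(b).value(), b.merge(a).value())
```

The app never picks a "winning" version here; the merge rule does that work, which is the contrast with the commit-graph style where the app manages versions directly.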

BTW, i think the big hump that we need to communicate better is the transition from "apps store data in files" to "apps store data directly, can build files out of data", and the IPFS name isn't helping a ton here.

@ghost ghost added the ux label Apr 18, 2017
@ghost ghost changed the title Document the possible patterns for tracking Versions with IPFS Document versioning and updates on IPFS Apr 18, 2017
@ghost ghost changed the title Document versioning and updates on IPFS Document data versioning and updates on IPFS Apr 18, 2017
@ghost ghost added developers and removed ux labels Apr 18, 2017
@JustinDrake

I'm interested in following the versioning discussion. Below are the kind of versioning things that one could want to do with OpenBazaar. Each OpenBazaar vendor publishes a Unixfs root folder which holds the current public store assets, and different files have different semantics (e.g. some files represent listings, one file represents the profile, one file represents the listing catalogue). Any edit to the store publishes a new IPNS entry with a new root hash.

  1. Versioning of stores at the IPNS level. Stores generally have a single owner/committer and I expect stores to mostly move forward linearly (with the occasional roll-back, and branching out). As I see it, we simply need every IPNS entry to point to the root hash of the previous IPNS entry (in addition to incrementing the IPNS sequence). OpenBazaar "archival nodes" (e.g. one run by Duo) could show past versions of stores, in a linear fashion similar to archive.org (most likely), or in a tree-like fashion similar to GitHub (may not be necessary).
  2. Versioning of individual documents, notably listings. Each OpenBazaar listing is editable (e.g. to update the price, add a tag, edit the description), but having a cryptographic history for each listing is valuable. For example, we want to be able to aggregate ratings for a given listing over its lifetime, not on a per-edit basis. Having listings contain the hash of the previous listing version at the application level is easily done, but having a standard for versioning at the Unixfs level may be preferable to a home-rolled solution.
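
A rough sketch of the "each entry points to the previous one" idea, in Python. This is a toy model, not the IPNS record format or any Unixfs standard: the `VersionLog` class, the entry fields (`root`, `seq`, `prev`), and the use of SHA-256 over JSON as a stand-in for content addressing are all assumptions made for illustration. Here each entry links to the digest of the previous entry, which is one possible variation on linking root hashes.

```python
import hashlib
import json


def entry_digest(entry):
    """Stand-in for content addressing: SHA-256 over canonical JSON."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class VersionLog:
    """Linear, hash-linked history of published root hashes (hypothetical)."""

    def __init__(self):
        self.store = {}   # digest -> entry; stands in for a block store
        self.head = None  # digest of the most recent entry

    def publish(self, root_hash):
        """Append a new version that links back to the current head."""
        seq = 0 if self.head is None else self.store[self.head]["seq"] + 1
        entry = {"root": root_hash, "seq": seq, "prev": self.head}
        digest = entry_digest(entry)
        self.store[digest] = entry
        self.head = digest
        return digest

    def history(self):
        """Walk the chain from the head back to the first version."""
        cursor = self.head
        while cursor is not None:
            entry = self.store[cursor]
            yield entry["root"]
            cursor = entry["prev"]


log = VersionLog()
for root in ["root-v1", "root-v2", "root-v3"]:
    log.publish(root)
print(list(log.history()))
```

An archival node in this model only needs the current head to replay a store's past versions, newest first. Roll-backs and branches would need more than a single `prev` link, which is where a commit-graph model would come in.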


3 participants