Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacing store/* with blob/* #111

Open
Gozala opened this issue Feb 8, 2024 · 4 comments
Open

Replacing store/* with blob/* #111

Gozala opened this issue Feb 8, 2024 · 4 comments

Comments

@Gozala
Copy link
Collaborator

Gozala commented Feb 8, 2024

As we have discovered recently some of our assumptions are misplaced storacha/w3up#1304. Specifically store/add assumes to take CAR cid and gives you back pre-signed upload URL, however nothing in the system verifies that uploaded content was a valid CAR file, we just check that hash digest of the bytes matches hash digest of the file uploaded.

I would like to propose adjusting our assumptions and specifically propose blob/* API protocol that is equivalent of store/* except that it takes multihash as opposed to CAR cid. This would also better align with motivation of not packing files into CARs but rather storing them as is and creating an index separately.

We could deprecate store/* API in favor of blob/* and having it active for some time. Furthermore this would allow us to layer content claims on top, that is if blob is the CAR content claim could be made about that fact.

I would also be happy to the store/* namespace go away since it is not a great name for what it is.

@Gozala
Copy link
Collaborator Author

Gozala commented Feb 8, 2024

I suppose instead of raw hash we could require raw CID instead, although I'm not sure if it would have any benefit over just multihash, perhaps if in some distant future actual stores support other verification methods.

@alanshaw
Copy link
Member

alanshaw commented Feb 9, 2024

Specifically store/add assumes to take CAR cid

Is this right? The spec says it should be a CAR, but implementation and types do not restrict to CAR.

@vasco-santos
Copy link
Contributor

we have things being stored without CAR cid, so I think that we are not restrictive on it both in spec and implementation

@Gozala
Copy link
Collaborator Author

Gozala commented Feb 12, 2024

Is this right? The spec says it should be a CAR, but implementation and types do not restrict to CAR.

Implementation may not restrict it, but it definitely assumes that CARs will be uploaded. In practice however we find that there had been uploads that match the hash but aren't CARs.

It is worth calling out that just ignoring CID prefix is not a right solution, as things will get mislabeled. We really should do one of the following:

  1. Restrict CID prefix to raw blocks only (because we do not really verify that block corresponds to codec prefix)
  2. Switch to multihashes instead of CIDs (as that is something we do verify)
  3. Add verification (so we could enforce that uploaded bytes match the claimed codec)

The third option is a lot more complicated than first two, furthermore we could achieve same effect by uploading arbitrary bytes and then submitting content claim triggering lazy verification. As of first two I have slight preference for 2nd, but can buy into 1st if good argument is presented in it's favor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants