New Index format that contains the hashing algorithm information #214
masih added a commit to multiformats/multicodec that referenced this issue on Sep 2, 2021
Define a new codec for CARv2 `MultihashIndexSorted`. See:
- ipld/go-car#217
- ipld/go-car#214
masih added a commit that referenced this issue on Sep 2, 2021
Implement a new CARv2 index that contains enough information to reconstruct the multihashes of the data payload, since `CarIndexSorted` only includes multihash digests.
Note, this index intentionally ignores any given record with `multihash.IDENTITY` CID hash.
Add a test that asserts offsets for the same CID across sorted index and new multihash sorted index are consistent.
Add tests that assert marshal unmarshalling of the new index type is as expected, and it does not load records with `multihash.IDENTITY` digest.
Note, there is a need for a multicodec to be defined for the new index type. For now TODOs are left since it requires coordination across repos.
Relates to:
- multiformats/multicodec#227
Fixes:
- #214
masih added a commit to multiformats/go-multicodec that referenced this issue on Sep 2, 2021
Update submodule to `1bcdc083898abb3e92b132f951e0a2fe0dcd485b`. Run `go generate`.
Note, the code generation includes other changes to the codec table that have been merged but not generated here.
See:
- multiformats/multicodec#227
- multiformats/multicodec@1bcdc08
- ipld/go-car#214
masih added a commit that referenced this issue on Sep 2, 2021
Implement a new CARv2 index that contains enough information to reconstruct the multihashes of the data payload, since `CarIndexSorted` only includes multihash digests.
Note, this index intentionally ignores any given record with `multihash.IDENTITY` CID hash.
Add a test that asserts offsets for the same CID across sorted index and new multihash sorted index are consistent.
Add tests that assert marshal unmarshalling of the new index type is as expected, and it does not load records with `multihash.IDENTITY` digest.
Relates to:
- multiformats/multicodec#227
Fixes:
- #214
masih added a commit that referenced this issue on Sep 7, 2021
Implement a new CARv2 index that contains enough information to reconstruct the multihashes of the data payload, since `CarIndexSorted` only includes multihash digests.
The new index builds on top of the existing `IndexSorted` by adding an additional layer of grouping the multi-width indices by their multihash code.
Note, this index intentionally ignores any given record with `multihash.IDENTITY` CID hash.
Add a test that asserts offsets for the same CID across sorted index and new multihash sorted index are consistent.
Add tests that assert marshal unmarshalling of the new index type is as expected, and it does not load records with `multihash.IDENTITY` digest.
Relates to:
- multiformats/multicodec#227
Fixes:
- #214
Jorropo pushed a commit to ipfs/boxo that referenced this issue on Mar 22, 2023
Implement a new CARv2 index that contains enough information to reconstruct the multihashes of the data payload, since `CarIndexSorted` only includes multihash digests.
The new index builds on top of the existing `IndexSorted` by adding an additional layer of grouping the multi-width indices by their multihash code.
Note, this index intentionally ignores any given record with `multihash.IDENTITY` CID hash.
Add a test that asserts offsets for the same CID across sorted index and new multihash sorted index are consistent.
Add tests that assert marshal unmarshalling of the new index type is as expected, and it does not load records with `multihash.IDENTITY` digest.
Relates to:
- multiformats/multicodec#227
Fixes:
- ipld/go-car#214
This commit was moved from ipld/go-car@42b9e28
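The grouping described in these commit messages can be pictured roughly as follows. This is only an illustrative sketch, not the go-car implementation: `record`, `groupByCode`, and `identityCode` are hypothetical names, the in-memory maps stand in for whatever serialized layout the real index uses, and the only facts assumed are that records are bucketed by multihash code, then by digest width, and that identity records (multihash code `0x00`) are skipped.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// identityCode is the multihash code of the identity "hash" (0x00).
const identityCode = 0x00

// record is a hypothetical index entry: which hash function produced the
// digest, the digest itself, and the offset of the block in the CAR.
type record struct {
	code   uint64
	digest []byte
	offset uint64
}

// groupByCode buckets records by multihash code and then by digest width,
// skips identity records, and sorts every bucket by digest.
func groupByCode(records []record) map[uint64]map[int][]record {
	out := make(map[uint64]map[int][]record)
	for _, r := range records {
		if r.code == identityCode {
			continue // identity records are intentionally left out of the index
		}
		byWidth, ok := out[r.code]
		if !ok {
			byWidth = make(map[int][]record)
			out[r.code] = byWidth
		}
		w := len(r.digest)
		byWidth[w] = append(byWidth[w], r)
	}
	for _, byWidth := range out {
		for _, bucket := range byWidth {
			// Sorting in place is visible through the map because the
			// slice header shares its backing array with the stored one.
			sort.Slice(bucket, func(i, j int) bool {
				return bytes.Compare(bucket[i].digest, bucket[j].digest) < 0
			})
		}
	}
	return out
}

func main() {
	records := []record{
		{code: 0x12, digest: []byte{0x02, 0x01}, offset: 100},
		{code: identityCode, digest: []byte("inline"), offset: 200},
		{code: 0x12, digest: []byte{0x01, 0x09}, offset: 300},
	}
	idx := groupByCode(records)
	for _, r := range idx[0x12][2] {
		fmt.Printf("%x -> %d\n", r.digest, r.offset)
	}
}
```

Because each top-level bucket is keyed by the multihash code, a reader of such an index knows which hash function produced every digest it contains.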
The current `IndexSorted` format contains the digests of the CIDs within a CAR and the offsets at which the corresponding data is located. This format does not contain enough information to reconstruct the list of multihashes in the CAR from the index alone, because it does not store the hash function used to generate the digests.
Data providers to indexer nodes should ideally be able to use the CAR index alone to supply the list of CIDs in a CAR without having to scan the entire CAR file. When data providers run as part of Filecoin miners, there will indeed be cases where the only accessible information about a CAR is its detached index.
Since the codec information in a CID is ignored for indexing purposes (see ipfs/kubo#6815), the CAR index should at least expose the hash function used to generate the digests within the index. This way, data providers are able to reconstruct the list of multihashes in the CAR.
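To illustrate why the hash function code is the missing piece, here is a minimal sketch of rebuilding a multihash from a bare digest. It relies only on the multihash layout (unsigned-varint hash code, unsigned-varint digest length, then the digest) and on the Go standard library; `buildMultihash` is a hypothetical helper for this example, not a go-car or go-multihash API.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// sha2_256Code is the multicodec code for sha2-256.
const sha2_256Code = 0x12

// buildMultihash prefixes a raw digest with the hash function code and the
// digest length, both encoded as unsigned varints, producing a multihash.
func buildMultihash(code uint64, digest []byte) []byte {
	prefix := make([]byte, 2*binary.MaxVarintLen64)
	n := binary.PutUvarint(prefix, code)
	n += binary.PutUvarint(prefix[n:], uint64(len(digest)))
	return append(prefix[:n], digest...)
}

func main() {
	// A digest as it would be stored in an index that keeps digests only.
	digest := sha256.Sum256([]byte("hello"))

	// Without the code, the digest alone cannot be turned back into a
	// multihash; with it, reconstruction is a simple prefix operation.
	mh := buildMultihash(sha2_256Code, digest[:])
	fmt.Println(hex.EncodeToString(mh[:2])) // prints "1220"
}
```

The digest length can be read off the stored digest itself, so the hash code is the only extra information the index has to carry.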