
Concatenated upload index of all CAR indexes for a root CID #49

Open
olizilla opened this issue Mar 15, 2023 · 5 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature need/triage Needs initial labeling and prioritization

Comments

@olizilla

We hit issues when users send us a DAG split over more than 1000 CARs, because we have to load a CAR index for each CAR before we can figure out where to fetch blocks from. If we create an upload index file for the root CID, as the concatenation of each CAR index, we only have to fetch a single file before we can start responding. I think this would solve the issue we're seeing in #46.

There remain edge cases where a single file is split over more than 1000 CARs, but in those cases either the user is sending us CAR shards that are too small, or the file is massive. For example, if users stick with the 100MiB CAR shard size we provide, they'd upload a 32GiB DAG in 328 CARs, so we could tackle that as a lower-priority issue.
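The shard arithmetic above can be checked with a short sketch. The name `shardCount` and the constants are illustrative assumptions, not part of any existing API, and the calculation ignores CAR header and framing overhead.

```typescript
// Illustrative only: number of fixed-size CAR shards needed for a DAG of a given size.
// Ignores CAR header/framing overhead, so real shard counts may be slightly higher.
const MiB = 1024 * 1024;
const GiB = 1024 * MiB;

function shardCount(dagBytes: number, shardBytes: number): number {
  if (shardBytes <= 0) throw new RangeError("shardBytes must be positive");
  return Math.ceil(dagBytes / shardBytes);
}

// 32GiB / 100MiB = 327.68, rounded up to 328 shards.
const shards = shardCount(32 * GiB, 100 * MiB);
console.log(shards);
```

By the same arithmetic, a DAG would need to be roughly 98GiB or larger before it exceeded 1000 shards at the 100MiB shard size.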

@olizilla olizilla added kind/enhancement A net-new feature or improvement to an existing feature need/triage Needs initial labeling and prioritization labels Mar 15, 2023
@olizilla
Author

As a low-risk first pass, we could create the concatenated upload index after each call to upload/add in the upload API. It's fine if the user calls it multiple times and adds CARs incrementally. We can rebuild the index from scratch each time, or read any existing one and update it incrementally if we need something smarter.
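A minimal sketch of the rebuild-from-scratch approach, using in-memory maps as stand-ins for real storage. Every name here (`onUploadAdd`, `shardIndexes`, `uploadShards`, `uploadIndexes`) is hypothetical and nothing below is the actual upload API. Keeping each index next to its CAR CID, rather than joining the raw bytes, is also an assumption of this sketch.

```typescript
// Hypothetical in-memory stand-ins; the real upload API and storage layer are not shown here.
type ShardIndex = Uint8Array; // opaque bytes of one CAR index

const shardIndexes = new Map<string, ShardIndex>(); // CAR CID -> index bytes
const uploadShards = new Map<string, Set<string>>(); // root CID -> CAR CIDs added so far
const uploadIndexes = new Map<string, Array<[string, ShardIndex]>>(); // root CID -> rollup

// Called after each upload/add: record the new shards, then rebuild the rollup.
function onUploadAdd(root: string, shards: string[]): void {
  const known = uploadShards.get(root) ?? new Set<string>();
  shards.forEach((shard) => known.add(shard));
  uploadShards.set(root, known);

  // Rebuild from scratch: simple, and repeating a call with the same shards
  // produces the same result.
  const rollup: Array<[string, ShardIndex]> = [];
  Array.from(known)
    .sort()
    .forEach((carCid) => {
      const index = shardIndexes.get(carCid);
      // Skip shards whose index is not available yet.
      if (index !== undefined) rollup.push([carCid, index]);
    });
  uploadIndexes.set(root, rollup);
}
```

Rebuilding on every call costs one read per known shard index, which is wasteful for large uploads. That is the trade-off against reading the existing rollup and appending to it.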

@olizilla
Author

Could be one for https://github.com/mikeal/multiblock

@alanshaw
Member

alanshaw commented Apr 21, 2023

Flagging that we wouldn't be able to simply concat them together, because we'd lose the information about which CAR file each block is in (we'd effectively be creating a single index for multiple CARs). So in the rollup we'd need to include the CID of the CAR that each block can be found in.
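To illustrate the point, here is a sketch in which each rollup entry carries the CID of the CAR it came from. The types `IndexEntry` and `RollupEntry` and the functions `rollup` and `locate` are hypothetical shapes for illustration, not the real CAR index format.

```typescript
// Hypothetical shapes for illustration; not the actual CAR index encoding.
type IndexEntry = { blockCid: string; offset: number };
type RollupEntry = { carCid: string; blockCid: string; offset: number };

// Flatten per-CAR indexes into one list, tagging every entry with its CAR CID.
// Without the tag, an offset alone would not say which CAR it applies to.
function rollup(indexes: Map<string, IndexEntry[]>): RollupEntry[] {
  const out: RollupEntry[] = [];
  indexes.forEach((entries, carCid) => {
    entries.forEach((e) => {
      out.push({ carCid, blockCid: e.blockCid, offset: e.offset });
    });
  });
  return out;
}

// Find the CAR and offset for a block using only the rollup.
function locate(entries: RollupEntry[], blockCid: string): RollupEntry | undefined {
  return entries.find((e) => e.blockCid === blockCid);
}
```

In the usage below, two CARs each hold a block at the same offset, so only the `carCid` tag distinguishes them:

```typescript
const indexes = new Map<string, IndexEntry[]>([
  ["carA", [{ blockCid: "block1", offset: 59 }]],
  ["carB", [{ blockCid: "block2", offset: 59 }]],
]);
const hit = locate(rollup(indexes), "block2");
console.log(hit?.carCid, hit?.offset);
```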

@mikeal

mikeal commented Apr 21, 2023

+1 if “concatenate” means “write each index as a block in a CAR along with an object that maps the CAR CIDs to each Index CID” ;)

@alanshaw
Member

alanshaw commented Jun 5, 2023

There's a new index in town: https://github.com/alanshaw/cardex#multi-index-index

TL;DR: it's a CARv2 index that is a list of (car-cid, carv2-index) pairs.

I'm going to try out rollups using this index.
