Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serve Car Files via Filecoin Retrieval Protocol #66

Merged
merged 7 commits into from
Oct 21, 2021

Conversation

hannahhoward
Copy link
Collaborator

@hannahhoward hannahhoward commented Oct 20, 2021

Goals

Setup the provider so it can serve car files to retrieval clients

Implementation

  • Implement cardatatransfer, a module that serves files via Filecoin retrieval protocol (always free) on top of data transfer & graphsync.
    • copy a lot of types over from markets
    • had to bring in cbor-gen unfortunately, bindnode is not quite ready to work with markets types. cc @mvdan for interesting challenges -- the biggest missing components I could see immediately is the lack of support for nillable struct values via pointer types, and the lack of an Any type in the current schema type list.
    • cardatatransfer is backed by a blockstore supplier, which is the CarSupplier
  • Refactor CarSupplier to remove dual put methods, reverse lookup based on path, and simply use contextID for all methods (the backing reason is we really need to do the hashing of filepaths for implicit keys outside of this module, as we need them to generate metadata for Filecoin)
  • import handler now just always assumes a key
  • import command now autogenerates a key
  • if no metadata is present, takes the specified or generated key and makes Filecoin V1 metadata for it so you can use with car supplier
  • added a testutil directory to start sharing commonly used functions

For discussion

  • the command syntax for import is very much a guess as to what we want -- definitely open to feedback and revision
  • you can largely gloss over the retrieval_types.go and the stores implementation -- these are copy pasted from markets
  • I did not add a retrieve CLI command cause it seems odd to spin up a new libp2p host just to sub in for the future retrieval client. but we could.

add a module to serve car files over data transfer using retrieval vouchers
modify car supplier to serve blockstores, modify import handler to generate metadata, create end to
end test
reinstate metadata and update sample car files
@codecov-commenter
Copy link

codecov-commenter commented Oct 20, 2021

Codecov Report

Merging #66 (fdd9113) into main (39604c8) will increase coverage by 0.99%.
The diff coverage is 44.59%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   37.84%   38.84%   +0.99%     
==========================================
  Files          26       30       +4     
  Lines        1403     2003     +600     
==========================================
+ Hits          531      778     +247     
- Misses        807     1049     +242     
- Partials       65      176     +111     
Impacted Files Coverage Δ
cmd/provider/daemon.go 0.00% <0.00%> (ø)
...nternal/cardatatransfer/retrievaltypes_cbor_gen.go 32.32% <32.32%> (ø)
internal/cardatatransfer/cardatatransfer.go 60.68% <60.68%> (ø)
internal/suppliers/car_supplier.go 52.00% <61.90%> (-7.38%) ⬇️
internal/cardatatransfer/stores/stores.go 82.35% <82.35%> (ø)
cmd/provider/import.go 52.94% <86.66%> (+3.81%) ⬆️
internal/cardatatransfer/retrievaltypes.go 100.00% <100.00%> (ø)
server/admin/http/importcar_handler.go 80.00% <100.00%> (+7.77%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 39604c8...fdd9113. Read the comment docs.

Copy link
Member

@willscott willscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

store := dssync.MutexWrap(datastore.NewMapDatastore())
blockStore := blockstore.NewBlockstore(store)
lsys := storeutil.LinkSystemForBlockstore(blockStore)
h, err := libp2p.New(context.Background(), libp2p.ListenAddrStrings("/ip4/0.0.0.0/tcp/1"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be /0?

@rvagg
Copy link
Contributor

rvagg commented Oct 20, 2021

Looks like cbor-gen is being used for maps here, which has its own naive sorting, might not be a good idea (might also be a good chance to help push whyrusleeping/cbor-gen#56 over the line?).
Is it possible to swap maps out for lists? They're much more compact and you get to ignore issues of canonicity since it's always ordered according to your original structure (and why the filecoin chain uses them exclusively).

@mvdan
Copy link
Contributor

mvdan commented Oct 20, 2021

  • had to bring in cbor-gen unfortunately, bindnode is not quite ready to work with markets types. cc @mvdan for interesting challenges -- the biggest missing components I could see immediately is the lack of support for nillable struct values via pointer types, and the lack of an Any type in the current schema type list.

Thanks for the update. Did you write the schema file anywhere, so I can look into those? Or were you inferring the schema from the Go types alone, like cbor-gen?

@hannahhoward
Copy link
Collaborator Author

@rvagg we're going for 1:1 exact CBOR compatibility with go-fil-markets retrieval types, so things like switching from map encoding to list encoding is not an option. That said, if you want to advocate for pushing cbor-gen to do map ordering PR and use this as an example, happy to push on that.

In the meantime they have to remain the same for wire compatibility.

@hannahhoward
Copy link
Collaborator Author

@mvdan there is, as you may have guessed, no schema. These are just the go types as defined by go-fil-markets and encoded with cbor-gen.

However, I can give you the specific examples that will hopefully shed light on what's happening here.

First, often in cbor-gen, pointer types are effectively used to encode optional fields (the obviously conflicts with go-ipld-prime that seperates optional and nullable... not may favorite aspect of the schema design). So for example PieceCID in a retrieval deal proposal is a pointer to a cid, meaning the field is essentially optional. I'd just be happy if we could convert those to nillable fields in a schema and have them work right.

Second, you can see we use cbg.Deferred, a really weird type, for selectors. Basically selectors are ipld.Nodes, and I'd like to have a struct member for a selector that's an ipld.Node and then have a schema with type Any. I guess actually technically selectors have a schema so maybe we could just encode a type schema for them? In case you're wondering, cbg.Deferred is a really weird cbor-gen feature that basically saids "when you get to this field, don't decode it -- just read all the raw byte contents and put them in the field for later decoding" -- it results in some pretty weird code any time you generate retrieval proposals

@hannahhoward hannahhoward merged commit dca0572 into main Oct 21, 2021
@hannahhoward hannahhoward deleted the feat/serve-car-files branch October 21, 2021 01:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants