support enumerating known codes #58

mvdan · 2021-11-10T16:51:54Z

The table has four columns, ignoring the human-readable description:

Name
Tag (multihash/ipld/multiaddr/serialization/etc)
Code (i.e. integer value)
Status (draft/permanent)

Downstreams like go-ipfs want to be able to enumerate some of these known codes. For example, to tell the user which IPLD codecs are supported, they'd rather not hard-code the list, as it may grow over time.

One option is to indeed tell downstreams to hard-code the sets of multicodecs they support. This can be reasonable if the list is small or won't change much over time, and especially if explicit support needs to be added for more multicodecs, e.g. by implementing more IPLD codecs with their Encode/Decode interface.

Another option is to expose an API here. The internal mechanics could be code-generated, so they don't worry me. The API is the trickier bit. Below are some options:

1. Exported lists, such as var TagMultihash = []Code{...}. A set of these lists for Tag, and perhaps another for Status.

2. A single exported list with all fields, such as:

type TaggedCode struct {
    Code // inherits String method
    Tag string // or perhaps an enum-like integer
    Status string // or perhaps "Draft bool"
}

var AllCodes = []TaggedCode{...}

If the user wants a filter in this scenario, such as by Tag, they would iterate over the list and filter as necessary.

3. A query-like API, such as:

func ListCodes(byTag, byStatus string, fn func(Code) bool) { ... }

Internally, this would have to use a mechanism like option 1 or 2, so it doesn't actually help a ton. Another major drawback is you'd have to iterate over the entire list to know the number of them.
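To make the trade-off concrete, here is a minimal sketch of option 3 layered on top of option 2. It is not the package's API: the handful of table rows, the empty-string-means-any-value convention, the early-stop meaning of the callback's return value, and the helpers countCodes and summary are all assumptions made for illustration.

```go
package main

import "fmt"

// Code is a stand-in for the package's multicodec code type.
type Code uint64

// TaggedCode follows the shape proposed in option 2.
type TaggedCode struct {
	Code
	Tag    string
	Status string
}

// AllCodes holds a few illustrative rows only; a real list would be
// code-generated from the full table.
var AllCodes = []TaggedCode{
	{0x12, "multihash", "permanent"}, // sha2-256
	{0x55, "ipld", "permanent"},      // raw
	{0x70, "ipld", "permanent"},      // dag-pb
	{0x71, "ipld", "permanent"},      // dag-cbor
}

// ListCodes calls fn for every code matching both filters.
// Assumptions: an empty filter matches anything, and fn returning
// false stops the iteration early.
func ListCodes(byTag, byStatus string, fn func(Code) bool) {
	for _, tc := range AllCodes {
		if byTag != "" && tc.Tag != byTag {
			continue
		}
		if byStatus != "" && tc.Status != byStatus {
			continue
		}
		if !fn(tc.Code) {
			return
		}
	}
}

// countCodes shows the drawback mentioned above: the only way to learn
// how many codes match is to walk the whole list.
func countCodes(byTag, byStatus string) int {
	n := 0
	ListCodes(byTag, byStatus, func(Code) bool {
		n++
		return true
	})
	return n
}

// summary is a hypothetical helper that formats a count for display.
func summary(tag string) string {
	return fmt.Sprintf("%d codes tagged %q", countCodes(tag, ""), tag)
}

func main() {
	fmt.Println(summary("ipld"))
	fmt.Println(summary("multihash"))
}
```

With a plain exported slice, the same count is just a loop with a counter, which is part of why the callback form adds little.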

--

I think that, if we want to do this, options 1 or 2 seem best. I lean towards a minimal version of number 1 - expose slice variables for each of the tags, as that seems to be what the vast majority of users will want to filter by.


mvdan · 2021-11-10T16:54:05Z

The only major reason I can think of not to do this, beyond a potential "downstreams should only list the multicodecs they explicitly support", is that we don't want to bloat go-multicodec to the point that increasing the size of the CSV table will also significantly increase Go binary sizes.

Assuming we just add global slice variables, though, those should be entirely missing from a linked Go binary as long as nothing uses them.

mvdan · 2021-11-11T14:41:18Z

From @schomatis and @aschmahmann on Slack: they would also like to obtain information about a specific Code, such as whether a user-supplied Code has tag==ipld. So perhaps we can add a Tag() string method to the Code type.
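A rough sketch of what such a per-code lookup could look like, assuming a generated switch statement and an empty string for unknown codes. The constant names, the isIPLD and describe helpers, and the unknown-code behaviour are illustrative assumptions, not what the package necessarily does.

```go
package main

import "fmt"

// Code is a stand-in for the package's multicodec code type.
type Code uint64

// A few sample constants; the real package would generate these.
const (
	Sha2_256 Code = 0x12
	Raw      Code = 0x55
	DagPb    Code = 0x70
)

// Tag returns the tag column for a known code.
// Assumption: unknown codes yield the empty string.
func (c Code) Tag() string {
	switch c {
	case Sha2_256:
		return "multihash"
	case Raw, DagPb:
		return "ipld"
	}
	return ""
}

// isIPLD answers the question raised on Slack for a user-supplied code.
func isIPLD(c Code) bool {
	return c.Tag() == "ipld"
}

// describe is a hypothetical helper for printing a code with its tag.
func describe(c Code) string {
	return fmt.Sprintf("0x%x tag=%q", uint64(c), c.Tag())
}

func main() {
	for _, c := range []Code{Sha2_256, Raw, DagPb} {
		fmt.Println(describe(c), "ipld:", isIPLD(c))
	}
}
```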

willscott · 2021-11-11T15:46:47Z

This does get embedded in a bunch of places. Would be nice to retain a way that doesn't grow larger if possible.
I am in support of being able to enumerate codes.

mvdan · 2021-11-11T15:53:21Z

@willscott keeping the list of all codes (be it in total, or by tag) necessarily requires keeping that list somewhere, so I'm not sure we can work around the binary/memory size growing larger if the list is actually used.

So I'm not sure what you mean when you say you want to be able to enumerate the codes, but also don't want your program to grow larger over time :)

The factor of growth should be tiny if we just keep very little information for each entry, though.

willscott · 2021-11-11T15:56:38Z

(Meaning that I don't want to also include the tag-based lookup.)

mvdan · 2021-11-11T16:07:31Z

Perhaps a bit of both. How about:

func KnownCodes() []Code
// a func gives us room to tightly pack the list in the generated code, and expand it to a flat []Code on first use

func (Code) Tag() string // or perhaps an enum? don't think it really matters

Then, if someone wants to filter, they can loop and filter themselves.
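As a sketch of how this could fit together, the example below packs the codes as inclusive ranges and expands them into a flat slice on first use, then filters by tag in the caller. The range-based packing, the sync.Once guard, the sample rows, and the codesWithTag helper are assumptions chosen for illustration; the thread does not say how the generated data would actually be stored.

```go
package main

import (
	"fmt"
	"sync"
)

// Code is a stand-in for the package's multicodec code type.
type Code uint64

// String formats a code as hex for printing.
func (c Code) String() string {
	return fmt.Sprintf("0x%x", uint64(c))
}

// codeRange is a hypothetical packed form: an inclusive run of
// consecutive codes.
type codeRange struct{ first, last Code }

// packed stands in for generated data; only a few sample rows.
var packed = []codeRange{
	{0x12, 0x12}, // sha2-256
	{0x55, 0x55}, // raw
	{0x70, 0x71}, // dag-pb, dag-cbor
}

var (
	expandOnce sync.Once
	flat       []Code
)

// KnownCodes expands the packed ranges into a flat slice on first use.
// The returned slice is shared, so callers should treat it as read-only.
func KnownCodes() []Code {
	expandOnce.Do(func() {
		for _, r := range packed {
			for c := r.first; c <= r.last; c++ {
				flat = append(flat, c)
			}
		}
	})
	return flat
}

// Tag returns the tag for the sample codes, or "" otherwise.
func (c Code) Tag() string {
	switch c {
	case 0x12:
		return "multihash"
	case 0x55, 0x70, 0x71:
		return "ipld"
	}
	return ""
}

// codesWithTag is the loop-and-filter a caller would write themselves.
func codesWithTag(tag string) []Code {
	var out []Code
	for _, c := range KnownCodes() {
		if c.Tag() == tag {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	fmt.Println("known:", KnownCodes())
	fmt.Println("ipld:", codesWithTag("ipld"))
}
```

Because callers only see a function, the storage behind it can change later without breaking them.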

Right now, this is simply backed by a code-generated slice, but the API being a function gives us some wiggle room in the future. The test simply ensures the list is reasonably sane: that it has many codes, no unexpected duplicates, and that a few known ones are present in it. Updates #58.
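The sanity checks described there could look roughly like the following. This is a sketch under assumptions, not the test from the linked change: the checkCodes helper, its minimum-count threshold, and the sample input are all hypothetical.

```go
package main

import "fmt"

// Code is a stand-in for the package's multicodec code type.
type Code uint64

// checkCodes reports the first problem found in a list of codes:
// too few entries, a duplicate, or a missing must-have code.
func checkCodes(codes []Code, minCount int, mustHave ...Code) error {
	if len(codes) < minCount {
		return fmt.Errorf("got %d codes, want at least %d", len(codes), minCount)
	}
	seen := make(map[Code]bool, len(codes))
	for _, c := range codes {
		if seen[c] {
			return fmt.Errorf("duplicate code 0x%x", uint64(c))
		}
		seen[c] = true
	}
	for _, c := range mustHave {
		if !seen[c] {
			return fmt.Errorf("missing expected code 0x%x", uint64(c))
		}
	}
	return nil
}

func main() {
	// Sample input; a real test would pass the result of KnownCodes().
	codes := []Code{0x12, 0x55, 0x70, 0x71}
	if err := checkCodes(codes, 3, 0x12, 0x55); err != nil {
		fmt.Println("list looks wrong:", err)
		return
	}
	fmt.Println("list looks sane")
}
```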

mvdan mentioned this issue Nov 10, 2021

Use standard IPLD codec names across the CLI/HTTP API ipfs/kubo#8471

Closed

mvdan mentioned this issue Nov 11, 2021

split tag=ipld into tag=cid and tag=ipldcodec multiformats/multicodec#242

Closed

mvdan mentioned this issue Nov 23, 2021

tidy code generation, add KnownCodes, add Code.Tag method #59

Merged

mvdan closed this as completed in 1418219 Dec 2, 2021
