-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Change from protobuf to cbor for IPLD #2604
refactor: Change from protobuf to cbor for IPLD #2604
Conversation
e08fe9a
to
e9a6d3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good Fred, thanks :) Code is much nicer. I only have a handful of small comments for you :)
// Priority is used to determine the order in which concurrent updates are applied. | ||
Priority uint64 | ||
// IsCreate is true if this update is the creation of a new document. | ||
IsCreate bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Thanks for simplifying this :)
question: Is this going to work long term? Is there still a priority encoded within the Block
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the priority is in the block itself and it's not used while syncing. Only for merging.
internal/core/block/block.go
Outdated
} | ||
|
||
func mustSetSchema() schema.Type { | ||
ts, err := ipld.LoadSchemaBytes([]byte(` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: I can see this getting out of sync quite easily, and whilst that will probably get flagged pretty quickly by tests, it may not be obvious what the error is to someone who doesnt work much in this space (and it is better that they dont mess up in the first place).
Particularly problematic may be the FooDelta
s, as they don't live in this package even.
I have two suggestions to improve this, firstly that this string is broken up, so that the types define their respective schema (making the string and type closer in proximity) - mustSetSchema
then will concat all the strings required, the strings could be consts, or they could perhaps be consts within a function on a new interface implemented by each type.
Secondly, please document that this string exists, on the types that are defined within it, including the ones in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
e9a6d3a
to
6f7d5c0
Compare
reviewing now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one tiny todo in the PR, and another possible small one messaged over discord :)
// IPLDSchemaBytes returns the IPLD schema representation for the DAGLink. | ||
// | ||
// This needs to match the [DAGLink] struct or [mustSetSchema] will panic on init. | ||
func (l DAGLink) IPLDSchemaBytes() []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Thanks for moving this Fred, I like the new way of organising these :)
todo: Please add a var _ schemaDefinition = (*DAGLink)(nil)
line, and similar for all relevant types. If nothing else it is more useful (and safer) documentation than the comment IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already taken care of by a compile time check on line 37 of block.go
. No need to worry about doing that for every type individually. It will automatically be taken care of for every new type that we add to mustSetSchema
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall, big fan of these changes, and even moreso knowing what your next steps are, especially with the net
package.
Couple of things ive noted in this PR, nothing giant, but one or two notable items.
// This needs to match the [Block] struct or [mustSetSchema] will panic on init. | ||
func (b Block) IPLDSchemaBytes() []byte { | ||
return []byte(` | ||
type Block struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion/question: This might be a good time to add a block format version number to the blocks, for future upgradeability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is worth a large discussion likely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be simpler for the version to be specified as a CID multicodec. I think its worth a larger discussion like you said.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue created: #2622
We can figure this out before the end of the cycle. Let's not block this PR for this.
// This needs to match the [CompositeDAGDelta] struct or [coreblock.mustSetSchema] will panic on init. | ||
func (delta *CompositeDAGDelta) IPLDSchemaBytes() []byte { | ||
return []byte(` | ||
type CompositeDAGDelta struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: Im not the biggest fan of these IPLD schema definitions defined inline like this. Its not a hug problem.
One possibility is to define the IPLD schemas in their own files, similar to a .proto
file, and use the go embed directives to have access to the bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's a small amount of code/text, I find it more convenient to have it right there as you read the Go code. I don't see the benefit of separating them into their own files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to split them into their own files we'd probably want to force a one-one file correspondence with the Go code too otherwise which .proto
file goes with which .go
file would probably be a bit confusing - and I'm not sure we'd want to force that on the Go code, and splitting the go files would be better done in another PR anyway (I've not checked to be sure, but I don't think they are one type per file atm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are one type per file :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its just A) Somewhat ugly B) not well structured, and doesn't lend itself to working with all the IPLD schemas independent of the delta implementations.
Not necessarily about the amount of code the schemas create.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A) That is subjective.
B.1) It's well structured in the sense that it is right next to the Go type that represents it.
B.2) Why would we want to use them independently of the delta structs? They are merely a representation of the same thing. The IPLD schema is used to help translate from an IPLD node representation to the corresponding Go struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A) absolutely
B1) I meant not well structured as it's just some arbitrary inline byte string inside a method
B2) Now that we have dedicated IPLD schemas defined, it opens up a whole world of things we can do with them. Eg generate non Go serialization bindings, use the schemas in docs, can have util functions that operate on the schemas themselves, etc.
The reason I made the proto analogy, is that much like protobuf it actually has nothing to do with Go. IPLD schemas are general and language independent, and we're losing out on some potential functionally to have them inline like this.
This isn't a PR blocker, just a general design point for this new concept being brought into the codebase with this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
B2) Ah!. The usage outside of Go entirely is something I wasn't considering in this conversation. That is not what I interpreted when you said "independent of the delta implementation". That merits a fair amount of weight in how we decide to structure things. Added an issue to track this: #2623
@@ -247,6 +250,27 @@ func TestDocQueue(t *testing.T) { | |||
q.mu.Unlock() | |||
} | |||
|
|||
func getHead(ctx context.Context, db client.DB, docID client.DocID) (cid.Cid, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: We shouldn't have a duplicate getHead
function, with its own implementation, since there is already all this functionality within the merkle/clock
package to be able to get a specific head CID of a document.
Its not the nicest to use, but if there is going to be some util function that takes a store and a docID and returns the head CID, it should be along side the current implementation in the other package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: We also probably don't need the updated Headstore
method on the client.DB
interface, since the datastore.Multistore
already has this method, and the entire client.DB
interface shouldn't be a argument to this function just to get a head CID of a document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
net
package test code should not be referencing the internal/merkle/clock
package. If I had to chose between duplicating code and referencing an internal method, I would overwhelmingly support duplicating the code (or just don't bother and delete the unit tests...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few problems here. One is like Andy mentioned, needing to deal with internal/merkle/clock
for net
tests is not ideal. Also, before this PR, the Headstore
was only available through MultiStore
with an explicit txn
which those test don't use. Saying the entire client.DB
shouldn't be an argument to this function is correct and that is something I want to fix in a lot of places in our codebase. We often use large interfaces when a locally specific one would be more appropriate.
I'd rather not get into this within this PR but I will try to improve the Headstore
utilities shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue created: #2621
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usage within a test or not doesn't make much difference here. This function is reimplementing a somewhat sensitive flow and is relying on the behavior of getHead to be inline with the DB.
If somewhat makes changes to the actual headset or head store for whatever reason, it's not necessarily going to be clear when they run the tests and the net package tests are failing.
The fact that the merkleclock is internal or not also has no bearing.
We have a specific implementation, that is rather core to the entire correctness of the DB with the headset implementation, and that behavior must be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agree that it needs to be consistant. This will be fixed in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yupp perfect
(Note my comment was primarily in response to that it's a test and it's ok to duplicate the functionality, not the need for it to be in the pr)
Cheers
7ef1e57
to
54a3f05
Compare
54a3f05
to
affd3a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm thanks for extracting the follow up items from my review
Relevant issue(s)
Resolves #2603
Description
This PR changes the encoding of the blocks in the blockstore to use cbor instead of protobof. As part of this, we move away from encoding/decoding individual deltas (no more Marshalling and Unmarshalling deltas) and instead, encode/decode at the block level. To do this, we use the new
github.com/ipld/go-ipld-prime
package that uses IPLD schemas to map the encoded blocks to the appropriate fields and CRDT types. The newcore/block
package handles that part of the change.The
net
package is also affected by that change with theDAGSyncer
being replaced by the newipld/linking.LinkSystem
. I expect further usage of theLinkSystem
in a couple subsequent PRs.Tasks
How has this been tested?
make test
Specify the platform(s) on which this was tested: