feat: Add collectionId field to commit field #1235
LGTM. Just a minor nitpick before merge.
LGTM :)
Codecov Report
@@            Coverage Diff             @@
##           develop    #1235     +/-   ##
===========================================
- Coverage    70.71%   70.68%   -0.04%
===========================================
  Files          182      182
  Lines        17206    17225     +19
===========================================
+ Hits         12167    12175      +8
- Misses        4109     4119     +10
- Partials       930      931      +1
Some non-blocking comments, looks good.
client/request/consts.go
@@ -47,6 +47,7 @@ const (
	HeightFieldName = "height"
	CidFieldName = "cid"
	DockeyFieldName = "dockey"
	CollectionIDFieldName = "collectionId"
question: @fredcarle are the Go gods happy with "collectionId" over "collectionID"? I see we have "schemaVersionId" before (but this is a string, not a variable name).
I've been thinking about that one. I'm not sure what is better in this case because that string representation is how it is displayed and used in GraphQL. It depends on whether we want to apply Go-like formatting in the GraphQL representation.
misread, never mind :)
schemaVersionId will be my fault - I regularly forget to uppercase acronyms :P It should be schemaVersionID
Here is what the GraphQL style guides seem to recommend:
- Field names should use camelCase. Many GraphQL clients are written in JavaScript, Java, Kotlin, or Swift, all of which recommend camelCase for variable names.
- Type names should use PascalCase. This matches how classes are defined in the languages mentioned above.
- Enum names should use PascalCase.
- Enum values should use ALL_CAPS, because they are similar to constants.
The Go representation would fit this guide with the difference of acronyms using uppercase. We could be consistent and apply that everywhere so the string representation would become "collectionID". It might be less confusing.
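A minimal sketch of what that consistency could look like, assuming the string values are renamed to follow Go's uppercase-acronym style. DockeyFieldName and CollectionIDFieldName come from client/request/consts.go; SchemaVersionIDFieldName and usesGoInitialism are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// GraphQL-facing field names. The string values stay camelCase,
// but acronyms are uppercased the way Go identifiers are.
const (
	DockeyFieldName          = "dockey"
	CollectionIDFieldName    = "collectionID"
	SchemaVersionIDFieldName = "schemaVersionID" // hypothetical constant name
)

// usesGoInitialism reports whether a field name avoids the
// mixed-case "Id" suffix (hypothetical helper, not part of the PR).
func usesGoInitialism(name string) bool {
	return !strings.HasSuffix(name, "Id")
}

func main() {
	names := []string{
		DockeyFieldName,
		CollectionIDFieldName,
		SchemaVersionIDFieldName,
		"collectionId", // the spelling originally proposed in this PR
	}
	for _, name := range names {
		fmt.Printf("%-16s %v\n", name, usesGoInitialism(name))
	}
}
```

A check like this could sit in a unit test over the constants so that a new "...Id" spelling is caught at review time.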
todo: Change to "collectionID" then
There's a problem with the approach here. Based on the #891 PR, which was based on previous work, we shouldn't be persisting the entire DatastoreKey into the DAG. That issue can be solved separately from this issue in a follow-up PR (more CID changes 😂 yay)
What needs to change in this PR is how we get the CollectionID. At the moment, the CollectionID is a "local" item, compared to something like the dockey or schemaVersionID, which is a "global" item.
The difference is that since this is a Peer-to-Peer database, anything that exists in one DB locally can potentially be replicated to any other DB globally. So when making changes, we need to keep in mind what state is local to the node and can be changed freely, and what state is global to the network.
The reason DocKeys and SchemaVersionIDs are global is that they are based on the CID system, which is a global namespace since it's effectively just a hash.
The CollectionID is just a local sequence number starting at 0 and incrementing for each collection that gets added. It isn't safe to be used in a global context.
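To illustrate the distinction, here is a rough Go sketch. It assumes a simplified model in which each node hands out collection IDs from its own counter, and it uses a plain SHA-256 hex digest as a stand-in for a real CID. The node type and both helpers are hypothetical, not the project's code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// node models one database instance with its own local collection counter.
type node struct {
	nextCollectionID uint32
	collections      map[string]uint32
}

func newNode() *node {
	return &node{collections: map[string]uint32{}}
}

// addCollection assigns the next local sequence number, starting at 0.
func (n *node) addCollection(name string) uint32 {
	id := n.nextCollectionID
	n.collections[name] = id
	n.nextCollectionID++
	return id
}

// contentID stands in for a CID: it depends only on the content,
// so every node computes the same value for the same input.
func contentID(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

func main() {
	a, b := newNode(), newNode()

	// Both nodes hold the same collections, created in a different order.
	a.addCollection("users")
	aBooks := a.addCollection("books")
	bBooks := b.addCollection("books")
	b.addCollection("users")

	// The local sequence numbers disagree between the two nodes.
	fmt.Println("local IDs match:", aBooks == bBooks)

	// The content-derived identifier agrees on both nodes.
	schema := "type Book { name: String }"
	fmt.Println("content IDs match:", contentID(schema) == contentID(schema))
}
```

Persisting the local number in a block that replicates to other peers would therefore point at different collections on different nodes.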
But, all the work in this PR is mostly still necessary, since we do want to expose the CollectionID from the GQL perspective.
So, a nice solution is to omit the CollectionID from the DAG, which isn't explicitly changed in this PR, but comes from #891 incorporating the full DatastoreKey. Since that needs to change (as mentioned, in a follow-up PR), we can still implement the necessary GQL changes without waiting for that change to land.
Basically, instead of getting the CollectionID from the DatastoreKey from within the DAG, we can get the schemaVersionID from within the DAG, and do a lookup for the collection based on the schemaVersionID. This would require making a change to the client.DB interface to expose getCollectionByVersionID, which is currently private on the db type.
In reality, short of adding the new public func, the only line that changes from the current implementation is collectionID, err := strconv.Atoi(dockeyObj.CollectionID).
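A rough sketch of the lookup described above, under stated assumptions: the collection struct, the collectionLookup interface, the in-memory memoryDB, and the method signature are all hypothetical stand-ins. The real client.DB method may take a context and return a different type.

```go
package main

import (
	"errors"
	"fmt"
)

// collection is a hypothetical stand-in for a node-local collection record.
type collection struct {
	ID              uint32 // local sequence number
	Name            string
	SchemaVersionID string // global, content-derived identifier
}

// collectionLookup is a hypothetical slice of the client.DB interface.
// The signature is an assumption made for this sketch.
type collectionLookup interface {
	GetCollectionByVersionID(schemaVersionID string) (collection, error)
}

var errCollectionNotFound = errors.New("no collection for schema version")

// memoryDB is an in-memory implementation used only for illustration.
type memoryDB struct {
	byVersion map[string]collection
}

func (db memoryDB) GetCollectionByVersionID(id string) (collection, error) {
	col, ok := db.byVersion[id]
	if !ok {
		return collection{}, fmt.Errorf("%w: %s", errCollectionNotFound, id)
	}
	return col, nil
}

// localCollectionID resolves the node-local collection ID from the global
// schemaVersionID read out of the DAG, instead of reading it from a
// persisted DatastoreKey.
func localCollectionID(db collectionLookup, schemaVersionID string) (uint32, error) {
	col, err := db.GetCollectionByVersionID(schemaVersionID)
	if err != nil {
		return 0, err
	}
	return col.ID, nil
}

func main() {
	db := memoryDB{byVersion: map[string]collection{
		"example-version-id": {ID: 1, Name: "users", SchemaVersionID: "example-version-id"},
	}}
	id, err := localCollectionID(db, "example-version-id")
	fmt.Println(id, err)
}
```

With this shape, only the global schemaVersionID needs to live in the DAG, and each node maps it to its own local collection ID at query time.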
cc: @AndrewSisley to make sure I've gotten everything correct, and if he has any objections to exposing GetCollectionByVersionID on client.DB.
We previously talked about having a However, we do need the functionality of getting the local
Is a really good catch, and all looks good and sensible - I strongly agree that this PR should change to fetch it the 'right' way, as we are close to the end of the release cycle and it doesn't feel safe to assume that we can publicly expose this as-is and hope we'll get it working correctly in the meantime. It might also be better to prioritise the removal of collectionID from the DAG before the release, as that is a persisted data corruption of sorts. There is actually a ticket to expose
Local data can't be allowed into the (global) DAG. When we sort out multiple collections from the same schema, and if we then need to tie the commit to a local collection for this kind of query, we have to do it without storing it in the 'normal' commit block.
I need a comment to request changes - the reason is RE John's excellent spot
I agree. My comment does not debate that :)
Ah sorry I thought you were suggesting we store the local collectionID for now and then upgrade it to a global collection id later :)
I was saying that's what I was thinking about when I reviewed it. The part you highlighted doesn't imply that it has to be on the DAG. In the short term, though, it probably doesn't matter if the local collectionID is on the DAG as most nodes will be created with the same collections. It would be quite easy to have a global collection ID though. It would just be the hash of the schemaID plus the collection name. That could easily be done in this PR to replace the local collection ID in the DAG.
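For reference, the hash-based suggestion could be sketched roughly as follows. This is only an illustration of the proposal, not something adopted in the PR. The function name, the use of SHA-256, the hex encoding, and the separator byte are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// globalCollectionID derives an identifier from the schema ID and the
// collection name, so every node computes the same value for the same pair.
// Hypothetical helper: the hash function and encoding are assumptions.
func globalCollectionID(schemaID, collectionName string) string {
	h := sha256.New()
	h.Write([]byte(schemaID))
	// A separator keeps ("ab", "c") and ("a", "bc") from colliding.
	h.Write([]byte{0})
	h.Write([]byte(collectionName))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The same inputs produce the same ID on any node.
	fmt.Println(globalCollectionID("example-schema-id", "users"))
	fmt.Println(globalCollectionID("example-schema-id", "users"))
	// A different collection name under the same schema gives a different ID.
	fmt.Println(globalCollectionID("example-schema-id", "books"))
}
```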
I wouldn't say that is a sufficient global ID. It is in the short term, but I'm hoping to get away from that. Once digital signatures land, we can have an easier time with a proper global ID. It's possible to use your suggestion in the short term, but I think it needs a bit more discussion. The downside to my current suggestion is that it limits the DBs to one collection per schema, but that is already a limitation, and won't be solved until #1032 and tangential efforts have been solved.
I also disagree here, as I'm pretty sensitive about what ends up in the DAG. Neither I nor Andy can remember why the full Since we would have to remove it, and break more stuff. The use of
That ticket seems a little more involved, unless I'm reading it wrong.
f8956c1 to 8c17b1b
5c813a9 to 17eb39c
LGTM, thanks Islam :)
@@ -302,7 +303,16 @@ func (n *dagScanNode) dagBlockToNodeDoc(block blocks.Block) (core.Doc, []*ipld.L
	if err != nil {
		return core.Doc{}, nil, err
	}
	n.commitSelect.DocumentMapping.SetFirstOfName(&commit, "dockey", dockeyObj.DocKey)
	n.commitSelect.DocumentMapping.SetFirstOfName(&commit,
nitpick: I think people here do prefer the below, instead of the current (no need to change now, but in future PRs consider this):
n.commitSelect.DocumentMapping.SetFirstOfName(
	&commit,
	request.DockeyFieldName,
	dockeyObj.DocKey,
)
LGTM! Thanks for accommodating the abrupt requirements change.
LGTM
Add collectionID field to commit
Commits can be grouped and ordered by collectionID.
To retrieve a value for collectionID, the method GetCollectionByVersionID is added to the db interface.
Relevant issue(s)
Resolves #849
Description
This PR adds a new "collectionId" field to the commit field. It can now be queried, grouped by, and ordered by.
Tasks
How has this been tested?
Integration tests