feat: Add PatchCollection #2402

Merged
merged 9 commits into sourcenetwork:develop from 2389-patch-col
Mar 14, 2024

Conversation

AndrewSisley
Contributor

Relevant issue(s)

Resolves #2389

Description

Adds the PatchCollection command.

Mutating anything but the collection name is currently disabled. We can expand this as we see fit, but for now I'd prefer to keep the initial PR small.

This change means that the Collection Name is no longer always going to be the same as the Schema Name.

I've manually tested the OpenAPI stuff via the playground.

This was in the wrong place, the folders are named after patch actions, and 'index' is not a patch action
@AndrewSisley AndrewSisley added feature New feature or request area/collections Related to the collections system labels Mar 11, 2024
@AndrewSisley AndrewSisley requested a review from a team March 11, 2024 19:41
@AndrewSisley AndrewSisley self-assigned this Mar 11, 2024
@AndrewSisley AndrewSisley added this to the DefraDB v0.11 milestone Mar 11, 2024

codecov bot commented Mar 11, 2024

Codecov Report

Attention: Patch coverage is 81.13208%, with 60 lines in your changes missing coverage. Please review.

Project coverage is 75.13%. Comparing base (d524320) to head (6336395).

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2402      +/-   ##
===========================================
+ Coverage    74.99%   75.13%   +0.14%     
===========================================
  Files          268      269       +1     
  Lines        26017    26334     +317     
===========================================
+ Hits         19511    19785     +274     
- Misses        5181     5213      +32     
- Partials      1325     1336      +11     
Flag Coverage Δ
all-tests 75.13% <81.13%> (+0.14%) ⬆️

Files Coverage Δ
cli/cli.go 100.00% <100.00%> (ø)
cli/schema_patch.go 66.18% <100.00%> (ø)
db/errors.go 67.53% <100.00%> (+6.44%) ⬆️
http/handler_store.go 84.25% <85.19%> (+0.05%) ⬆️
db/txn_db.go 63.37% <64.29%> (+0.07%) ⬆️
http/client.go 52.88% <53.85%> (+0.04%) ⬆️
cli/collection_patch.go 68.18% <68.18%> (ø)
db/description/collection.go 51.11% <44.44%> (-1.18%) ⬇️
db/collection.go 73.23% <88.57%> (+2.16%) ⬆️

... and 9 files with indirect coverage changes


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d524320...6336395. Read the comment docs.

@@ -0,0 +1,68 @@
// Copyright 2023 Democratized Data Foundation
Member

todo:

Suggested change
// Copyright 2023 Democratized Data Foundation
// Copyright 2024 Democratized Data Foundation

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

😁 Will do

  • 2024

Comment on lines 57 to 58
case len(args) >= 1:
patch = args[0]
Member

question: In what case can len(args) > 1?

Contributor Author

I don't really care :) But I've added a param limit, arg length is now 0 or 1.
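
A minimal sketch of how the "0 or 1 argument" rule discussed here could be enforced, using only the standard library. `selectPatch` and its error messages are hypothetical names for illustration and are not taken from the PR; the assumption that a patch file takes priority over the positional argument is also illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// selectPatch is a hypothetical helper: it returns the patch text from either
// a file (if patchFile is set) or from a single positional argument.
// More than one positional argument is rejected.
func selectPatch(args []string, patchFile string) (string, error) {
	switch {
	case len(args) > 1:
		return "", errors.New("expected at most one positional argument")
	case patchFile != "":
		// Assumption: the file takes priority over an inline patch.
		data, err := os.ReadFile(patchFile)
		if err != nil {
			return "", err
		}
		return string(data), nil
	case len(args) == 1:
		return args[0], nil
	default:
		return "", errors.New("no patch provided")
	}
}

func main() {
	inline := `[{ "op": "replace", "path": "/1/Name", "value": "Users" }]`

	patch, err := selectPatch([]string{inline}, "")
	fmt.Println(patch, err)

	_, err = selectPatch([]string{"a", "b"}, "")
	fmt.Println(err)
}
```

With a CLI framework the same limit is usually declared on the command itself rather than checked by hand.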

//
// It will also update the GQL types used by the query system. It will error and not apply any of the
// requested, valid updates should the net result of the patch result in an invalid state. The
// individual operations defined in the patch do not need to result in a valid state, only the net result
Member

question: what is meant by valid state here.

Suggested change
// individual operations defined in the patch do not need to result in a valid state, only the net result
// individual operations defined in the patch do not need to result in a `valid state`, only the net result

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

A valid CollectionDescription mutation, at the moment that means one that only changes the name (or tests, or does nothing).

Comment on lines +126 to +127
// It will also update the GQL types used by the query system. It will error and not apply any of the
// requested, valid updates should the net result of the patch result in an invalid state. The
Member

todo: This sentence reads very odd to me "It will error and not apply any of the requested, valid updates should the net result of the patch result in an invalid state."

Please reword.

Contributor Author

What is odd about it? Do you have a rough alternative in mind?

I think it is important to highlight that the full patch will be rolled back if a given patch item fails - especially given that SQL databases do not do this and require manual rollback when given multiple DDL statements.

Member

@shahzadlone shahzadlone Mar 13, 2024

This sounds clearer when you said: full patch will be rolled back if a given patch item fails

This is very wordy: ... apply any of the requested, valid updates should the net result...

Contributor Author

rolled back is incorrect/lazy lol - the patches are aggregated before being applied - hence the use of net result of the patch

Collaborator

The individual operations defined in the patch do not need to result in a valid state, only the net result of the full patch.

I thought I understood the behaviour from the prior sentence but this one gets me confused. How can the net result be valid if an individual operation creates an invalid state?

Member

the patches are aggregated before being applied

This will be nice to document somewhere too

Contributor Author

I thought I understood the behaviour from the prior sentence but this one gets me confused. How can the net result be valid if an individual operation creates an invalid state?

There are many reasons why we want this to be, and many of the tests cover this, for example:

{ "op": "copy", "from": "/Users/Fields/1", "path": "/Users/Fields/2" },
{ "op": "replace", "path": "/Users/Fields/2/Name", "value": "age" },

The first field is copied (invalid as it is now a duplicate), and then renamed (net is now valid). This allows existing stuff (including full schema/collections) to be used as templates. There are other use cases too.
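
The copy-then-rename case above can be sketched in plain Go. This is an illustration only, under stated assumptions: `field`, `op`, `copyField`, `renameField`, `validate`, and `applyPatch` are hypothetical names, the operations are simplified closures rather than real JSON Patch operations, and the duplicate-name rule stands in for whatever validation actually runs. The point shown is that validation happens once, on the aggregated result, and nothing is applied if it fails.

```go
package main

import (
	"errors"
	"fmt"
)

// field is a hypothetical, simplified stand-in for a schema field.
type field struct {
	Name string
}

// op is a hypothetical, simplified patch operation on a slice of fields.
// It is not the JSON Patch format, just enough to illustrate ordering.
type op func(fields []field) []field

// copyField appends a copy of fields[from] to the end of the slice.
func copyField(from int) op {
	return func(fields []field) []field {
		return append(fields, fields[from])
	}
}

// renameField sets the name of fields[index].
func renameField(index int, name string) op {
	return func(fields []field) []field {
		fields[index].Name = name
		return fields
	}
}

// validate rejects duplicate field names.
func validate(fields []field) error {
	seen := map[string]struct{}{}
	for _, f := range fields {
		if _, ok := seen[f.Name]; ok {
			return errors.New("duplicate field name: " + f.Name)
		}
		seen[f.Name] = struct{}{}
	}
	return nil
}

// applyPatch applies every operation to a copy of the input and validates
// only the final, net result. On failure the original is returned untouched.
func applyPatch(original []field, ops ...op) ([]field, error) {
	working := append([]field(nil), original...)
	for _, o := range ops {
		working = o(working)
	}
	if err := validate(working); err != nil {
		return original, err
	}
	return working, nil
}

func main() {
	fields := []field{{Name: "_docID"}, {Name: "name"}}

	// A copy on its own leaves a duplicate, so the whole patch is rejected.
	_, err := applyPatch(fields, copyField(1))
	fmt.Println(err)

	// A copy followed by a rename is valid as a net result.
	result, err := applyPatch(fields, copyField(1), renameField(2, "age"))
	fmt.Println(result, err)
}
```

Because the operations run against a working copy, no rollback is needed: an invalid net result simply means the copy is discarded.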

This will be nice to document somewhere too

It is, we are talking about the documentation that currently documents it.

"strconv"
"strings"

jsonpatch "github.com/evanphx/json-patch/v5"
Member

question: I forgot if we cared about camel casing package import alias names. If we do then jsonPatch?

Contributor Author

I don't know either, my IDE did this, and I don't see it as being unreadable

@@ -526,6 +529,283 @@ func validateUpdateSchemaFields(
return hasChanged, nil
}

func (db *db) patchCollection(
Member

suggestion: I would much rather the patch stuff be moved to a new file, like: db/collection_patch.go

Contributor Author

All of the core functions for mutating schemas and collections currently live in this file. I agree that it should be reorganized, but not in this PR.

Contributor Author

I've opened a ticket: #2407

continue
}

// DeepEqual is temporary, as this validation is temporary
Member

question: Why is this temporary? Will the use of reflection package be removed?

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

The validation (function) is temporary, which means that if this function is removed, the call to DeepEqual will be removed with it.

Member

The validation (function) is temporary, which means that if this function is removed, the call to DeepEqual will be removed with it.

Why is it temporary?

Contributor Author

Because the function is validating that indexes have not been modified, and we very much want them to be mutable in the future.

Member

Thanks that makes so much sense.

Please document this

Contributor Author

Please document this

It is.

db/collection.go Outdated
for _, oldCol := range oldColsByID {
for _, newCol := range newColsByID {
// It is not enough to just match by the map index, in case the index does not pair
// up with the ID (this can happen if a user moves it)
Member

question: "moves it" - moves what?

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

The collection relative to its index in the map, I'll replace it

  • Replace it

{ "op": "add", "path": "/2", "value": {"Name": "Dogs"} }
]
`,
ExpectedError: "collection ID cannot be zero",
Member

Would add another test where it is explicitly 0:
{ "op": "add", "path": "/2", "value": {"ID": 0, "Name": "Dogs"} }

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

Will add

  • Add explicit zero add test

// If a value is not provided the patch will be applied to all nodes.
NodeID immutable.Option[int]

Patch string
Member

todo: document please

Contributor Author

@AndrewSisley AndrewSisley Mar 13, 2024

My bad :) Of course :)

  • Doc test action Patch prop

Member

@nasdf nasdf left a comment

One minor fix from me.

return store.PatchCollection(cmd.Context(), patch)
},
}
cmd.Flags().StringVarP(&patchFile, "patch-file", "p", "", "File to load a patch from")
Member

todo: This flag doesn't match the example above.

Contributor Author

Nice catch, copy-pasted issue from PatchSchema, I've fixed both.

Member

Don't all other file consumption commands use --file | -f? Why do we want to have a different flag name for this arg (does it clash with -f somewhere)?

Contributor Author

PatchSchema uses p because it takes two kinds of files. I think it is more important to be consistent with PatchSchema than anywhere else here.

Member

How does it take two files? I see 1 so far with -p

Contributor Author

PatchSchema takes two files, I assume you are looking at PatchCollection

Collaborator

Maybe we should change the flag for patch schema to also be -f and make it consistent throughout. The lens file can stay with -t or maybe -l would make more sense.

Contributor Author

@AndrewSisley AndrewSisley Mar 14, 2024

f and file is ambiguous for a command that takes two different files. I do not like that.

PatchSchema used to be f, but when the lens stuff got added it was changed to avoid this confusion.

Collaborator

@nasdf your opinion would be appreciated here when you have a chance.

Member

I think it's fine to keep it -p for consistency with the other patch method. Using -f when there are multiple files is worse in my opinion.

@AndrewSisley AndrewSisley requested a review from nasdf March 13, 2024 20:21
Collaborator

@fredcarle fredcarle left a comment

Just a few comments to resolve before approval :)

func MakeCollectionPatchCommand() *cobra.Command {
var patchFile string
var cmd = &cobra.Command{
Use: "patch [patch]",
Collaborator

thought: I've noticed that in some commands, the optional flags aren't shown in the Use field. I'm wondering if we should start showing all the optional flags here.

Comment on lines +796 to +797
// It is not enough to just match by the map index, in case the index does not pair
// up with the ID (this can happen if a user moves the collection within the map)
Collaborator

question: Can you explain how this can be possible?

Contributor Author

{ "op": "move", "from": "/1", "path": "/2" }

This moves the collection within the map, without mutating the ID.
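
To make the effect of that `move` concrete, here is a small sketch of why old and new collections need to be paired by ID rather than by map key. The `collection` type, the string keys, and `pairByID` are assumptions made for illustration and are not the PR's actual types or helpers.

```go
package main

import "fmt"

// collection is a hypothetical, simplified collection description.
type collection struct {
	ID   uint32
	Name string
}

// pairByID pairs each old collection with the new collection that shares its
// ID, ignoring the map keys entirely. Index 0 of each pair is the old value,
// index 1 is the new value.
func pairByID(oldCols, newCols map[string]collection) map[uint32][2]collection {
	pairs := map[uint32][2]collection{}
	for _, oldCol := range oldCols {
		for _, newCol := range newCols {
			if newCol.ID == oldCol.ID {
				pairs[oldCol.ID] = [2]collection{oldCol, newCol}
			}
		}
	}
	return pairs
}

func main() {
	oldCols := map[string]collection{"1": {ID: 1, Name: "Users"}}

	// State after a move from "/1" to "/2": the key changed, the ID did not.
	newCols := map[string]collection{"2": {ID: 1, Name: "Users"}}

	// Looking up by the old key finds nothing...
	_, foundByKey := newCols["1"]
	fmt.Println(foundByKey)

	// ...but pairing by ID still matches the collection with its original.
	fmt.Println(pairByID(oldCols, newCols))
}
```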

Collaborator

Thanks for clarifying.

Comment on lines +42 to +43
{ "op": "copy", "from": "/1/Name", "path": "/2/Name" },
{ "op": "remove", "path": "/1/Name" }
Collaborator

todo: Can you please add a test showing that if only a copy is applied, we will get a duplicated name error or something like that.

Contributor Author

@AndrewSisley AndrewSisley Mar 14, 2024

I think the value is low, as duplicate names are tested elsewhere, but I will add

  • add copy name test

{ "op": "replace", "path": "/2/ID", "value": 1 }
]
`,
ExpectedError: "collection sources cannot be mutated.",
Collaborator

thought: The error is a little weird here. It's talking about mutating sources when the patch is dealing with the collection ID.

Contributor Author

Yes, unfortunately when mutating IDs it becomes impossible for us to tell what the original object was (and that the ID was mutated), so it will often get caught by other validation rules.

Member

@shahzadlone shahzadlone left a comment

LGTM assuming you will resolve other comments

@AndrewSisley AndrewSisley merged commit c67bc56 into sourcenetwork:develop Mar 14, 2024
31 of 32 checks passed
@AndrewSisley AndrewSisley deleted the 2389-patch-col branch March 14, 2024 19:07
Labels
area/collections Related to the collections system feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add PatchCollection function to Store
4 participants