Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Add new setup for testing explain functionality #949

Merged
merged 7 commits into from
Nov 27, 2022

Conversation

shahzadlone
Copy link
Member

@shahzadlone shahzadlone commented Nov 18, 2022

Relevant issue(s)

Resolves #521
Blocks #948 and #326

Description

This PR adds the testing setup for the explain feature to be more flexible and easier to manage going forward.

This new setup will handle all explain types: simple, debug, execute, predict

The PR introduces three different types of test cases for asserting the results from an explain request. These can be asserted in one test case or smaller separate cases:

  1. Being able to assert full explain graph or just quickly see the full graph to debug (like before).
  2. Only matching the plan node ordering.
  3. Targeting specific attributes only (option to also select their children nodes, or just it's attributes)

For Reviewers

As this is a large PR here are some tips for reviewers.

Would strongly suggest reviewing commit by commit

  • 1st commit is just a Makefile rule that can be omitted.
  • 2nd commit is just moving the explain tests into the new explain directory and adhering the common schema (and some whitespace fixes for consistent styling across all explain tests). This commit can be ignored.

The main commits to review would be these commits (in this order):

This commit contains concrete example tests that are all passing (for demonstration purposes):

Need help deciding:

  1. Should the conversion of old to new tests happen in this same PR or separately? Would prefer separate. If separate I would like to leave the example tests in as the are already asserting the functionality, and drop these in the next PR ones all explain tests are converted.
    Resolved: Will do the conversion in next PR, will keep the tests and not drop the commit that was to be dropped.

  2. The ExpectedTarget functionality was implemented assuming ordered mapping is maintained (I blame it on my C++ past haha), but since go's maps are random ordered, the OccurancesToSkip feature that helped select a target without a path order would need a rethink (perhaps conversion to sorted keys, i.e. ordered map for target matching).
    Resolved: I did a trick here, rather than traversing the map normally (which will make no guarantees on the ordering, i.e. random), I collected all the keys of the map and sorted them, then traversed and asserted the map in that sorted order.
    Note: This deterministic assertion case is only important for the ExpectedTargets as we need it to consistently find the exact target whose attributes we want to assert.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

Locally and CI

Specify the platform(s) on which this was tested:

  • Arch Linux

@shahzadlone shahzadlone added area/testing Related to any test or testing suite code quality Related to improving code quality area/planner Related to the planner system labels Nov 18, 2022
@shahzadlone shahzadlone added this to the DefraDB v0.4 milestone Nov 18, 2022
@shahzadlone shahzadlone self-assigned this Nov 18, 2022
@shahzadlone shahzadlone changed the title Lone/test/refactor explain tests test: Refactor of explain tests Nov 18, 2022
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch 2 times, most recently from 8155d46 to 36da36a Compare November 22, 2022 13:44
@shahzadlone shahzadlone requested a review from a team November 22, 2022 14:01
@fredcarle
Copy link
Collaborator

The implementation seems logical to me. @AndrewSisley will probably have a stronger opinion than I do on this one so you might want to wait for his feedback.

I did spot a few things that I'll review properly once this goes out of draft but not implementation related.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the last commit (Add the explain tests refactor setup.), I like it and added some localized todos. Would be really handy if you amended a couple of example tests into this commit though, as I'm having to guess/imagine what they might look like.

tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
@shahzadlone
Copy link
Member Author

I reviewed the last commit (Add the explain tests refactor setup.), I like it and added some localized todos. Would be really handy if you amended a couple of example tests into this commit though, as I'm having to guess/imagine what they might look like.

Thanks for the feedback! I do have an example test that contains the old way and the new way for a side-by-side comparison (in the second last commit).

File: tests/integration/explain/simple/old_and_new_example_test.go
Commit: 5764d7b

@AndrewSisley
Copy link
Contributor

AndrewSisley commented Nov 22, 2022

Thanks for the feedback! I do have an example test that contains the old way and the new way for a side-by-side comparison (in the second last commit).

😆 need to shuffle those commits around lol - the example (commit) makes use of stuff contained in a later commit 😁 I didn't think to look there

Update: I looked at the example test - looks really nice :)

@shahzadlone
Copy link
Member Author

Thanks for the feedback! I do have an example test that contains the old way and the new way for a side-by-side comparison (in the second last commit).

😆 need to shuffle those commits around lol - the example (commit) makes use of stuff contained in a later commit 😁 I didn't think to look there

Haha yea will do, sorry about the confusion.

@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch 3 times, most recently from d55babf to c673038 Compare November 24, 2022 09:27
@shahzadlone shahzadlone changed the title test: Refactor of explain tests test: Add new setup for testing explain functionality Nov 24, 2022
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch from c673038 to 6b809a7 Compare November 24, 2022 09:37
@shahzadlone shahzadlone added the action/no-benchmark Skips the action that runs the benchmark. label Nov 24, 2022
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch 2 times, most recently from aa53987 to e5afb14 Compare November 24, 2022 10:47
I found myself wanting to quickly see if my changes to the test files
still compiles the tests. Hence this rule.
Also fix all the tests whose keys and other attributes changed due to
having a different schema (all mutation explain tests were moved to use the
previous schema used by query explain tests.).
This commit makes some changes that would let us reuse some code that
will also be needed for the new explain setup. As the explain testing
setup will build on top of this, I wanted this to be a separate commit.
Hopefully not too bad to review.
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch from e5afb14 to f91840f Compare November 24, 2022 12:37
Example to see the old and new testing sie by side.

Note: This commit will be dropped before merge, it is there only to
help visualize the usage of the new explain test setup.
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch from f91840f to f0cd96f Compare November 24, 2022 17:03
@shahzadlone shahzadlone marked this pull request as ready for review November 24, 2022 17:32
@shahzadlone shahzadlone requested review from AndrewSisley and a team November 24, 2022 17:32
Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really nice Shahzad - I really like the design and am curious as to how it will feel when writing tests :) Added a couple of minor/localized comments that would be good to sort out before merge.

RE: Should the conversion of old to new tests happen in this same PR or separately?
I think separately works well, is already a lot of work in this PR and stuff will get missed if it continues to grow (plus any new tests written in the meantime wont have access to the new stuff and will just add more work to this PR)

RE: The ExpectedTarget functionality was implemented assuming ordered mapping is maintained
😁 I'd made that error a few times lol - I thought the production explain stuff was ordered though? I'd suggest making the actual production explain result deterministic if it is not.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice change that helps with separation of concerns. I like that.

I was looking at the reference issue and I fail to understand how being able to assert only the node or the contents of the nodes or the full graph helps with the reason behind the issue. The only way to prevent having to change the explain test when doing changes would be to only use the node ordering match. The full graph or the node content tests would have to be changed. Does this mean that most of the time we will only want to test for node ordering and nothing else?

As for your questions:
1- In a separate PR is probably better since this is already quite large. Specially in case you come across conflicts
2- Not sure I understand the question. Are you saying the OccurrencesToSkip doesn't work at the moment? From what I can tell, the skip feature you've implemented skips to a child and not to a sibling so the map ordering doesn't apply here. Can you give more details so I can understand better? :)

@@ -27,6 +27,8 @@ import (
ds "github.com/ipfs/go-datastore"
"github.com/stretchr/testify/assert"

"github.com/sourcenetwork/immutable"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Simply to follow a common approach, I'm wondering if we should join this with the block of external packages (as I did when integrating it) or should external sourcenetwork packages have their own block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any preference, but this seems too tedious to track long-term without a lint rule.

  • Will do for now

tests/integration/explain/simple/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Show resolved Hide resolved
tests/integration/explain/utils.go Outdated Show resolved Hide resolved
tests/integration/explain/utils.go Show resolved Hide resolved
switch r := actualResult.(type) {
case map[string]any:
for key, value := range r {
if isPlanNode(key) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Since there is nothing after the if, I suggest you make it more obvious that if it's not a plan node, we just go to the next result

if !isPlanNode(key) {
  continue
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a personal preference thing. IMO is simpler and easier to follow the flow without a continue label. Leaving as is.

actualResults []map[string]any,
) {
for _, actualResult := range actualResults {
foundActualTarget, _, isFound := findTargetNode(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Rename to foundTarget

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever it comes to asserting tests, the most prominent terms I have seen being used to compare results are Expect vs Actual.

  • The word 'Actual' signals to me that this is the result that is returned in reality.
  • The word 'Expect' signals to me that this is an assumed result.

The term Result or Target alone IMO can be ambiguous and can mean either expected or actual results. If writing better names like these can help another developer (or myself in the future) to quickly distinguish which is which, I think it is worth the long name.

Leaving all suggestions of renaming by removal of the word Actual as is.

For example:
- Fail if given an empty request for a non-transactional test.

One additional thing I crammed into this commit was to handle a list of
map to also be copied while copying a map.
@shahzadlone shahzadlone force-pushed the lone/test/refactor-explain-tests branch from 35d61a7 to eb51055 Compare November 25, 2022 13:47
Also changes the example tests to ensure the order works, and
sort the keys in sorted order of map (for more visual clarity).
@shahzadlone
Copy link
Member Author

shahzadlone commented Nov 25, 2022

I thought the production explain stuff was ordered though? I'd suggest making the actual production explain result deterministic if it is not. - @AndrewSisley

The moment you have map in go, the order of the results is random. For example, results from all (explain or non-explain) request types will only have the topmost level result (the immediate maps under the []) ordered as it's a []map[string]any but the fields within the map[string]any are not going to be ordered.

I do agree though, that these should be deterministic, but my hunch is this is not going to be a clean solution to do in go.

So for my problem:
I did a trick here, rather than traversing the map normally (which will make no guarantees on the ordering, i.e. random), I collected all the keys of the map and sorted them, then traversed and asserted the map in that sorted order. Which seems to have solved the problem. This is done in the commit: Fix - Traverse map for targets in sorted order

@fredcarle
Copy link
Collaborator

So for my problem: I did a trick here, rather than traversing the map normally (which will make no guarantees on the ordering, i.e. random), I collected all the keys of the map and sorted them, then traversed and asserted the map in that sorted order. Which seems to have solved the problem. This is done in the commit: Fix - Traverse map for targets in sorted order

I like that solution :)

@shahzadlone
Copy link
Member Author

@fredcarle

I was looking at the reference issue and I fail to understand how being able to assert only the node or the contents of the nodes or the full graph helps with the reason behind the issue. The only way to prevent having to change the explain test when doing changes would be to only use the node ordering match. The full graph or the node content tests would have to be changed. Does this mean that most of the time we will only want to test for node ordering and nothing else?

There were a few problems with the current setup:

  1. Had to assert the full result of the explain graph, some graphs were insane to maintain (for example this was removed: https://github.com/sourcenetwork/defradb/blob/lone/chore/maintain-explain-tests-that-were-deleted/tests/integration/query/explain/mix_test.go)
  2. We couldn't target a specific node and only compare its attributes.
  3. If plan graph was ever tweaked we would need to now change all explain tests (for example there was a commit that removed renderNode and since every test has the full graph with attributes and the full ordering now we need to change all of them (hectic to maintain)
  4. Couldn't isolate ordering of nodes vs localized assertions of only attributes (because of 2 and 3).

Why we need ExpectedTargets

One other solution to solve the problem of only targeting and asserting attributes of a node would have been to specify the full path upto that target, instead of the current approach of specifying how many occurrences to skip of a similar matching node. For example a test to get the first scanNode can look like this:

"explain": dataMap{
   "selectTopNode": dataMap{
   	"sumNode": dataMap{
   		"selectNode": dataMap{
   			"parallelNode": []dataMap{
   				{
   					"typeIndexJoin": dataMap{
   						"root": dataMap{
   							"scanNode": dataMap{},
   						},
   					},
   				},
   			},
   		},
   	},
   },
},

However the problem with this solution would have been that now we are again dependent on the pattern and ordering of the nodes, and if in future say selectTopNode was removed, now you would have to change all the tests that target a specific attribute which used this style. In contrast the implementation in this PR would not need to be changed as the assertion does not depend on the full path / pattern.

Why we need ExpectedPatterns

Even though we didn't want to assert the whole graph all the time, we still want to ensure that the ordering of the plan nodes (i.e. pattern) is maintained, so this helps assert that.

Why we need ExpectedFullGraph

If ever want to assert the full graph, or debug the full graph while writing the tests this is very handy.

Are you saying the OccurrencesToSkip doesn't work at the moment?

It works but for target nodes that exist across say multiple parallelNodes and can also be under another sibling node, and due to the map being unordered it wasn't behaving properly for that specific case all the time (it would pass 50% of the time). This is now I think fixed with the commit I added with traverses and asserts in a sorted order of the map (even though map in unordered, seems to do the trick).

From what I can tell, the skip feature you've implemented skips to a child and not to a sibling so the map ordering doesn't apply here. Can you give more details so I can understand better? :)

Incorrect, the previous version did also check the siblings and so does the current implementation, which is why ordering was an issue. You can look at the example in the file example_new_join_test.go should make it clear (I marked the nodes that I am targeting with each OccurancesToSkip argument, if you scroll all the way to the bottom). Also can look at the commit: pr: Fix - Traverse map for targets in sorted order for more insight.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fredcarle
Copy link
Collaborator

Overall a very nice improvement. Good job :)

@shahzadlone shahzadlone merged commit 5d98a17 into develop Nov 27, 2022
@shahzadlone shahzadlone deleted the lone/test/refactor-explain-tests branch November 27, 2022 15:56
shahzadlone added a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
)

- Resolves sourcenetwork#521

- Description:
This PR adds the testing setup for the `explain` feature to be more flexible and easier to manage going forward.

This new setup will handle all explain types: `simple`, `debug`, `execute`, `predict`.

The PR introduces three different types of test cases for asserting the results from an `explain` request. These can be asserted in one test case or smaller separate cases:
1) Being able to assert full explain graph or just quickly see the full graph to debug (like before).
2) Only matching the plan node ordering.
3) Targeting specific attributes only (option to also select their children nodes, or just it's attributes)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/planner Related to the planner system area/testing Related to any test or testing suite code quality Related to improving code quality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Explain-related tests
3 participants