test: Improve change detector performance #1433

Merged

Conversation

AndrewSisley
Contributor

@AndrewSisley AndrewSisley commented May 1, 2023

Relevant issue(s)

Resolves #1186
Partially resolves #1342

Description

Improves the change detector performance. Decreases runtime from 17 minutes to 5 minutes on my local machine. It also cleans up the paths used by the change detector (part of #1342).

It does this by changing how the setup stage is handled: instead of calling go test once per test, it now calls it only once per test package, setting up a bunch of test db instances for the entire package.

This does mean that the tests can no longer rely on the use of test.TempDir, and instead the database instances live within the temp directory. This PR does not provide a mechanism to clean those up, but it should be quite easy to do so later. The total disk usage of a full test run is only 44MB (post run; during a run it is 2GB+44MB), so this is unlikely to be an issue.

I believe that more gains can be had for not too much effort by allowing the change detector to run in parallel (at the package level, not per test). I believe this is currently only blocked by the git cloning of the latest target branch, and there are easy ways to remove that limitation (lower priority though): #1436.

Commits should be clean; most of the work is in the last commit, "Do change-detector setup once per package".

Todo immediately after merge:

  • Enable change detector workflow in github (was disabled manually, no workflow files to change)

@AndrewSisley AndrewSisley added perf Performance issue or suggestion ci/build This is issue is about the build or CI system, and the administration of it. action/no-benchmark Skips the action that runs the benchmark. area/cli Related to the CLI binary labels May 1, 2023
@AndrewSisley AndrewSisley added this to the DefraDB v0.5.1 milestone May 1, 2023
@AndrewSisley AndrewSisley requested a review from a team May 1, 2023 17:07
@AndrewSisley AndrewSisley self-assigned this May 1, 2023
@AndrewSisley
Contributor Author

With changes:

✓ api/http (1.211s)
✓ cli (97ms)
✓ client (7ms)
∅ client/request
∅ cmd
∅ cmd/defradb
∅ cmd/genclidocs
∅ cmd/genmanpages
✓ config (111ms)
✓ connor (10ms)
∅ connor/numbers
∅ connor/time
✓ core (10ms)
✓ core/crdt (15ms)
∅ core/net
✓ datastore (52ms)
✓ datastore/badger/v3 (4.461s)
∅ datastore/iterable
✓ datastore/memory (42ms)
✓ db (187ms)
∅ db/base
∅ db/container
∅ db/fetcher
✓ errors (3ms)
✓ events (1.008s)
✓ logging (3.009s)
∅ merkle
✓ merkle/clock (17ms)
✓ merkle/crdt (17ms)
✓ metric (3ms)
∅ net
∅ net/api
∅ net/api/client
∅ net/api/pb
∅ net/pb
∅ net/utils
✓ node (103ms)
∅ planner
∅ planner/mapper
∅ request
∅ request/graphql
∅ request/graphql/parser
✓ request/graphql/schema (13ms)
∅ request/graphql/schema/types
∅ tests/bench
∅ tests/bench/collection (654ms)
∅ tests/bench/fixtures
∅ tests/bench/query/planner (629ms)
∅ tests/bench/query/simple (646ms)
∅ tests/bench/storage (651ms)
∅ tests/integration
✓ tests/integration/cli (36.824s)
∅ tests/integration/collection
✓ tests/integration/collection/create/one_to_many (1.605s)
✓ tests/integration/collection/create/one_to_one (1.575s)
✓ tests/integration/collection/update/one_to_one (1.605s)
✓ tests/integration/collection/update/simple (1.661s)
∅ tests/integration/events
✓ tests/integration/events/simple (2.432s)
∅ tests/integration/explain
✓ tests/integration/explain/default (10.221s)
✓ tests/integration/explain/execute (5.12s)
✓ tests/integration/explain/simple (1.778s)
∅ tests/integration/mutation/inline_array
✓ tests/integration/mutation/inline_array/update (2.685s)
✓ tests/integration/mutation/one_to_many/delete (1.815s)
∅ tests/integration/mutation/one_to_one
✓ tests/integration/mutation/one_to_one/create (2.109s)
✓ tests/integration/mutation/one_to_one/update (2.027s)
∅ tests/integration/mutation/relation
✓ tests/integration/mutation/relation/create (1.93s)
✓ tests/integration/mutation/relation/delete (2.533s)
∅ tests/integration/mutation/simple
✓ tests/integration/mutation/simple/create (2.287s)
✓ tests/integration/mutation/simple/delete (3.62s)
✓ tests/integration/mutation/simple/mix (2.312s)
✓ tests/integration/mutation/simple/special (1.802s)
✓ tests/integration/mutation/simple/update (2.652s)
✓ tests/integration/mutation/simple/update/special (1.807s)
✓ tests/integration/net/order (1.956s)
✓ tests/integration/net/state/one_to_many/peer (2.248s)
✓ tests/integration/net/state/one_to_many/replicator (1.669s)
✓ tests/integration/net/state/simple/peer (1.835s)
✓ tests/integration/net/state/simple/peer/subscribe (1.729s)
✓ tests/integration/net/state/simple/peer_replicator (1.737s)
✓ tests/integration/net/state/simple/replicator (1.757s)
✓ tests/integration/query/commits (6.858s)
✓ tests/integration/query/inline_array (8.595s)
✓ tests/integration/query/latest_commits (3.094s)
✓ tests/integration/query/one_to_many (7.394s)
✓ tests/integration/query/one_to_many_multiple (2.856s)
✓ tests/integration/query/one_to_many_to_many (1.857s)
✓ tests/integration/query/one_to_many_to_one (3.235s)
✓ tests/integration/query/one_to_one (3.452s)
✓ tests/integration/query/one_to_one_to_one (1.835s)
✓ tests/integration/query/one_to_two_many (2.325s)
✓ tests/integration/query/simple (19.481s)
✓ tests/integration/query/simple/with_filter (10.384s)
✓ tests/integration/schema (8.304s)
✓ tests/integration/schema/aggregates (3.465s)
✓ tests/integration/schema/updates/add (6.571s)
✓ tests/integration/schema/updates/add/field (7.103s)
✓ tests/integration/schema/updates/add/field/crdt (2.352s)
✓ tests/integration/schema/updates/add/field/kind (7.882s)
✓ tests/integration/schema/updates/copy (2.16s)
✓ tests/integration/schema/updates/copy/field (2.281s)
✓ tests/integration/schema/updates/move (1.913s)
✓ tests/integration/schema/updates/move/field (1.741s)
✓ tests/integration/schema/updates/remove (2.7s)
✓ tests/integration/schema/updates/remove/fields (2.655s)
✓ tests/integration/schema/updates/replace (2.043s)
✓ tests/integration/schema/updates/replace/field (1.916s)
✓ tests/integration/schema/updates/test (2.457s)
✓ tests/integration/schema/updates/test/field (2.1s)
✓ tests/integration/subscription (7.46s)
✓ version (46ms)

DONE 1366 tests, 75 skipped in 304.342s

real 5m4.353s
user 4m24.916s
sys 0m48.016s

@AndrewSisley
Contributor Author

AndrewSisley commented May 1, 2023

With all changes up until the main setup changes (i.e. similar to current develop)

env DEFRA_DETECT_DATABASE_CHANGES=true gotestsum -- ./... -shuffle=on -p 1
✓ api/http (1.202s)
✓ cli (100ms)
✓ client (8ms)
∅ client/request
∅ cmd
∅ cmd/defradb
∅ cmd/genclidocs
∅ cmd/genmanpages
✓ config (110ms)
✓ connor (10ms)
∅ connor/numbers
∅ connor/time
✓ core (10ms)
✓ core/crdt (15ms)
∅ core/net
✓ datastore (62ms)
✓ datastore/badger/v3 (4.496s)
∅ datastore/iterable
✓ datastore/memory (41ms)
✓ db (215ms)
∅ db/base
∅ db/container
∅ db/fetcher
✓ errors (3ms)
✓ events (1.008s)
✓ logging (2.943s)
∅ merkle
✓ merkle/clock (20ms)
✓ merkle/crdt (18ms)
✓ metric (2ms)
∅ net
∅ net/api
∅ net/api/client
∅ net/api/pb
∅ net/pb
∅ net/utils
✓ node (102ms)
∅ planner
∅ planner/mapper
∅ request
∅ request/graphql
∅ request/graphql/parser
✓ request/graphql/schema (20ms)
∅ request/graphql/schema/types
∅ tests/bench
∅ tests/bench/collection (670ms)
∅ tests/bench/fixtures
∅ tests/bench/query/planner (636ms)
∅ tests/bench/query/simple (666ms)
∅ tests/bench/storage (656ms)
∅ tests/integration
✓ tests/integration/cli (36.852s)
∅ tests/integration/collection
✓ tests/integration/collection/create/one_to_many (663ms)
✓ tests/integration/collection/create/one_to_one (676ms)
✓ tests/integration/collection/update/one_to_one (655ms)
✓ tests/integration/collection/update/simple (1.014s)
∅ tests/integration/events
✓ tests/integration/events/simple (1.155s)
∅ tests/integration/explain
✓ tests/integration/explain/default (1m16.884s)
✓ tests/integration/explain/execute (35.442s)
✓ tests/integration/explain/simple (1.784s)
∅ tests/integration/mutation/inline_array
✓ tests/integration/mutation/inline_array/update (10.287s)
✓ tests/integration/mutation/one_to_many/delete (1.785s)
∅ tests/integration/mutation/one_to_one
✓ tests/integration/mutation/one_to_one/create (4.144s)
✓ tests/integration/mutation/one_to_one/update (4.117s)
∅ tests/integration/mutation/relation
✓ tests/integration/mutation/relation/create (2.98s)
✓ tests/integration/mutation/relation/delete (8.847s)
∅ tests/integration/mutation/simple
✓ tests/integration/mutation/simple/create (6.507s)
✓ tests/integration/mutation/simple/delete (20.513s)
✓ tests/integration/mutation/simple/mix (6.46s)
✓ tests/integration/mutation/simple/special (1.783s)
✓ tests/integration/mutation/simple/update (9.217s)
✓ tests/integration/mutation/simple/update/special (1.822s)
✓ tests/integration/net/order (768ms)
✓ tests/integration/net/state/one_to_many/peer (645ms)
✓ tests/integration/net/state/one_to_many/replicator (647ms)
✓ tests/integration/net/state/simple/peer (658ms)
✓ tests/integration/net/state/simple/peer/subscribe (830ms)
✓ tests/integration/net/state/simple/peer_replicator (606ms)
✓ tests/integration/net/state/simple/replicator (679ms)
✓ tests/integration/query/commits (55.943s)
✓ tests/integration/query/inline_array (1m14.812s)
✓ tests/integration/query/latest_commits (11.998s)
✓ tests/integration/query/one_to_many (59.611s)
✓ tests/integration/query/one_to_many_multiple (12.329s)
✓ tests/integration/query/one_to_many_to_many (1.8s)
✓ tests/integration/query/one_to_many_to_one (15.899s)
✓ tests/integration/query/one_to_one (19.17s)
✓ tests/integration/query/one_to_one_to_one (1.775s)
✓ tests/integration/query/one_to_two_many (4.069s)
✓ tests/integration/query/simple (2m49.395s)
✓ tests/integration/query/simple/with_filter (1m33.347s)
✓ tests/integration/schema (36.888s)
✓ tests/integration/schema/aggregates (20.264s)
✓ tests/integration/schema/updates/add (7.292s)
✓ tests/integration/schema/updates/add/field (22.725s)
✓ tests/integration/schema/updates/add/field/crdt (7.613s)
✓ tests/integration/schema/updates/add/field/kind (1m2.844s)
✓ tests/integration/schema/updates/copy (1.907s)
✓ tests/integration/schema/updates/copy/field (6.514s)
✓ tests/integration/schema/updates/move (1.846s)
✓ tests/integration/schema/updates/move/field (1.769s)
✓ tests/integration/schema/updates/remove (6.784s)
✓ tests/integration/schema/updates/remove/fields (10.913s)
✓ tests/integration/schema/updates/replace (3.045s)
✓ tests/integration/schema/updates/replace/field (2.898s)
✓ tests/integration/schema/updates/test (6.682s)
✓ tests/integration/schema/updates/test/field (5.223s)
✓ tests/integration/subscription (7.583s)
✓ version (47ms)

DONE 1366 tests, 79 skipped in 1037.195s

real 17m17.207s
user 37m28.893s
sys 5m35.484s

@AndrewSisley AndrewSisley changed the title tests: Improve change detector performance test: Improve change detector performance May 1, 2023
@codecov

codecov bot commented May 1, 2023

Codecov Report

Merging #1433 (7e7b00a) into develop (483a23a) will increase coverage by 0.07%.
The diff coverage is n/a.

❗ Current head 7e7b00a differs from pull request most recent head 06d445c. Consider uploading reports for the commit 06d445c to get more accurate results

@@             Coverage Diff             @@
##           develop    #1433      +/-   ##
===========================================
+ Coverage    72.14%   72.21%   +0.07%     
===========================================
  Files          185      185              
  Lines        18160    18160              
===========================================
+ Hits         13101    13115      +14     
+ Misses        4024     4012      -12     
+ Partials      1035     1033       -2     

see 7 files with indirect coverage changes

Member

@shahzadlone shahzadlone left a comment

Mostly questions and a nitpick comment. All are non-blockers, but it would be nice to get replies pre-merge.

thought/question: The performance is good! How much of that do you think was due to just not running explain tests? I am guessing running the explain tests wouldn't really slow it down by too much.

question: When explain tests are worked into the new framework, would change detection just work, or would there be some edge cases to handle?

question: Did you get a chance to check it works and actually also catches the breaking change? And once fixed the CI then passes?

@@ -64,10 +70,21 @@ func detectDbChangesInit(repository string, targetBranch string) {
return
}

tempDir := os.TempDir()
defraTempDir := path.Join(os.TempDir(), "defradb")
Member

nitpick:

Suggested change
defraTempDir := path.Join(os.TempDir(), "defradb")
defradbTempDir := path.Join(os.TempDir(), "defradb")

Collaborator

@fredcarle fredcarle left a comment

This is a pretty cool change Andy and a great improvement on run time.

However, I'm not a fan of merging something that will continuously pollute my temporary directory. I feel like it's a small enough PR that it would be worth handling the deletion of post-test files.

dbsDir := path.Join(changeDetectorTempDir, "dbs", fmt.Sprint(randNumber))

testPackagePath, isIntegrationTest := getTestPackagePath()
if !isIntegrationTest {
Collaborator

question: When would that not be an integration test?

Contributor Author

The bench suite uses this, and there might be other references too.

@AndrewSisley
Contributor Author

However, I'm not a fan of merging something that will continuously pollute my temporary directory. I feel like it's a small enough PR that it would be worth handling the deletion of post-test files.

You are worried about 44MB per local run of the change detector, with all the pollution contained within a single directory with an obvious name, that gets auto-deleted on machine restart?

The delete stuff can be done immediately after if you are really concerned about it. I do not wish to think about it now.

@shahzadlone
Member

@fredcarle brings up a good point, and we discussed this over Discord as well. It would be nice to have the temporary directory cleaned up in this PR (not a hard blocker for me if it is planned to be done soon after).

@fredcarle
Collaborator

You are worried about 44MB per local run of the change detector, with all the pollution contained within a single directory with an obvious name, that gets auto-deleted on machine restart?

The delete stuff can be done immediately after if you are really concerned about it. I do not wish to think about it now.

I'm only giving my opinion. If you're doing it soon after, why not just do it in this PR?

I didn't request a change and you already have an approval so you can do what you think is best.

@AndrewSisley
Contributor Author

thought/question: The performance is good! How much of that do you think was due to just not running explain tests? I am guessing running the explain tests wouldn't really slow it down by too much.

None in the posted times, they were run after Explain tests had been skipped, not before. I have not bothered to check the runtime difference against develop before they were skipped.

question: When explain tests are worked into the new framework, would change detection just work, or would there be some edge cases to handle?

They should just work :)

question: Did you get a chance to check it works and actually also catches the breaking change?

Yes :)

And once fixed the CI then passes?

I don't understand this question.

@AndrewSisley
Contributor Author

If you're doing it soon after, why not just do it in this PR?

I didn't say I'd do that, but it can be done if we feel that way.

defradb is better/more consistent with other stuff
'code/hash' feels more readable than dumping a bunch of hashes directly into the parent with no explanation
Will be used in init soon too
Plan is to move explain to the new test framework. They also have very limited reason to be covered by the change detector atm, as they do not really get affected by persisted data changes (minus the obvious that would be picked up anyway). And the func call removed is going to be reworked/split, and I do not wish to put any serious effort into maintaining this code path (for the reasons stated).
@AndrewSisley AndrewSisley merged commit 68a8cb6 into sourcenetwork:develop May 1, 2023
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
Development

Successfully merging this pull request may close these issues.

Sort the pollution under /tmp Change detector setup stage performance significantly degraded