-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Improve change detector performance #1433
test: Improve change detector performance #1433
Conversation
With changes: ✓ api/http (1.211s) DONE 1366 tests, 75 skipped in 304.342s real 5m4.353s |
Will all changes up until the main setup changes (i.e. similar to current develop) env DEFRA_DETECT_DATABASE_CHANGES=true gotestsum -- ./... -shuffle=on -p 1 DONE 1366 tests, 79 skipped in 1037.195s real 17m17.207s |
a8535f9
to
e157369
Compare
Codecov Report
@@ Coverage Diff @@
## develop #1433 +/- ##
===========================================
+ Coverage 72.14% 72.21% +0.07%
===========================================
Files 185 185
Lines 18160 18160
===========================================
+ Hits 13101 13115 +14
+ Misses 4024 4012 -12
+ Partials 1035 1033 -2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly questions and a nitpick comment, but are all non-blockers but would be nice to get replies to pre-merge.
thought/question: The performance is good! How much of that do you think was due to just not running explain tests? I am guessing running the explain tests wouldn't really slow it down by too much.
question: When explain tests are worked into the new framework, would change detection just work or would there need to be some edge cases needed to handle?
question: Did you get a change to check it works and actually also catches the breaking change? And once fixed the CI then passes?
@@ -64,10 +70,21 @@ func detectDbChangesInit(repository string, targetBranch string) { | |||
return | |||
} | |||
|
|||
tempDir := os.TempDir() | |||
defraTempDir := path.Join(os.TempDir(), "defradb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick:
defraTempDir := path.Join(os.TempDir(), "defradb") | |
defradbTempDir := path.Join(os.TempDir(), "defradb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty cool change Andy and a great improvement on run time.
However, I'm not a fan of merging something that will continuously pollute my temporary directory. I feel like it's a small enough PR that it would be worth handling the deletion of post-test files.
dbsDir := path.Join(changeDetectorTempDir, "dbs", fmt.Sprint(randNumber)) | ||
|
||
testPackagePath, isIntegrationTest := getTestPackagePath() | ||
if !isIntegrationTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: When would that not be an integration test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bench suite uses this, and there might be other references too.
You are worried about 44MB per local run of the change detector, with all the pollution contained within a single directory with an obvious name, that gets auto-deleted on machine restart? The delete stuff can be done immediately after if you are really concerned about it. I do not wish to think about it now. |
@fredcarle brings a good point, and we discussed this over discord as well, would be nice to have temporary directory cleaned up in this PR (is not a hard blocker for me, if it is planned to do it soon after). |
I'm only giving my opinion. If you're doing it soon after, why not just do it in this PR? I didn't request a change and you already have an approval so you can do what you think is best. |
None in the posted times, they were run after Explain tests had been skipped, not before. I have not bothered to check the runtime difference against develop before they were skipped.
They should just work :)
Yes :)
I dont understand this question |
I didn't say I'd do that, but it can be done so if we feel that way. |
7e7b00a
to
a327c0d
Compare
defradb is better/more consistent with other stuff
'code/hash' feels more readable than dumping a bunch of hashes directly into the parent with no explanation
Will be used in init soon too
Plan is to move explain to the new test framework. They also have very limited reason to be covered by the change detector atm as they do really get affected by persisted data changes (minus the obvious that would be picked up anyway. And The func call removed is going to be reworked/split and I do not wish to put any serious effort in to maintaining this code path (for the reasons stated).
Instead of once per test
a327c0d
to
06d445c
Compare
## Relevant issue(s) Resolves sourcenetwork#1186 Partially resolves sourcenetwork#1342 ## Description Improves the change detector performance. Decreases runtime from 17 minutes to 5 minutes on my local machine. It also cleans up the paths used by the change detector (part of sourcenetwork#1342). It does this by changing how the setup-stage is handled, instead of calling `go test` once per test, it now only calls it once per test package - setting up a bunch of test db instances for the entire package.
Relevant issue(s)
Resolves #1186
Partially resolves #1342
Description
Improves the change detector performance. Decreases runtime from 17 minutes to 5 minutes on my local machine. It also cleans up the paths used by the change detector (part of #1342).
It does this by changing how the setup-stage is handled, instead of calling
go test
once per test, it now only calls it once per test package - setting up a bunch of test db instances for the entire package.This does mean that they can no longer rely on the use of
test.TempDir
, and instead the database instances live within the temp directory. This PR does not provide a mechanic to clean those up, but it should be quite easy to do so later. The total disk usage by a full test run is on 44MB (post run, during a run it is 2GB+44MB), so this is unlikely to be an issue.I believe that more gains can be for not too much effort by allowing the change detector to run in parallel (package level, not each test). I believe this is currently this is only blocked by the git cloning of the latest target branch, and there are easy ways to remove that limitation (lower priority though): #1436.
Commits should be clean, most of the work is in the last commit
Do change-detector setup once per package
.Todo immediately after merge: