Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: reduce runtime #20688

Closed
wants to merge 2 commits into from
Closed

tests: reduce runtime #20688

wants to merge 2 commits into from

Conversation

BridgeAR
Copy link
Member

This refactors some tests to reduce the runtime of those.

I looked at the first five tests from #20128 but the other two did not seem to be
possible to improve (CPU bound + a necessary timeout of 1 second).

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label May 12, 2018
@BridgeAR
Copy link
Member Author

@@ -1,21 +1,22 @@
'use strict';

require('../common');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than move the module use to a later line, thus making this test violate our test formatting guidelines (and possibly make it inevitable that someone will just move it back some day soon), it would be better IMO to not call this file with spawnSync at all but instead to move the recursive async call stuff to a fixture. Is there a reason not to do that instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this... the require('../common') should be at the top.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading common is a immense overhead and when looking into that file there are only two reasons for it to always load currently:

  1. On exit it checks if there are "leaked globals". I do not understand the reasoning for these checks. What is the point in checking for these? If anyone adds such a thing it should probably come up in a code review.
  2. If the NODE_TEST_WITH_ASYNC_HOOKS environment variable is set, it will check for some things on exit. The reasoning for this was probably a rough smoke test. I personally believe we should not use it like this and instead just try to add a solid test base that would cover these things already. Do we know if this has ever uncovered some bugs?

Besides that common only exports functions for convenience.

So what I actually want to propose is to remove all common requires if they are not necessary and I would also like to split common into a couple small files that are specific to the individual use case. This will hopefully also improve the test runtime in general.

Copy link
Member

@Trott Trott May 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for leaked globals should remain. It may be missed in code review and people should be notified when they run make test rather than having it come up during code review.

I'd be fine if everything else were moved to individual modules under common but that's a big change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why would we manipulate globals in a test if there is no reason to do so?

Having a general check against leaked globals is absolutely fine but we could achieve that by just requiring all modules in a test and then checking for the known globals.

Copy link
Member

@Trott Trott May 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BridgeAR You make a reasonable case for removing the requirement of always loading common. I'm not sure I'm convinced--for example, if the current global leak detection approach does not work sufficiently well, might not the better approach be to fix it rather than to not require it? I'm asking rhetorically, as I can see arguments both ways on that one.

If no one else objects to it, I won't either.

That all aside: if we want to run some code in all our tests files all the time (no matter what code), should we not just rewrite the way how we call our tests and make sure the tests automatically load that code as well?

That had been suggested in the past and, IIRC, got a big -1 from people (I think @bnoordhuis was one of them) because it introduces more magic. In other words, tests will pass when called with test.py or whatever but fail if run directly with ./node. For me personally, I'm not sure that's really a problem. It might even be an improvement from our current situation if it changes us from "some tests fail if you don't use test.py" to "all tests fail if you don't use test.py". But I don't feel strongly about that.

@nodejs/testing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking more about it, I'm not 100% sure the case that it doesn't work correctly is convincing. So what if it doesn't check non-enumerables? That just means we fail to detect it if someone adds a non-enumerable. First, that can be fixed. But second, in most cases, I'd imagine people accidentally leak enumerables. And those get detected. Or am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's only taken me four minutes, but I think I've returned to thinking that the global leak detection is valuable and should be in all tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it still seems weird to have a halfway broken test that runs with each test file. Yes, it does detect enumerable entries but that should also be possible in almost all cases by just having a single test file that loads all modules. Of course the global could be added somewhere deep down but that is even more unlikely... I also do not know a compelling reason for not manipulating globals in tests. We normally do not do that but if it is done, it would only impact that specific test. For me removing common is more about code hygiene than about runtime.

It feels like this is a legacy test that was implemented because 8 years ago the test suite was not that good and people pushed code directly to the repo instead of having code reviews. All that changed and as such this test became somewhat obsolete out of my perspective.

I do not want to spend much more energy into this but I really think we should start questioning some old decisions way more often.

Copy link
Member

@Trott Trott May 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…I really think we should start questioning some old decisions way more often.

💯 to that. I think that's what we're doing here right now. We're just not (yet, at least) agreeing on what the correct answer to that questioning is. And you raise excellent points. I'm not convinced it outweighs the benefits, but it would be interesting to know what others think. AFACT, it's just you, me, @jasnell, and @TimothyGu that have weighed on in the specific issue. Might be interesting to see what others think. This is a significant enough issue that I can see reconstituting the Testing WG if there are one or two other similar issues like this that need to be hashed out.

@@ -1,17 +1,17 @@
'use strict';
const common = require('../common');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Can we move the child process code to a fixture?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think moving these things to a fixture is the wrong thing to do. We have a lot of fixtures and there is no benefit this in a fixture as far as I see it. It is much easier to handle this test if everything is together. The fixtures folder is full with things of which probably no one has an idea about what is really in there.

@mscdex
Copy link
Contributor

mscdex commented May 12, 2018

I think the commit message prefix should be just test:?

Copy link
Member

@TimothyGu TimothyGu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 on changing the location of require('../').

@BridgeAR
Copy link
Member Author

@TimothyGu would you be so kind and give some reasoning for it? I just fail to understand what the point of always loading it is as I pointed out above.

@Trott
Copy link
Member

Trott commented May 13, 2018

@TimothyGu would you be so kind and give some reasoning for it? I just fail to understand what the point of always loading it is as I pointed out above.

@BridgeAR The purpose of this change seems to be to shave a second or two off the test run. That seems like an insignificant gain compared to the cost of introducing inconsistency in test formatting and practices, especially given that there is another approach that should be as effective (which is to put the code that doesn't need common into fixtures). I agree that fixtures is a mess, but in my opinion, the answer isn't "don't use fixtures". The answer is "organize the fixtures".

@BridgeAR
Copy link
Member Author

@Trott for me it the current discussion is more about the general purpose of always loading common and not about shaving of a second of the test run. This would not only make the tests faster but also make the code cleaner. It just does not seem necessary to do that because the global check never really worked properly. See #20688 (comment) for more about that.

About organizing the fixtures: at least for me the main aspect is that I do not have to check another file if I want to know what the test actually does. I strongly believe it is best to keep that part in the same file.

This refactors some tests to reduce the runtime of those.

Refs: nodejs#20128
@BridgeAR
Copy link
Member Author

Rebased to address the tests comment. I also moved the common part back to the top of the files.

New CI https://ci.nodejs.org/job/node-test-pull-request/14859/

const ret = spawnSync(process.execPath, [__filename, 'async']);
const ret = spawnSync(
process.execPath,
['--stack_size=50', __filename, 'async']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren’t we never supposed to change the stack size?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this lowers the size that V8 assumes to have, it should be fine. This is also the only main reason for this test to run ~3x faster. I personally think it is fine to use here because it would still work even if the flag would be a no-op. @hashseed @bmeurer are you fine with me using this in this case?

assert.strictEqual(ret.status, 0);
assert.ok(!/async.*hook/i.test(ret.stderr.toString('utf8', 0, 1024)));
const stderr = ret.stderr.toString('utf8', 0, 2048);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the increase in end?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the actual output and it was cut off relatively early without printing a lot of information. By increasing that limit it actually makes sure we really test for the right values.

@BridgeAR
Copy link
Member Author

PTAL

Copy link
Member

@hashseed hashseed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with using --stack_size for the purpose of making tests hit the stack limit sooner.

@BridgeAR
Copy link
Member Author

Ping @TimothyGu

Trott
Trott previously requested changes May 19, 2018
Copy link
Member

@Trott Trott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes introduce unreliability into the tests that is not in the current versions. Running these tests with -j 96 --repeat 192 (or other large-ish values) results in errors that do not happen with the current versions.

@Trott
Copy link
Member

Trott commented May 19, 2018

Changing spawnSync() to spawn() may have unintended consequences for test reliability. Especially if this is going to happen in more and more tests, the solution is likely to move such tests to sequential, defeating the purpose of these changes in the first place.

$ tools/test.py -j 40 --repeat 40 test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                     
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:44:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                    
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:35:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                    
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:45:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                    
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:36:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                    
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:41:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
=== release test-buffer-constructor-node-modules-paths ===                    
Path: parallel/test-buffer-constructor-node-modules-paths
/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20
  child.stderr.on('data', (value) => {
               ^

TypeError: Cannot read property 'on' of undefined
    at test (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:20:16)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js:34:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:236:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:568:3)
Command: out/Release/node /Users/trott/io.js/test/parallel/test-buffer-constructor-node-modules-paths.js
[00:15|% 100|+  34|-   6]: Done
$

assert.strictEqual(stderr.trim(), '');
let stderr = '';
child.stderr.on('data', (value) => {
stderr += value.toString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please always use stream.setEncoding() instead of converting each chunk to strings individually.

@addaleax
Copy link
Member

@Trott Do you know why those failures happen? It seems like maybe we should consider that a bug in Node.js?

@Trott
Copy link
Member

Trott commented May 19, 2018

@Trott Do you know why those failures happen? It seems like maybe we should consider that a bug in Node.js?

@addaleax I didn't look closely and I'm about to step away from the computer, but I'm guessing that the change from spawnSync() to spawn() resulted in too many processes trying to run so the spawn failed and child.stderr doesn't exist. Maybe the new code is not listening for an error event that it should be listening for or something like that.

@Trott
Copy link
Member

Trott commented May 19, 2018

@addaleax Yeah, looking at the change for an additional 30 seconds (which I probably should have done before typing up my previous response): This changes the test from launching 13 processes in series to launching 13 processes in parallel. IMO, tests like that should not go in parallel. If we want to do it like this, the test should be moved to sequential, but that probably negates most or all of the runtime reduction, which is the main motivation here judging from the title of the PR ("reduce runtime").

Copy link
Member

@TimothyGu TimothyGu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM when @Trott's concerns get addressed.

@BridgeAR
Copy link
Member Author

@Trott thanks for pointing this out. Even though I was able to get the test fully reliable running async as well, it did indeed increase the code complexity and it did not result in a significant runtime reduction. Therefore I fully reverted that test change to run sync again.

@BridgeAR
Copy link
Member Author

I won't start a new CI as I only reverted it to the old style.

@Trott Trott dismissed their stale review May 21, 2018 03:33

change request addressed

@BridgeAR BridgeAR added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label May 21, 2018
BridgeAR added a commit to BridgeAR/node that referenced this pull request May 21, 2018
This refactors some tests to reduce the runtime of those.

PR-URL: nodejs#20688
Refs: nodejs#20128
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
Reviewed-By: Yang Guo <[email protected]>
@BridgeAR
Copy link
Member Author

Thanks.

Landed in 352ae23

@BridgeAR BridgeAR closed this May 21, 2018
MylesBorins pushed a commit that referenced this pull request May 22, 2018
This refactors some tests to reduce the runtime of those.

PR-URL: #20688
Refs: #20128
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
Reviewed-By: Yang Guo <[email protected]>
@addaleax addaleax mentioned this pull request May 22, 2018
@MylesBorins
Copy link
Contributor

MylesBorins commented May 23, 2018

This was causing failures on AIX and had to be backed out of the 10.2.0 release. Worth exploring

edit: CI run https://ci.nodejs.org/job/node-test-commit-aix/15245/

@Trott
Copy link
Member

Trott commented May 24, 2018

This was causing failures on AIX and had to be backed out of the 10.2.0 release. Worth exploring

edit: CI run https://ci.nodejs.org/job/node-test-commit-aix/15245/

It also failed here:
https://ci.nodejs.org/job/node-test-commit-aix/15241/nodes=aix61-ppc64/console

@Trott
Copy link
Member

Trott commented May 24, 2018

Stress test: https://ci.nodejs.org/job/node-stress-single-test/1866/

EDIT: So far the stress test isn't showing anything, so let's try it with some actual parallelism:
https://ci.nodejs.org/job/node-stress-single-test/1867/

@Trott
Copy link
Member

Trott commented May 24, 2018

Interestingly, this change makes the test take much longer on AIX than the previous version. Prior to this change, parallel/test-async-wrap-pop-id-during-load took less than 2 seconds to run on AIX. With this change, it takes nearly 5 seconds.

@Trott Trott mentioned this pull request May 24, 2018
1 task
@gireeshpunathil
Copy link
Member

I had a look at the AIX failure. The child spawned with --stack_size=50 was throwing RangeError: Maximum call stack size exceeded even before entering the async-await sequence, so obviously it was catpured in

assert.strictEqual(ret.status, 0);

and asserted.

When I calibrated the stack size, I found 100 is a decent value to enter the test logic. However, it is purely empirical, and may not be reliable across systems and invocation sequences.

@targos targos removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label May 25, 2018
MylesBorins pushed a commit that referenced this pull request May 29, 2018
This refactors some tests to reduce the runtime of those.

PR-URL: #20688
Refs: #20128
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
Reviewed-By: Yang Guo <[email protected]>
@MylesBorins MylesBorins mentioned this pull request May 29, 2018
@BridgeAR BridgeAR deleted the improve-tests branch January 20, 2020 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.