Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: skip test-worker-prof in debug builds #32596

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

HarshithaKP
Copy link
Member

Refs: #26401 (comment)

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label Apr 1, 2020
@richardlau
Copy link
Member

I prefer the solution in #32589 where the test is marked SLOW. The containered debug build in the CI skips over those.

@HarshithaKP
Copy link
Member Author

@richardlau, can you tell the reason why you prefer that solution?

@richardlau
Copy link
Member

Because with that solution the decision as to whether to run the test or not is up to the actor running the test (e.g. either the user directly runs the test or runs the test through the test runner without --flaky-tests=skip (for debug builds --flaky-tests=skip will also skip tests marked as SLOW). If you wanted to run the test with a debug build with this PR you would need to edit the test.

@HarshithaKP
Copy link
Member Author

the decision as to whether to run the test or not is up to the actor running the test

The knowledge about whether this test is runnable in debug build or not, is known only to the test. Keeping such a test in the status file can confuse people - it might lead to a doubt that this is an issue which needs fixing. In this case, the proposed permanent solution is to skip it for debug builds. As per the investigation description, the test runs between tight tradeoffs, and the debug built is not fit to balance those.

@richardlau
Copy link
Member

the decision as to whether to run the test or not is up to the actor running the test

The knowledge about whether this test is runnable in debug build or not, is known only to the test.

I respectfully disagree. #26401 (comment) suggests plausible reasons why the test is unreliable in debug mode on the CI but not that the test doesn't work at all. For example it passed in https://ci.nodejs.org/job/node-test-commit-linux-containered/19131/nodes=ubuntu1804_sharedlibs_debug_x64/testReport/junit/(root)/test/sequential_test_worker_prof/. So whether the test works in debug mode or not likely depends on a number of factors (it may for example pass more frequently in a different configuration of CPU/memory). On the CI we have a fixed configuration that the debug builds are run under (the shared debug containers) but developers are likely to have other configurations so we should not make it harder for them to run a test that could work on their system should they wish to do so.

@HarshithaKP
Copy link
Member Author

@richardlau, so keeping the test as SLOW in test/root.status: when this status will change, and what will cause entry of this test to be removed from root.status?

@richardlau
Copy link
Member

@richardlau, so keeping the test as SLOW in test/root.status: when this status will change, and what will cause entry of this test to be removed from root.status?

The status would only change in the status file if the test was either removed completely from source control (e.g. it becomes obsolete) or if it was changed substantially to allow it to pass consistently on debug builds. It might never change from SLOW status, but that's okay. Being in the status file allows the actor (i.e. CI or user) to decide whether or not to run tests marked SLOW by setting the --flaky-tests command line option to the test runner without having to modify the test itself.

@HarshithaKP
Copy link
Member Author

@richardlau , does this mean that no tests should be skipped at the source, but controlled at the command line, by the actor? how do you classify tests that are skipped in source versus marked in the status file?

It might never change from SLOW status, but that’s okay

wont it make the test less reliable, forever?

@richardlau
Copy link
Member

richardlau commented Apr 7, 2020

@richardlau , does this mean that no tests should be skipped at the source, but controlled at the command line, by the actor? how do you classify tests that are skipped in source versus marked in the status file?

Tests skipped at source (via common.skip) would be:

  • if Node.js was compiled with different configuration options (e.g. without crypto)
  • if the test makes no sense to run on a particular platform (e.g. signal handling tests on Windows as Windows does not natively support signals)

Tests marked in the status file are tests that are flaky, i.e. they might pass but sometimes fail. In this case putting them in the status file means they can be run should the actor wish to do so without modifying any source.

It might never change from SLOW status, but that’s okay

wont it make the test less reliable, forever?

I don't follow. The test is already unreliable so adding it to the status file does not make it less reliable.

@HarshithaKP
Copy link
Member Author

@richardlau ,

If you wanted to run the test with a debug build with this PR you would need to edit the test.
Being in the status file allows the actor (i.e. CI or user) to decide whether or not to run tests marked SLOW by setting the --flaky-tests command line option to the test runner without having to modify the test itself.
In this case putting them in the status file means they can be run should the actor wish to do so without modifying any source.
The test is already unreliable so adding it to the status file does not make it less reliable.

Based on what you say, how will you explain these APIs, that used to skip tests at source level?
skipIf32Bits, skipIfEslintMissing, skipIfInspectorDisabled, skipIfWorker
Why the actor is not given an opportunity to take a decision in these cases, as opposed to the test?
grep SLOW root.status | wc -l gives me 100+ tests that are slow. Are these tests flakes, and can be / will be fixed?
why can't we have a skipIfDebug for tests that we know as time sensitive, and not practical to fix?
At least for this test test-worker-prof, I think it is not practical to run in debug mode. Here is the time difference (in linux):

$ time node test/sequential/test-worker-prof.js
real	0m1.750s
user	0m0.826s
sys	0m1.103s
$ time node_g test/sequential/test-worker-prof.js
real	0m25.137s
user	0m7.538s
sys	0m13.420s

this means, the chance of this test getting run faster enough to meet the test runner's expectation while meeting the test's own constraints (volume / worklaod nature) looks impossible to me.

@richardlau
Copy link
Member

@richardlau ,

If you wanted to run the test with a debug build with this PR you would need to edit the test.
Being in the status file allows the actor (i.e. CI or user) to decide whether or not to run tests marked SLOW by setting the --flaky-tests command line option to the test runner without having to modify the test itself.
In this case putting them in the status file means they can be run should the actor wish to do so without modifying any source.
The test is already unreliable so adding it to the status file does not make it less reliable.

Based on what you say, how will you explain these APIs, that used to skip tests at source level?
skipIf32Bits, skipIfEslintMissing, skipIfInspectorDisabled, skipIfWorker
Why the actor is not given an opportunity to take a decision in these cases, as opposed to the test?
grep SLOW root.status | wc -l gives me 100+ tests that are slow. Are these tests flakes, and can be / will be fixed?
why can't we have a skipIfDebug for tests that we know as time sensitive, and not practical to fix?
At least for this test test-worker-prof, I think it is not practical to run in debug mode. Here is the time difference (in linux):

For those API's there's no chance of the test passing if the condition being tested fails (e.g. if the inspector is disabled the inspector tests will always fail regardless of how fast/slow your system is). Also apart from skipIf32Bits the other conditions cannot be evaluated in the status files.

As I said earlier the difference with this test is that it might pass in debug mode, which is not the same as always failing.

$ time node test/sequential/test-worker-prof.js
real	0m1.750s
user	0m0.826s
sys	0m1.103s
$ time node_g test/sequential/test-worker-prof.js
real	0m25.137s
user	0m7.538s
sys	0m13.420s

this means, the chance of this test getting run faster enough to meet the test runner's expectation while meeting the test's own constraints (volume / worklaod nature) looks impossible to me.

Um, you've just demonstrated that the test passed for you in debug mode (with node_g) which is why I've been arguing that the test is flaky in debug mode and not always failing. Here's a daily-master CI run from 10 days ago (before #32589 landed) which also shows the test passing in debug mode: https://ci.nodejs.org/job/node-test-commit-linux-containered/19154/nodes=ubuntu1804_sharedlibs_debug_x64/testReport/(root)/test/sequential_test_worker_prof/

@HarshithaKP
Copy link
Member Author

Um, you’ve just demonstrated that the test passed for you in debug mode (with node_g) which is why I’ve been arguing that the test is flaky in debug mode and not always failing.

The test passed for me because I ran it directly (no python harness, so no timeout). If you allow it to run that way, it will always pass.

@richardlau
Copy link
Member

It still passed in the CI link I posted. It also passes for me on a very fast Linux system with the Python test harness in debug mode.

@BridgeAR
Copy link
Member

What's the status here?

@HarshithaKP
Copy link
Member Author

I still think disabling in the source is the right way, due to:
i) The logic used in the program has too many tradeoffs that makes it difficult to tune for debug builds
ii) Too many tests in root.status makes it practically non-actionable

@@ -1,5 +1,9 @@
'use strict';
const common = require('../common');

if (process.features.debug)
common.skip('cannot reliably test this in debug builds');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚙️

Suggested change
common.skip('cannot reliably test this in debug builds');
common.skip('In debug builds, it is not possible to conduct a reliable test.');

Copy link

@Mifrill Mifrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the skip is a possible way to go, it seems like this PR is outdated and can be closed since this test is already marked as slow, see: #32589 (comment)

📝

sequential/test-worker-prof: SLOW

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants