Test count may not be as useful as it could be #43344

willm · 2022-06-08T04:52:42Z

Version

v18.3.0

Platform

Darwin willmunn-2 20.6.0 Darwin Kernel Version 20.6.0: Tue Feb 22 21:10:41 PST 2022; root:xnu-7195.141.26~1/RELEASE_X86_64 x86_64

Subsystem

test_runner

What steps will reproduce the bug?

I wanted to try the new test runner, being a long term tape user, this was pretty exciting as the apis are very similar. It seems the test summary at the end of the tap output is recording test counts as the number of test files. I had a go at a simple fizz buzz implementation:

import test from 'node:test';
import assert from 'assert';

import test from 'node:test';
import assert from 'assert';

const fizzbuzz = (num) => {
  const special = [
    {condition: num % 3 === 0, output: 'fizz'},
    {condition: num % 5 === 0, output: 'buzz'},
  ];
  const specialAnswer = special.reduce((output, x) => x.condition ? output + x.output : output, '');
  return specialAnswer ? specialAnswer : num;
};

test('1 should return 1', t => {
  assert.equal(fizzbuzz(1), 1);
});

test('3 should return fizz', t => {
  assert.equal(fizzbuzz(3), 'fizz');
});

test('5 should return buzz', t => {
  assert.equal(fizzbuzz(5), 'buzz');
});

test('15 should return fizzbuzz', t => {
  assert.equal(fizzbuzz(5), 'buzz');
});

test('16 should return 16', t => {
  assert.equal(fizzbuzz(16), 16);
});

node --test returns the following output:

TAP version 13
ok 1 - /Users/will.munn/code/experiments/node18/tests/test.js
  ---
  duration_ms: 0.07915145
  ...
1..1
# tests 1
# pass 1
# fail 0
# skipped 0
# todo 0
# duration_ms 0.126770722

note that the the output suggests that only 1 test has been run. After adding another test file, I noticed that the output is actually counting the number of files, not the number of test() blocks or the amount of assertions.

If I add a test.skip to one of the tests, it will still report that there are 0 skipped tests.

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior?

I personally think the best solution would be that The TAP output reports the number of test() blocks rather that the number of test files. So for my example:

TAP version 13
ok 1 - /Users/will.munn/code/experiments/node18/tests/test.js
  ---
  duration_ms: 0.07915145
  ...
1..5
# tests 5
# pass 5
# fail 0
# skipped 0
# todo 0
# duration_ms

What do you see instead?

TAP version 13
ok 1 - /Users/will.munn/code/experiments/node18/tests/test.js
  ---
  duration_ms: 0.07915145
  ...
1..1
# tests 1
# pass 1
# fail 0
# skipped 0
# todo 0
# duration_ms 0.126770722

Additional information

Note that my suggestion is actually different to what the tape module does, this reports counts based on the number of assertions. I personally feel that number of test blocks makes the most sense.

The text was updated successfully, but these errors were encountered:

tniessen · 2022-06-08T23:30:16Z

cc @nodejs/test_runner @cjihrig

cjihrig · 2022-06-10T00:56:14Z

This could be improved by parsing the standard output of the child processes that run each test file. There is a TODO in the code regarding implementing a TAP parser. IMO that is the best approach, but also requires the most work. If someone really wanted to, they could implement more light weight parsing that, for example, only parses the ending summary lines of each child process.

manekinekko · 2022-06-10T16:48:57Z

This could be improved by parsing the standard output of the child processes that run each test file. There is a TODO in the code regarding implementing a TAP parser.

@cjihrig happy to work on this. Could you point me to the TODO comment?

aduh95 · 2022-06-10T17:20:58Z

@manekinekko

node/lib/internal/main/test_runner.js

Lines 108 to 109 in adaf602

    
           // TODO(cjihrig): Implement a TAP parser to read the child's stdout 
        
           // instead of just displaying it all if the child fails.

manekinekko · 2022-06-10T18:44:20Z

Thank you @aduh95. I'm gonna work on this 👍

MoLow · 2022-07-05T05:58:17Z

I took a little look at this issue, and I was wondering if this can be approached by adding support for additional reporters other than the current TapStream,
this will allow adding a plain JSON reporter that is already very easy to serialize/deserialize
possibly another reporter can be using worker_threads postMessage to share data between the parent test and its children

this approach can also enable passing custom reporters that are implemented in userland as an option to the root test runner
(assuming we define a common interface for a reporter)
WDYT?

manekinekko · 2022-07-12T13:07:29Z

I agree with @MoLow!

this approach can also enable passing custom reporters that are implemented in userland as an option to the root test runner
(assuming we define a common interface for a reporter)

I am in favor of this as well.

MoLow · 2022-07-12T15:54:57Z

@cjihrig I'd love to hear your feedback regarding the reporter's approach.
@manekinekko would you like to implement some kind of parentPort.postMessage reporter?

cjihrig · 2022-07-12T15:57:53Z

I took a little look at this issue, and I was wondering if this can be approached by adding support for additional reporters other than the current TapStream

@cjihrig I'd love to hear your feedback regarding the reporter's approach.

In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).

manekinekko · 2022-07-12T16:03:09Z

would you like to implement some kind of parentPort.postMessage reporter?

@MoLow I am happy to give it a shot. Could you provide a high-level design so I have a bigger picture of the different pieces?

In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).

This would totally be doable once the TAP parser is done. The current AST has all the info to run it through a codegen and spit out virtually any format.

MoLow · 2022-07-12T16:46:03Z

In my head I always pictured the test runner outputting TAP via a stream and other reporters being implemented as transform streams (this part kind of depends on having the TAP parser in place).

I totally understand why implementing reporters as a transform stream makes sense,

my two arguments don't necessarily contradict that:

JSON is much more natural in javascript environments than TAP and can already be de/serealized out of the box, so I question TAP being the default output for a test?
I assume that it can be very hard to write a TAP parser that beats JSON.parse, both in terms of performance and of usability.
IMHO it makes more sense that the base stream outputs JSON objects and then use a transform stream to output TAP (in case one wants tap output and not something else)
the implementation of --test might leverage the ability to share memory between workers - which might be a little more efficient than serializing -> parsing -> serializing again
I am not sure about this one as multiple child processes can leverage multiple CPUs, and this wont allow that

ljharb · 2022-07-12T16:51:16Z

Test runners in the ecosystem don't output JSON by default - ever, afaik. A number of them output TAP by default, though. TAP is the best choice.

cjihrig · 2022-07-12T17:04:17Z

I assume that it can be very hard to write a TAP parser that beats JSON.parse

I also think that we need to think in terms of streaming output, which JSON.parse() would not handle. Streaming JSON parsing is possible, but I still think TAP is the right choice for the default output format.

MoLow · 2022-07-12T17:07:42Z

ok, I've got your point :)

kmannislands · 2022-11-10T01:24:24Z

I've been experimenting with the new node:test API and I had the same thoughts as @MoLow re: JSON.

Going with TAP as the native format seems to introduce unnecessary friction for consuming tools and it sounds like internally within node:test as well.

In order to make sense of the test results of a file's test run, you need a TAP parser. In order to implement a proper TAP parser, you need a YAML parser. And parsing YAML correctly is very non trivial.

In implementing a TAP parser myself, I've shied away from full YAML parsing and decided to implement a parser for the subset of yaml that jsToYaml can actually produce at the moment. But I can't help but feel like something is off here. YAMLs a complex format that's rather foreign in the node/js context. It seems node core would need a full yaml parser/encoder sub-lib in core to make this futureproof and that seems like quite a lift.

I also think that we need to think in terms of streaming output, which JSON.parse() would not handle. Streaming JSON parsing is possible, but I still think TAP is the right choice for the default output format.

Sure streaming JSON, but is that really the problem space here? I would imagine ndjson could be used with newlines delimiting results.

For a stream emitting something like:

{ testPlan: { start: 1, through: 2 } }
{ testNumber: 1, test: 'ok',  durationMs: 50.12312 }
{ testNumber: 2, test: 'not ok', durationMs: 50.12312, diagnostic: 'TODO # reason',  failureType: 'testCodeFailure' }

could be parsed pretty trivially with built-in functions (no 10k+ LOC yaml parsing lib dependency) as a stream:

const testNdJSONStream = run();

export async function* readableStreamLines(
    readableStream: NodeJS.ReadableStream
): AsyncGenerator<string> {
    for await (const tapBytes of readableStream) {
        if (!Buffer.isBuffer(tapBytes)) {
            throw new Error(
                `Didn't receive of bytes from TAPStream as expected`
            );
        }
        yield tapBytes.toString('utf-8');
    }
}

for await (const resultLine of readableStreamLines(testNdJSONStream)) {
  const testResult = JSON.parse(resultLine)
}

kmannislands · 2022-11-10T01:52:17Z

Another idea I had when considering all of this, node could potentially respect the TAP 13/14 spec while side-stepping the YAML parsing problem.

TAP 13 spec basically just specifies that the contents between --- and ... are valid YAML and not much else

Currently (2007/03/17) the format of the data structure represented by a YAML block has not been standardized.

What if node:test serializes contents in this block to a subset of flow style yaml instead of the subset of block style YAML that is presently being used?

This subset of flow style could also be valid JSON meaning that node:test would be outputting valid TAP 13 per the spec while still being easily parseable from js without a yaml lib.

Something like:

TAP version 13
# Subtest: /foo/bar/combinators.test.js
not ok 1 - Static HTML JSX Rendering
  ---
  { "duration_ms": 4.15925,  "failureType": 'subtestsFailed', "error": '1 subtest failed', "code": 'ERR_TEST_FAILURE' }
  ...
1..1

This is valid TAP 13, the embedded doc is valid YAML and it is far friendlier to parse from node.js.

Work in progress PR-URL: #43525 Refs: #43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>

MoLow · 2022-11-22T08:32:21Z

completed via #43525

Work in progress PR-URL: nodejs#43525 Refs: nodejs#43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>

Work in progress PR-URL: #43525 Refs: #43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>

Work in progress PR-URL: nodejs#43525 Refs: nodejs#43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>

Work in progress PR-URL: #43525 Refs: #43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>

Work in progress PR-URL: nodejs/node#43525 Refs: nodejs/node#43344 Reviewed-By: Franziska Hinkelmann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Moshe Atlow <[email protected]> (cherry picked from commit f8ce9117b19702487eb600493d941f7876e00e01)

VoltrexKeyva added the test_runner Issues and PRs related to the test runner subsystem. label Jun 8, 2022

manekinekko mentioned this issue Jun 21, 2022

test_runner: add TAP parser #43525

Merged

8 tasks

MoLow mentioned this issue Jul 12, 2022

node:test - when running node --test the output is not a tap-spec valid output #43046

Closed

MoLow mentioned this issue Aug 9, 2022

Testrunner freezes / stops when setting up Express server #44189

Closed

MoLow mentioned this issue Aug 24, 2022

console.log in node:test ? #44372

Closed

MoLow mentioned this issue Nov 15, 2022

test_runner - using node test.mjs and node --test test.mjs print different outputs #45470

Closed

MoLow closed this as completed Nov 22, 2022

kmannislands mentioned this issue Dec 1, 2022

Separate subtest stdout from TAP output #45529

Closed

MoLow mentioned this issue Dec 3, 2022

test_runner: add reporters #45712

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test count may not be as useful as it could be #43344

Test count may not be as useful as it could be #43344

willm commented Jun 8, 2022 •

edited

Loading

tniessen commented Jun 8, 2022

cjihrig commented Jun 10, 2022

manekinekko commented Jun 10, 2022

aduh95 commented Jun 10, 2022

manekinekko commented Jun 10, 2022 •

edited

Loading

MoLow commented Jul 5, 2022

manekinekko commented Jul 12, 2022

MoLow commented Jul 12, 2022

cjihrig commented Jul 12, 2022

manekinekko commented Jul 12, 2022 •

edited

Loading

MoLow commented Jul 12, 2022

ljharb commented Jul 12, 2022

cjihrig commented Jul 12, 2022

MoLow commented Jul 12, 2022

kmannislands commented Nov 10, 2022

kmannislands commented Nov 10, 2022

MoLow commented Nov 22, 2022

Test count may not be as useful as it could be #43344

Test count may not be as useful as it could be #43344

Comments

willm commented Jun 8, 2022 • edited Loading

Version

Platform

Subsystem

What steps will reproduce the bug?

How often does it reproduce? Is there a required condition?

What is the expected behavior?

What do you see instead?

Additional information

tniessen commented Jun 8, 2022

cjihrig commented Jun 10, 2022

manekinekko commented Jun 10, 2022

aduh95 commented Jun 10, 2022

manekinekko commented Jun 10, 2022 • edited Loading

MoLow commented Jul 5, 2022

manekinekko commented Jul 12, 2022

MoLow commented Jul 12, 2022

cjihrig commented Jul 12, 2022

manekinekko commented Jul 12, 2022 • edited Loading

MoLow commented Jul 12, 2022

ljharb commented Jul 12, 2022

cjihrig commented Jul 12, 2022

MoLow commented Jul 12, 2022

kmannislands commented Nov 10, 2022

kmannislands commented Nov 10, 2022

MoLow commented Nov 22, 2022

willm commented Jun 8, 2022 •

edited

Loading

manekinekko commented Jun 10, 2022 •

edited

Loading

manekinekko commented Jul 12, 2022 •

edited

Loading