pummel/test-crypto-timing-safe-equal-benchmarks failing #34073

richardlau · 2020-06-26T14:01:10Z

Test: pummel/test-crypto-timing-safe-equal-benchmarks
Platform: Linux
Console Output:

08:05:52 not ok 66 pummel/test-crypto-timing-safe-equal-benchmarks
08:05:52   ---
08:05:52   duration_ms: 6.629
08:05:52   severity: fail
08:05:52   exitcode: 1
08:05:52   stack: |-
08:05:52     assert.js:385
08:05:52         throw err;
08:05:52         ^
08:05:52     
08:05:52     AssertionError [ERR_ASSERTION]: timingSafeEqual should not leak information from its execution time (t=7.814360810767208)
08:05:52         at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-custom-suites-freestyle/test/pummel/test-crypto-timing-safe-equal-benchmarks.js:109:1)
08:05:52         at Module._compile (internal/modules/cjs/loader.js:1217:30)
08:05:52         at Object.Module._extensions..js (internal/modules/cjs/loader.js:1238:10)
08:05:52         at Module.load (internal/modules/cjs/loader.js:1066:32)
08:05:52         at Function.Module._load (internal/modules/cjs/loader.js:954:14)
08:05:52         at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
08:05:52         at internal/main/run_main_module.js:17:47 {
08:05:52       generatedMessage: false,
08:05:52       code: 'ERR_ASSERTION',
08:05:52       actual: false,
08:05:52       expected: true,
08:05:52       operator: '=='
08:05:52     }
08:05:52   ...

Build Links:
https://ci.nodejs.org/job/node-test-commit-custom-suites-freestyle/15073/console started via https://ci.nodejs.org/job/node-daily-master/1949/

Daily master builds appear to have been consistently broken (for at least as long as we have history (27 May, we only keep a month's worth of results)) due to this test.

Previous issues with this test:

The text was updated successfully, but these errors were encountered:

addaleax · 2020-06-26T14:08:49Z

Fwiw, this has been discussed in the private repo for a bit since this is potentially security-relevant.

Trott · 2020-06-26T14:12:16Z

I bisected and it seems to be a result of a V8 update. Since we can't compile/test without most of the cherry-picks required for V8 updates, I was not able to identify a single commit. The first bad commit could be any of:

a489288
c3866a1
db0ed11
308900f
6b5ea2e
37abad4
55f7ae6
1d6adf7
9dfaf49
32be677

Trott · 2020-06-26T14:13:25Z

Some work done by @sam-github revealed that adding the patch below makes the test pass even though there's no reason it should change execution, in theory. Sam commented:

I suspect v8 of doing some kind of cacheing of strings passed to eval. It might even be a bug.

diff --git a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js 
index b649b071e1e..bbe93d238b9 100644                                                                                                                                                          --- a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
+++ b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
@@ -23,7 +23,7 @@ function runOneBenchmark(compareFunc, firstBufFill, secondBufFill, bufSize)
-    `);
+    `.replace(';', '#').replace('#', ';'));
}

richardlau · 2020-06-26T14:15:21Z

Fwiw, this has been discussed in the private repo for a bit since this is potentially security-relevant.

Ah okay, I have no visibility of that. Given the state of daily master CI was discussed at the collaborator summit yesterday I was surprised no issue was open for it.

addaleax · 2020-06-26T14:16:50Z

Yeah, I think it’s fine to discuss it openly since the CI failures are available publicly as well.

Trott · 2020-06-26T14:17:32Z

Fwiw, this has been discussed in the private repo for a bit since this is potentially security-relevant.

Ah okay, I have no visibility of that. Given the state of daily master CI was discussed at the collaborator summit yesterday I was surprised no issue was open for it.

We'd been talking about moving it to a public repo anyway because it does not seem to be security-relevant once people dug into it a bit. So, this is probably as good a way for that to happen as any. Plus, I think the results are public anyway. (I'm not sure if node-daily-master results is viewable by anyone with a GitHub account. I think it is.)

richardlau · 2020-06-26T14:48:29Z

Some work done by @sam-github revealed that adding the patch below makes the test pass even though there's no reason it should change execution, in theory. Sam commented:

I suspect v8 of doing some kind of cacheing of strings passed to eval. It might even be a bug.
diff --git a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js 
index b649b071e1e..bbe93d238b9 100644                                                                                                                                                          --- a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
+++ b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
@@ -23,7 +23,7 @@ function runOneBenchmark(compareFunc, firstBufFill, secondBufFill, bufSize)
-    `);
+    `.replace(';', '#').replace('#', ';'));
}

FWIW I can reproduce this on a Linux system inside IBM, where it's flaky:

-bash-4.2$ tools/test.py --repeat 10 test/pummel/test-crypto-timing-safe-equal-benchmarks.js
=== release test-crypto-timing-safe-equal-benchmarks ===
Path: pummel/test-crypto-timing-safe-equal-benchmarks
assert.js:385
    throw err;
    ^

AssertionError [ERR_ASSERTION]: timingSafeEqual should not leak information from its execution time (t=-14.106192455658816)
    at Object.<anonymous> (/home/users/riclau/sandbox/github/nodejs/test/pummel/test-crypto-timing-safe-equal-benchmarks.js:109:1)
    at Module._compile (internal/modules/cjs/loader.js:1217:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1238:10)
    at Module.load (internal/modules/cjs/loader.js:1066:32)
    at Function.Module._load (internal/modules/cjs/loader.js:954:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    at internal/main/run_main_module.js:17:47 {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: false,
  expected: true,
  operator: '=='
}
Command: out/Release/node /home/users/riclau/sandbox/github/nodejs/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
...
[00:40|% 100|+   2|-   8]: Done
-bash-4.2$

Applying the patch from @sam-github doesn't appear to help. Running --jitless does, but maybe that's just because it makes everything slower? Or maybe there is a JIT/optimization bug?

-bash-4.2$ git diff
diff --git a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
index b649b07..7930213 100644
--- a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
+++ b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
@@ -1,3 +1,4 @@
+// Flags: --jitless
 'use strict';
 const common = require('../common');
 if (!common.hasCrypto)
-bash-4.2$ tools/test.py --repeat 10 test/pummel/test-crypto-timing-safe-equal-benchmarks.js
[01:40|% 100|+  10|-   0]: Done
-bash-4.2$

richardlau · 2020-06-26T15:17:04Z

Or this:

-bash-4.2$ git diff
diff --git a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
index b649b07..bc831db 100644
--- a/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
+++ b/test/pummel/test-crypto-timing-safe-equal-benchmarks.js
@@ -1,3 +1,4 @@
+// Flags: --always_opt
 'use strict';
 const common = require('../common');
 if (!common.hasCrypto)
-bash-4.2$ tools/test.py --repeat 10 test/pummel/test-crypto-timing-safe-equal-benchmarks.js
[01:20|% 100|+  10|-   0]: Done
-bash-4.2$

(there's lots of --v8-options and I'll readily admit to not knowing what most of these do).

psmarshall · 2020-06-29T11:24:27Z

We can try and rule in/out v8 changes. As far as I can tell, this was bisected to the v8 8.3 upgrade, which was upgraded from 8.1? Using the version numbers there, here's the git log between the two versions: https://chromium.googlesource.com/v8/v8/+log/8.1.307.31..8.3.110.9/?pretty=fuller&n=10000

One thing that would be helpful to know from someone who can reproduce this: why does the T-test fail? Is it that the first execution is slow, and then the rest are very fast? That would support the 'something gets cached incorrectly' theory.

addaleax · 2020-06-29T17:37:08Z

I dug in a bit using V8 canary builds, since they provide a good way to bisect the combination of Node.js and V8 changes.

Fake git bisect log

canary20200222115d5495c2 [8.2.152-node.8]: good
canary2020032013d7e86e8c [8.3.35-node.8]: bad
canary20200303535dc9b07d [8.2.237-node.8]: bad
canary202002261c95df47d3 [8.2.184-node.8]: good
canary2020030169bd6793e5 [8.2.222-node.8]: bad
canary20200228c250e83364 [8.2.211-node.8]: good
canary20200229bc7ca2b453 [8.2.220-node.8]: good

That means the causing commit is one of https://chromium.googlesource.com/v8/v8/+log/8.2.220..8.2.222/?pretty=fuller&n=10000. v8/v8@a4c1408 is the only one that sticks out to me. Unfortunately, reverting that patch does not apply cleanly on top of our 8.3 upgrade PR, so I’m not able to confirm.

However! This appears something that has been fixed in one of the later V8 versions:

Fake git bisect log

canary2020032013d7e86e8c [8.3.35-node.8]: bad
canary202004229479c2ddd5 [8.4.157-node.8]: good
canary20200405a0ba79f8ed [8.4.0-node.8 (candidate)]: bad
canary20200413e68a631861 [8.4.49-node.8]: bad
canary20200418c06b589ced [8.4.131-node.8]: bad
canary202004192943fd49c0 [8.4.133-node.8]: bad
canary20200420e3ff8005c4 [8.4.134-node.8]: bad

Iiuc, that means that the fix is one of https://chromium.googlesource.com/v8/v8/+log/8.4.134..8.4.157/?pretty=fuller&n=10000. Unfortunately, that’s a lot of commits, and none of them particularly stand out to me as possibly related to this test or possibly related to one of the commits that have potentially caused this.

(This is all based on repeated runs of the test with up to 30 iterations. That’s not a lot, so there’s always a chance that I went wrong here somewhere.)

One thing that would be helpful to know from someone who can reproduce this: why does the T-test fail? Is it that the first execution is slow, and then the rest are very fast? That would support the 'something gets cached incorrectly' theory.

@psmarshall My understanding is that the code in the test is supposed to remove outliers, so that’s probably not it.

psmarshall · 2020-06-30T14:27:12Z

The fix might have been v8/v8@8c68536 which is in the fix range @addaleax provided. That would suggest the issue is sensitive to inlining. You could try configuring v8 with --no-turbo-inlining to turn it off and see if that fixes it even in the broken range?

addaleax · 2020-06-30T17:43:07Z

@psmarshall Yes, --no-turbo-inlining makes the test pass consistently.

My understanding is that v8/v8@8c68536 is an optimization, so even if it might do the right thing for us here, it still means that something about the test isn’t quite right and should be fixed on our side?

addaleax · 2020-06-30T20:22:28Z

So … I’m still trying to understand this. The test is very carefully written to avoid the influence of JS engine modifications, so I still don’t fully understand why they might influence this test one way or another.

One thing I’ve noticed is that this flakiness goes away when testing the binding function directly instead of the JS function, so one thing I will try is to move the typechecking that we do into C++ and see if that helps. It’s probably a good idea anyway for this particular case (and imo other cases as well but that’s way out of scope here).

This makes the function more robust against V8 inlining. Fixes: nodejs#34073

psmarshall · 2020-07-02T07:38:20Z

Running with --trace-opt --trace-deopt --trace-turbo-inlining should give a full picture of what got optimized when and how the inlining happened too. Maybe that gives more precise clues about what's going wrong.

The crypto.timingSafeEqual function is a bound C++ function so inlining shouldn't matter.... but it seems something with inlining is going on.

psmarshall · 2020-07-02T07:45:05Z

Oh just saw your PR @addaleax - that JS wrapper before the C++ function would do it for sure

This makes the function more robust against V8 inlining. Fixes: #34073 PR-URL: #34141 Reviewed-By: Richard Lau <[email protected]> Reviewed-By: Ujjwal Sharma <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: Rich Trott <[email protected]> Reviewed-By: Zeyu Yang <[email protected]>

richardlau added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Jun 26, 2020

richardlau mentioned this issue Jun 26, 2020

test: update test-child-process-spawn-loop for Python 3 #34071

Merged

3 tasks

addaleax mentioned this issue Jun 30, 2020

crypto: move typechecking for timingSafeEqual into C++ #34141

Closed

3 tasks

addaleax added a commit to addaleax/node that referenced this issue Jul 1, 2020

crypto: move typechecking for timingSafeEqual into C++

6ff1747

This makes the function more robust against V8 inlining. Fixes: nodejs#34073

jasnell closed this as completed in 1d7be32 Jul 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pummel/test-crypto-timing-safe-equal-benchmarks failing #34073

pummel/test-crypto-timing-safe-equal-benchmarks failing #34073

richardlau commented Jun 26, 2020

addaleax commented Jun 26, 2020

Trott commented Jun 26, 2020

Trott commented Jun 26, 2020 •

edited

Loading

richardlau commented Jun 26, 2020

addaleax commented Jun 26, 2020

Trott commented Jun 26, 2020

richardlau commented Jun 26, 2020

richardlau commented Jun 26, 2020

psmarshall commented Jun 29, 2020

addaleax commented Jun 29, 2020

psmarshall commented Jun 30, 2020

addaleax commented Jun 30, 2020

addaleax commented Jun 30, 2020

psmarshall commented Jul 2, 2020

psmarshall commented Jul 2, 2020

pummel/test-crypto-timing-safe-equal-benchmarks failing #34073

pummel/test-crypto-timing-safe-equal-benchmarks failing #34073

Comments

richardlau commented Jun 26, 2020

addaleax commented Jun 26, 2020

Trott commented Jun 26, 2020

Trott commented Jun 26, 2020 • edited Loading

richardlau commented Jun 26, 2020

addaleax commented Jun 26, 2020

Trott commented Jun 26, 2020

richardlau commented Jun 26, 2020

richardlau commented Jun 26, 2020

psmarshall commented Jun 29, 2020

addaleax commented Jun 29, 2020

psmarshall commented Jun 30, 2020

addaleax commented Jun 30, 2020

addaleax commented Jun 30, 2020

psmarshall commented Jul 2, 2020

psmarshall commented Jul 2, 2020

Trott commented Jun 26, 2020 •

edited

Loading