Intermittent failure in parallel/test-fs-watchfile on AIX #13377

Closed
mhdawson opened this issue Jun 1, 2017 · 16 comments
Labels
  • aix: Issues and PRs related to the AIX platform.
  • fs: Issues and PRs related to the fs subsystem / file system.
  • test: Issues and PRs related to the tests.

Comments

@mhdawson
Member

mhdawson commented Jun 1, 2017

  • Version: master
  • Platform: AIX
  • Subsystem: fs

I think I've seen this fail a few times recently:

https://ci.nodejs.org/job/node-test-commit-aix/6276/

not ok 433 parallel/test-fs-watchfile
  ---
  duration_ms: 60.193
  severity: fail
  stack: |-
    timeout
  ...
@mhdawson
Member Author

mhdawson commented Jun 1, 2017

Looks like there was a recent change to the test: 3b12a8d

FYI - @Trott

@Trott
Member

Trott commented Jun 1, 2017

Ack. Looking at it right now.

@mscdex added the aix, fs, and test labels Jun 1, 2017
@Trott
Member

Trott commented Jun 2, 2017

Replication on master using tools/test.py -j 64 --repeat 128 test/parallel/test-fs-watchfile.js: https://ci.nodejs.org/job/node-stress-single-test/1249/nodes=aix61-ppc64/console

So, still flaky under load. Will try to fix in place, but ultimately moving to sequential might help.

@refack
Contributor

refack commented Jun 2, 2017

From the logs it seems like something "snaps", 50-60 iterations work, then it just stops.
Maybe try more file names, or unlinking the file and rewriting it.

@Trott
Member

Trott commented Jun 2, 2017

I'm able to get it up to over 100 iterations if I switch to a large enough timer that calls itself. I tried using a backoff algorithm but that didn't seem to help much. I wonder if what's really going on is that at some point, the operating system stops handing out working watchers.

I think the solution is to move it to sequential where it's not competing with other resources, at least for now. fs.watch() is quirky so there's no shame in that.
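
A minimal sketch of the "timer that calls itself" idea described above, for illustration only. It is not the code from the actual test: `nextDelay` and `pokeUntilStopped` are hypothetical names, and the delay values are assumptions. The file is rewritten on a timer that re-arms itself after each write, with an optional capped exponential backoff, until the caller stops it (for example, from inside the watcher callback).

```javascript
'use strict';
const fs = require('fs');

// Hypothetical backoff policy: double the delay on each attempt, capped at maxMs.
function nextDelay(attempt, baseMs = 100, maxMs = 5000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Rewrite `file` repeatedly using a timer that re-arms itself after each
// write. Returns a stop() function that cancels any pending timer.
function pokeUntilStopped(file, delayFor = nextDelay) {
  let attempt = 0;
  let timer = null;
  let stopped = false;

  function poke() {
    if (stopped) return;
    fs.writeFile(file, `poke ${attempt}`, (err) => {
      if (err) throw err;
      if (stopped) return;
      // Schedule the next write only after this one has completed.
      timer = setTimeout(poke, delayFor(attempt++));
    });
  }

  poke();
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}

// Assumed usage: stop poking as soon as the watcher fires once.
//   const stop = pokeUntilStopped(file);
//   const watcher = fs.watch(dir, () => { stop(); watcher.close(); });
```

Note that if the operating system hands out a watcher that never fires, no retry schedule helps; the timer only covers the case where the first write happens before the watcher is ready.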

@Trott
Member

Trott commented Jun 2, 2017

Removed the interval and put the test in sequential. Let's stress test it on AIX and macOS and see what happens.

AIX: https://ci.nodejs.org/job/node-stress-single-test/nodes=aix61-ppc64/1258/console

macOS: https://ci.nodejs.org/job/node-stress-single-test/1260/nodes=osx1010/console

EDIT: Typo'ed on the path names. Will try again...

@Trott
Member

Trott commented Jun 2, 2017

Failed with the interval missing. Put it back. That should fix it on macOS. Hopefully AIX too. Let's see.

AIX: https://ci.nodejs.org/job/node-stress-single-test/1263/nodes=aix61-ppc64/console
macOS: https://ci.nodejs.org/job/node-stress-single-test/nodes=osx1010/1265/console

@Trott
Member

Trott commented Jun 2, 2017

Stress tests indicate this doesn't fix the flakiness on AIX. I'm starting to wonder if sometimes AIX hands us watchers that will simply never fire. The fs.watch() stuff is highly specific to the OS.

@nodejs/platform-aix

@gibfahn
Member

gibfahn commented Jun 2, 2017

cc/ @gireeshpunathil

@gireeshpunathil
Member

I just commented in #13385 thus:

ref

The watch facility on folders is not foolproof on AIX, and we have skipped the tests that rely on it. I guess the new changes in test-fs-watchfile.js do not cater to that.

#13111 made that change. Its original intent was to make sure the filename argument appears in the callback when it fires, so I requested that AIX be included as well, since the filename argument is available on that platform. But the change then introduced a folder watch, hence the flaky result.

Proposals - one of:

  1. Amend the new change in test-fs-watchfile.js to skip AIX.
  2. Amend the new change in test-fs-watchfile.js to use a file watch as opposed to a folder watch.
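
The two proposals above could be sketched roughly as follows. This is an illustration under stated assumptions, not the change that landed: `shouldWatchDirectory`, `pickWatchTarget`, and `watchIfSupported` are hypothetical helpers. The only real APIs used are `process.platform` (which reports `'aix'` on AIX), `path.join()`, and `fs.watch()`.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Proposal 1 (sketch): skip the folder-watch portion of the test on AIX.
function shouldWatchDirectory(platform = process.platform) {
  return platform !== 'aix';
}

// Proposal 2 (sketch): on AIX, watch the file itself rather than its folder.
function pickWatchTarget(dir, filename, platform = process.platform) {
  return platform === 'aix' ? path.join(dir, filename) : dir;
}

// Hypothetical wiring for proposal 1: returns the watcher,
// or null when the platform is skipped.
function watchIfSupported(dir, onChange, platform = process.platform) {
  if (!shouldWatchDirectory(platform)) return null;
  // The listener receives (eventType, filename).
  return fs.watch(dir, onChange);
}
```

Taking the platform as a parameter keeps the decision testable on any machine; a real test would simply rely on the default.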

@refack
Contributor

refack commented Jun 2, 2017

Anybody on it?

@refack
Contributor

refack commented Jun 2, 2017

Ref: #13248
Ref: #13251

Trott added a commit to Trott/io.js that referenced this issue Jun 2, 2017
Omitting AIX from `fs.watch()` portion of this test. It works
on AIX, but not reliably.

Fixes: nodejs#13377
@Trott
Member

Trott commented Jun 2, 2017

Anybody on it?

Yup. #13385

@refack
Contributor

refack commented Jun 2, 2017

Anybody on it?
Yup. #13385

I have a follow up to that in #13411

@Trott Trott closed this as completed in 98aa25c Jun 2, 2017
jasnell pushed a commit that referenced this issue Jun 5, 2017
Omitting AIX from `fs.watch()` portion of this test. It works
on AIX, but not reliably.

PR-URL: #13385
Fixes: #13377
Reviewed-By: Yuta Hiroto <[email protected]>
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
refack added a commit to refack/node that referenced this issue Jun 7, 2017
PR-URL: nodejs#13411
Refs: nodejs#13385
Refs: nodejs#13248
Refs: nodejs#13377
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
jasnell pushed a commit that referenced this issue Jun 7, 2017
PR-URL: #13411
Refs: #13385
Refs: #13248
Refs: #13377
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>