Issue passing file descriptors in OS X #7512

Closed
santigimeno opened this issue Jul 1, 2016 · 3 comments · Fixed by #7572
Labels
child_process Issues and PRs related to the child_process subsystem. macos Issues and PRs related to the macOS platform / OSX. net Issues and PRs related to the net subsystem.

Comments

@santigimeno
Member

santigimeno commented Jul 1, 2016

  • Version: v7.0.0-pre
  • Platform: OS X 10.11.5
  • Subsystem: child_process

The test-cluster-net-send.js test sometimes fails when I run the test suite on my OS X machine, with the following output:

=== release test-cluster-net-send ===                                          
Path: parallel/test-cluster-net-send
[32787] master
[32789] worker

assert.js:90
  throw new assert.AssertionError({
  ^
AssertionError: false == true
    at process.<anonymous> (/Users/sgimeno/node/node/test/parallel/test-cluster-net-send.js:29:12)
    at process.g (events.js:286:16)
    at emitOne (events.js:101:20)
    at process.emit (events.js:188:7)
Command: out/Release/node /Users/sgimeno/node/node/test/parallel/test-cluster-net-send.js

After investigating, it looks like the error happens only when the fd that the worker passes to the master is closed before the master process has received it. The following patch, which closes the fd only after the NODE_HANDLE_ACK message is received, fixes the issue for me.

diff --git a/lib/internal/child_process.js b/lib/internal/child_process.js
index 789c29e..44a245e 100644
--- a/lib/internal/child_process.js
+++ b/lib/internal/child_process.js
@@ -96,8 +96,8 @@ const handleConversion = {

     postSend: function(handle, options) {
       // Close the Socket handle after sending it
-      if (handle && !options.keepOpen)
-        handle.close();
+      //if (handle && !options.keepOpen)
+      //  global_handle = handle;
     },

     got: function(message, handle, emit) {
@@ -465,6 +465,11 @@ function setupChannel(target, channel) {
   target.on('internalMessage', function(message, handle) {
     // Once acknowledged - continue sending handles.
     if (message.cmd === 'NODE_HANDLE_ACK') {
+      if (target._pending_handle) {
+        target._pending_handle.close();
+        target._pending_handle = null;
+      }
+
       assert(Array.isArray(target._handleQueue));
       var queue = target._handleQueue;
       target._handleQueue = null;
@@ -615,8 +620,13 @@ function setupChannel(target, channel) {
       req.oncomplete = function() {
         if (this.async === true)
           control.unref();
-        if (obj && obj.postSend)
+        if (obj && obj.postSend) {
           obj.postSend(handle, options);
+          if (handle && !options.keepOpen) {
+            assert(!target._pending_handle);
+            target._pending_handle = handle;
+          }
+        }

This seems strange to me, as my understanding was that closing a file descriptor after sending it is safe (at least on my Linux box I have not been able to reproduce the issue). Thoughts?

@Fishrock123 Fishrock123 added macos Issues and PRs related to the macOS platform / OSX. fs Issues and PRs related to the fs subsystem / file system. labels Jul 1, 2016
@santigimeno santigimeno added the child_process Issues and PRs related to the child_process subsystem. label Jul 1, 2016
@mscdex mscdex added net Issues and PRs related to the net subsystem. and removed fs Issues and PRs related to the fs subsystem / file system. labels Jul 1, 2016
@bnoordhuis
Member

Someone reported the same issue on the libuv mailing list recently: https://groups.google.com/d/msg/libuv/CuWJ28ZMmpY/YlIIeKdPBAAJ

I haven't been able to reproduce with OS X 10.8.5 or on other operating systems. It's possible it's a regression in newer xnu kernels, maybe a failure to increment the file description's reference count in SCM_RIGHTS operations.

I don't think we can work around that in libuv (it doesn't know when the file descriptor has been received) so a node.js workaround might be the best course of action.

@santigimeno
Member Author

@bnoordhuis Thanks for the info. I'll follow with a PR later today. Is there a place to report this upstream?

@bnoordhuis
Member

You mean to Apple? They have something called Radar but it's a massive black hole, don't expect any feedback.

santigimeno added a commit to santigimeno/node that referenced this issue Jul 26, 2016
There's an issue on some `OS X` versions when passing fds between processes.
When the handle associated with a file descriptor is closed by the sender
process before it is received at the destination, the received handle ends up
closed when it should remain open. To fix this behaviour, don't close the
handle until the sender receives `NODE_HANDLE_ACK`.
Added `test-child-process-pass-fd`, which is basically `test-cluster-net-send`
but creates many workers, so the issue reproduces consistently on `OS X`.

Fixes: nodejs#7512
santigimeno added a commit to santigimeno/node that referenced this issue Aug 20, 2016
Fixes: nodejs#7512
PR-URL: nodejs#7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
evanlucas pushed a commit that referenced this issue Aug 24, 2016
Fixes: #7512
PR-URL: #7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
santigimeno added a commit to santigimeno/node that referenced this issue Oct 3, 2016
Fixes: nodejs#7512
Ref: nodejs#7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
@santigimeno santigimeno mentioned this issue Oct 3, 2016
MylesBorins pushed a commit that referenced this issue Oct 4, 2016
Fixes: #7512
Ref: #8904
PR-URL: #7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
MylesBorins pushed a commit that referenced this issue Oct 10, 2016
Fixes: #7512
Ref: #8904
PR-URL: #7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
rvagg pushed a commit that referenced this issue Oct 18, 2016
Fixes: #7512
Ref: #8904
PR-URL: #7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
MylesBorins pushed a commit that referenced this issue Oct 26, 2016
Fixes: #7512
Ref: #8904
PR-URL: #7572
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>