Add blocking isolate communication. #44804

Open
lrhn opened this issue Jan 29, 2021 · 31 comments
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-isolate type-enhancement A request for a change that isn't a bug

Comments

@lrhn
Member

lrhn commented Jan 29, 2021

Dart provides isolates as a method of concurrency. Currently all cross-isolate communication is asynchronous, and Dart doesn't have any official way to turn an asynchronous computation into a synchronous result. The waitFor function of dart:cli tries to get around that, but in a way which is not expected to scale, and which can break abstractions and invariants inside a single isolate.

This is a proposal for a blocking receive-port feature which can block one isolate until it receives data from another isolate.

Blocking Receive-Port

A blocking receive-port is a low-level receive-port, like RawReceivePort, which you can ask synchronously for its next event, blocking the entire isolate until such data becomes available. The receive-port has to buffer events until they are asked for, so it's not as trivial as a RawReceivePort, but also not as complicated as the stream behavior of ReceivePort.

The class interface is defined as:

/// A receiver of port communication which can block until data arrives.
///
/// Like [RawReceivePort], this is a low-level object which can be used
/// to build more complicated communication frameworks on top of.
/// A blocking receive-port should be used with extreme care
/// since it can completely block an isolate if used incorrectly.
abstract class BlockingReceivePort {
  /// Creates a fresh blocking receive port.
  external factory BlockingReceivePort();
  
  /// A send-port that can be used to send an object to this receive-port.
  ///
  /// Objects sent on this send-port are received by the [next] method.
  /// If objects are sent before [next] is called, those objects will
  /// be buffered until they are requested.
  SendPort get sendPort;
  
  /// Closes the receive port.
  ///
  /// Allows the receive port to discard any undelivered objects
  /// instead of buffering them until the next call to [next].
  ///
  /// Can be called repeatedly with no further effect.
  void close();
  
  /// Returns the next object sent to this receive port.
  ///
  /// If objects have already been received, 
  /// the least recently received one is returned immediately.
  /// 
  /// Throws a [TimeoutException] if [timeout] is provided,
  /// and no objects are received within the specified duration.
  ///
  /// Must not be called after calling [close] or [toRawPort].
  Object? next({Duration? timeout});
  
  /// The remaining objects which are sent to this port.
  /// 
  /// Closes this blocking receive port, as if by calling [close],
  /// but connects the [sendPort] to a new [RawReceivePort] instead.
  /// All further objects sent using the [sendPort], whether
  /// already buffered or sent later, are delivered through 
  /// the returned raw receive port instead.
  ///
  /// Must not be called after calling [close],
  /// and must not be called more than once.
  RawReceivePort toRawPort();
}

When a call to next() occurs, the entire isolate blocks. Incoming port events and timer events may be enqueued, but they won't trigger Dart code in the isolate. A control-port event can potentially still be handled, unless its documentation explicitly states that handling won't happen until control returns to the event loop. The blocking lasts until an object is received on the blocking receive-port, at which point that object is returned from the call to next(). As such, the call to next() appears to be synchronous.

If objects have been buffered in the blocking receive-port prior to calling next(), it can continue immediately by dequeuing and returning the oldest object.

This blocking receive-port makes it possible to synchronously wait for asynchronous computations which happen in a different isolate, while not breaking the asynchronous model of the blocked isolate.
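
For illustration, here is a minimal usage sketch of the proposed API (assuming the interface above; the helper name waitForReply and the five-second timeout are made up for this example):

import 'dart:async' show TimeoutException;

/// Blocks the calling isolate until a worker sends a reply on [port],
/// or gives up after five seconds.
Object? waitForReply(BlockingReceivePort port) {
  try {
    return port.next(timeout: const Duration(seconds: 5));
  } on TimeoutException {
    port.close();
    rethrow;
  }
}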

Isolate keep-alive

Currently any open receive-port (ReceivePort or RawReceivePort) keeps an isolate alive, because the isolate can potentially be made to do more work by sending it a port message.

A BlockingReceivePort does not keep an isolate alive except while a call to next() is in progress. If no one has called next and nothing else is keeping the isolate alive, an incoming event will not trigger any code to run, so the isolate will never do any further computation and can be shut down.

While an isolate is calling next on a blocking receive-port, the run-time system may be able to detect that no other live isolate has access to the send-port which can awaken the blocked isolate. In that case, the blocked isolate is effectively dead and can be disposed.

Example Use

We can build a synchronous run function from a blocking port, perhaps as a static on Isolate:

class Isolate {
  // ...
  static R run<R, A>(FutureOr<R> Function(A) function, A argument) {
    var port = BlockingReceivePort();
    Isolate.spawn(_Runner._run, _Runner(function, argument, port.sendPort));
    var result = port.next() as _Result<R>;
    port.close();
    return result.value;
  }
}

abstract class _Result<R> {
  R get value;
}

class _ResultValue<R> implements _Result<R> {
  final R value;
  _ResultValue(this.value);
}

class _ResultError implements _Result<Never> {
  final String _error, _stack;
  _ResultError(this._error, this._stack);
  Never get value => throw RemoteError(_error, _stack);
}

class _Runner<R, A> {
  final FutureOr<R> Function(A) function;
  final A argument;
  final SendPort port;
  _Runner(this.function, this.argument, this.port);
  void run() async {
    try {
      // `sendAndExit` is not publicly available yet.
      port.sendAndExit(_ResultValue<R>(await function(argument)));
    } catch (e, s) {
      port.sendAndExit(_ResultError("$e","$s"));
    }
  }
  static void _run(_Runner runner) {
    runner.run();
  }
}
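
For illustration, a call to this hypothetical synchronous Isolate.run could look like the following sketch (sumOfSquares is a made-up worker function, not part of the proposal):

int sumOfSquares(List<int> numbers) =>
    numbers.fold(0, (sum, n) => sum + n * n);

void main() {
  // Blocks the calling isolate until the spawned isolate produces a result.
  var result = Isolate.run<int, List<int>>(sumOfSquares, [1, 2, 3, 4]);
  print(result); // 30
}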

If spinning up a new isolate per request is too much, we can also create a reusable runner (using the same helper classes).

class Runner {
  SendPort? _requestPort;
  final BlockingReceivePort _responsePort;
  
  Runner() : this._(BlockingReceivePort());
  Runner._(this._responsePort) : _requestPort = _createIsolate(_responsePort);
  
  static SendPort _createIsolate(BlockingReceivePort responsePort) {
    Isolate.spawn(_startListening, responsePort.sendPort);
    return responsePort.next() as SendPort;
  }
  
  static void _startListening(SendPort responsePort) {
    // Could use a BlockingReceivePort for requests too,
    // but this allows the isolate to be used for other
    // things as well, say, synchronously starting a service
    // which then keeps running.
    var requestPort = RawReceivePort();
    requestPort.handler = (Object? o) {
      if (o == null) {
        requestPort.close();
      } else {
        (o as _Runner).run();
      }
    };
    responsePort.send(requestPort.sendPort);
  }
  
  R run<R, A>(FutureOr<R> Function(A) function, A argument) {
    var requestPort = _requestPort;
    if (requestPort == null) {
      throw StateError("Runner has been closed");
    }
    requestPort.send(_Runner<R, A>(function, argument, _responsePort.sendPort));
    return (_responsePort.next() as _Result<R>).value;
  }
  
  void close() {
    _requestPort?.send(null); // The kill command.
    _requestPort = null;
    _responsePort.close();
  }
}
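
A call site for the reusable runner could then look like this sketch (compileStylesheet is a placeholder for whatever synchronous-looking work the caller needs):

String compileStylesheet(String source) => '/* compiled */ $source';

void main() {
  var runner = Runner();
  // Each call blocks until the worker isolate sends back a result.
  var css = runner.run(compileStylesheet, 'a { color: red }');
  print(css);
  runner.close();
}
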
@lrhn lrhn added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-isolate type-enhancement A request for a change that isn't a bug labels Jan 29, 2021
@natebosch
Member

While an isolate is calling next on a blocking receive-port, the run-time system may be able to detect that no other live isolate has access to the send-port which can awaken the blocked isolate. In that case, the blocked isolate is effectively dead and can be disposed.

We get questions about this type of behavior when the VM shuts down despite not running some code the user expects.
Should we explore having this scenario result in an exception in the isolate that called next?

@leafpetersen
Member

cc @mkustermann @mraleph @aam

@mraleph
Member

mraleph commented Feb 1, 2021

I think the ergonomics of this solution are going to be rather lacking because it involves isolates. If I want to kick off some sort of computation and process results in the main isolate in a synchronous manner, I will have to deal with buffering and serialization. Similarly if I want to kick off some computation that needs to ask "main world" questions (e.g. query the state of some model). waitFor, on the other hand, makes this trivial. So it feels like we are taking a step backward rather than forward. Most likely this approach will only address a subset of waitFor use cases and not address the use cases that Sass has.

Maybe we should instead consider being a bit more daring and go for something like Fibers in the language/core libraries? In this model waitFor just becomes a yield point for the fiber.

You do violate the invariant that synchronous Dart code runs to completion, but it is unclear to me whether this invariant is worth it - or maybe the expressive power of something like waitFor/fibers is more important.
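
To make the fiber idea a bit more concrete, one possible shape could be the following (purely hypothetical; none of these names exist in the SDK today):

abstract class Fiber {
  /// Starts [body] on a new fiber that can suspend without blocking
  /// the isolate's event loop.
  external static Fiber start(void Function() body);

  /// Suspends the current fiber until [future] completes and returns its
  /// value; other events keep running in the meantime. A fiber-based
  /// waitFor would simply be this yield point.
  external static T suspendUntil<T>(Future<T> future);
}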

@mkustermann
Member

@lrhn Could you elaborate more on what the concrete motivation for this change is? To get rid of waitFor (for which there are probably < 10 users)?

@natebosch
Member

To get rid of waitFor (for which there are probably < 10 users)?

Yes. The goal is to have a solution which we are comfortable encouraging more than 10 users to use.

@natebosch
Member

Would performance concerns be mitigated if we keep pushing on sendAndExit? #37835

@mkustermann
Member

To get rid of waitFor (for which there are probably < 10 users)?

Yes. The goal is to have a solution which we are comfortable encouraging more than 10 users to use.

Given the very low usage, as well as the fact that this API comes with hard-to-understand and unintended consequences (and can be considered to break our event-loop based programming model), we should not encourage more than 10 users to use it, but rather get rid of it.

Do we have a good understanding of the existing use cases and why it was chosen to use waitFor? Can the code be rewritten to make callers async? If the original reason for waitFor was performance (i.e. async is too slow): maybe it's no longer a problem, since we've made async/await 10x faster since then. There would need to be a really good reason for us to support such functionality in the first place.

Whether we would like to introduce synchronous isolate-to-isolate primitives to make concurrency in the Dart VM better is an orthogonal topic IMHO.

While an isolate is calling next on a blocking receive-port, the run-time system may be able to detect that no other live isolate
has access to the send-port which can awaken the blocked isolate. In that case, the blocked isolate is effectively dead and
can be disposed.

Should we then also implement a deadlock detector (like in Go) to help debug such bugs by detecting cycles of isolates synchronously waiting on ports? ...

@lrhn
Member Author

lrhn commented Feb 3, 2021

The waitFor functionality was introduced as a stop-gap measure when we changed (by necessity) some isolate functionality to be asynchronous at the time we introduced the .packages file. That caused a problem for the Sass project, which had a synchronous API depending on functionality that now had to be asynchronous, and they were opposed to changing that API suddenly (see #31102).

So, rewriting the code to be async was what this feature was introduced to avoid.

Various ideas were considered, and as I remember it, this was one which the VM team at the time accepted as an experiment which would avoid breaking Sass until the feature either proved itself to be a viable solution, or something better could be introduced. It was not intended as the final solution.

The Sass API has not changed, so the issue is still there. Providing an alternative makes it more likely that we can remove the waitFor function and the dart:cli library. This is one idea for such a potential API, especially if we can make it performant.

@aam
Contributor

aam commented Feb 3, 2021

Would performance concerns be mitigated if we keep pushing on sendAndExit? #37835

It doesn't sound like the concerns with async code were performance related. It sounds like folks tried to avoid cascading async API changes in their product.

@natebosch
Member

It doesn't sound like the concerns with async code were performance related.

The original concerns were not performance related; however, the Sass authors do have performance concerns about the suggestion to use isolates for this. #39390 (comment)

@jacob314
Member

Can you clarify the ways in which waitFor is not expected to scale? Are there known performance issues with waitFor or are the concerns that the model is hard to reason about correctly?

@natebosch
Member

Are there known performance issues with waitFor or are the concerns that the model is hard to reason about correctly?

The model is hard to reason about correctly. Continued discussion about waitFor should happen in #39390 though - let's keep this issue for discussion of blocking isolate communication.

@ds84182
Contributor

ds84182 commented May 24, 2021

@lrhn Have you also considered limiting this to only the return result from the isolate's main function? e.g.

// Assume we have tuples, for brevity

// NOTE: Could also add a "bool sendAndExit = false" named parameter to use sendAndExit for the result instead of send internally.
static Future<(Isolate, R)> spawnWithResult<T, R>(FutureOr<R> Function(T) computation, T argument);
static (Isolate, R) spawnWithResultSync<T, R>(FutureOr<R> Function(T) computation, T argument);

// And also, as a shortcut when isolate spawning is blazingly fast (avoids the need for Futures):
static Isolate spawnSync<T>(void Function(T) computation, T argument);

I feel like exposing a blocking version of ReceivePort may lead us towards bugs that hang entire isolates, and may not be desired for things like Flutter. This assumes, though, that you want to run a single computation, which in a lightweight-isolate world isn't that bad because spawning isolates is fast(er).

But exposing a result can help with isolate complexity in general, especially when writing abstractions on top of isolates. Currently you have to do:

// Again, assume tuples for brevity
void isolateMain(SendPort replyPort) {
  replyPort.send(RawReceivePort(_handlerHere).sendPort);
}

Future<(Isolate, RawReceivePort, SendPort)> initIsolate(Function handler) async {
  final completer = Completer<SendPort>();
  final port = RawReceivePort(completer.complete);
  final isolate = await Isolate.spawn(isolateMain, port.sendPort);
  final isolatePort = await completer.future;
  port.handler = handler;
  return (isolate, port, isolatePort);
}

when really, something like this could be possible:

// Again, assume tuples AND destructuring for brevity :)
SendPort isolateMain(SendPort replyPort) {
  return RawReceivePort(_handlerHere).sendPort;
}

Future<(Isolate, RawReceivePort, SendPort)> initIsolate(Function handler) async {
  final port = RawReceivePort(handler);
  final (isolate, isolatePort) = await Isolate.spawnWithResult(isolateMain, port.sendPort);
  return (isolate, port, isolatePort);
}

Your isolate example could use spawnWithResultSync for the first case; the second case isn't implementable with this model.

For implementation details, I'm thinking this could reuse the isolate's existing controlPort mechanism, which is currently used for OOB control messages and completely ignores values sent/received from other isolates. Or it could just be an abstraction around spawn + a temporary RawReceivePort.

@droplet-js

droplet-js commented Mar 23, 2022

Any news or updates? I want to use ffi.Pointer.fromFunction to implement a C/C++ function that sends HTTP requests synchronously.

@mraleph
Member

mraleph commented Mar 23, 2022

Any news or updates? I want to use ffi.Pointer.fromFunction to implement a C/C++ function that sends HTTP requests synchronously.

Why does it have to be sync? If it is in Flutter in the UI isolate then it is a very bad idea to do anything in a synchronous manner because it is going to lock up the whole UI.

@droplet-js

droplet-js commented Mar 23, 2022

Any news or updates? I want to use ffi.Pointer.fromFunction to implement a C/C++ function that sends HTTP requests synchronously.

Why does it have to be sync? If it is in Flutter in the UI isolate then it is a very bad idea to do anything in a synchronous manner because it is going to lock up the whole UI.

I want to use ffi.Pointer.fromFunction to delegate JavaScript XMLHttpRequest open/send.

XMLHttpRequest

xhrReq.open(method, url);
xhrReq.open(method, url, async);
xhrReq.open(method, url, async, user);
xhrReq.open(method, url, async, user, password);
xhrReq.send(body)

    final JSObject http = JSObject.make(globalContext);
    http.setProperty(
      'send',
      JSObject.makeFunctionWithCallback(
        globalContext,
        name: 'send',
        callAsFunction: Pointer.fromFunction(_setupHttp),
      ).value,
    );

static JSValueRef _setupHttp(
    JSContextRef ctx,
    JSObjectRef function,
    JSObjectRef thisObject,
    int argumentCount,
    Pointer<JSValueRef> arguments,
    Pointer<JSValueRef> exception,
  ) {
    return convertJSObjectCallAsFunctionCallback(
      ctx,
      function,
      thisObject,
      argumentCount,
      arguments,
      exception,
      (JSContext context, JSObject function, JSObject thisObject, List<JSValue> arguments, JSException exception) {
        final String name = function.getProperty('name').string!;
        if (JSVmInject.hasVmId(context)) {
          if (name == 'send') {
            final XMLHttpRequestAdapter adapter = _globalCache[JSVmInject.getVmId(context)]!;
            // FIXME: sync or async http req
            return JSValue.makeUndefined(context);
          }
        }
        exception.fill(JSObject.makeError(context, arguments: <JSValue>[
          JSValue.makeString(context, string: '_flutter_jsc_jsvm_inject_http.$name not supported'),
        ]).value);
        return JSValue.makeNull(context);
      },
    );
  }

@mraleph
Member

mraleph commented Mar 23, 2022

@v7lin I see you want to implement synchronous XHR send in Dart. Yeah, this is not currently possible.

@modulovalue
Contributor

modulovalue commented Nov 13, 2022

I'm using waitFor to keep maintenance/utility scripts simple. Without it, with Futures, I'd be forced to propagate Futures everywhere which adds verbosity that I'm happily avoiding with waitFor.

My understanding of this issue is that this feature would allow me to block the current isolate and cut the "Future chain", so to speak, by moving any given Future into a separate isolate and turning it into a blocking call? If that is the case, then I think this feature would be great for Dart.


I'd like to point out a subtle, but very needed, use case for this feature in the Flutter ecosystem, which I don't think is obvious and which is not possible today.

mraleph said:

[...] If it is in Flutter in the UI isolate then it is a very bad idea to do anything in a synchronous manner because it is going to lock up the whole UI.

mraleph refers to blocking the current isolate as being a bad idea, and it is (if it can be avoided). However, oftentimes it can't be. The issue with dealing with Futures in Flutter is that once you have a Future, you have to rearchitect your application with the constraint that you can't have data on the first frame. That means that you can't easily move from a synchronous operation to an asynchronous one. Many Futures are quick to complete, and it might be more practical to block the application for a frame or two than to spend a week on testing, approval, QA and so on, just to switch to a Future.


There are two important articles that are highly relevant to this issue: "What Color is Your Function?" by Bob from the language team, and the xi-editor retrospective by Raph (who was on the Fuchsia team and is now on the Google Fonts team), which describes async as a "complexity multiplier".

This feature sounds like a good solution (even if not optimal, but I personally don't know of any better solutions) to the points that Bob and Raph raise in their articles.

Also, I feel like if Fuchsia FIDL calls are going to be asynchronous, then this feature will become even more needed (for all the reasons mentioned above).

@ntkme
Contributor

ntkme commented Apr 19, 2023

We need this for dart-sass-embedded in order to get rid of waitFor without taking a performance penalty.

Here is a summary of how we got here:

  1. Running fully asynchronously is not an option for dart-sass-embedded as it is way too slow. We are pretty much forced to use waitFor for asynchronous I/O communication between the host and the compiler during a synchronous compilation.
  2. Because waitFor runs another event loop on top of the current one, it causes a stack overflow if too many waitFor calls are stacked on top of each other.
  3. @nex3 attempted to mitigate the stack overflow by running waitFor in different isolates, but that causes a deadlock of the whole Dart VM when too many (around 16) isolates run waitFor at the same time.
  4. I did a proof of concept with the main isolate running a RawServerSocket and other isolates connecting to the main isolate via RawSynchronousSocket instead of using a ReceivePort. This allowed me to replace waitFor with synchronous I/O. No more stack overflows or deadlocks, but we paid the performance penalty of using a TCP socket instead of an in-memory port.

Therefore, native blocking communication between isolates seems to be the only proper way to solve our issue.

sass/dart-sass#1959

@mkustermann
Member

We need this for dart-sass-embedded in order to get rid of waitFor without taking a performance penalty.
...
No more stack overflows or deadlocks, but we paid the performance penalty of using a TCP socket instead of an in-memory port.

The Dart VM offers an FFI that could be utilised to communicate data in memory, synchronously and with low overhead, between isolates. The data being communicated between the isolates may need to be encoded to bytes (or other C data structures). So if the data being exchanged is protos / json / ... that would work. Though it does require custom C code.

(The limit on the maximum number of isolates running in parallel can be avoided by making the native code exit the isolate before doing its blocking call and re-enter the isolate after it. We are discussing how to make this more automatic when using our FFI.)
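
To sketch what the Dart side of such an FFI-based channel might look like (the library name libsync_channel and the channel_send function are hypothetical; the corresponding C code - a mutex, a condition variable and a shared byte buffer - would still have to be written):

import 'dart:ffi';
import 'dart:typed_data';
import 'package:ffi/ffi.dart';

final DynamicLibrary _lib = DynamicLibrary.open('libsync_channel.so');

// Hypothetical C signature: int64_t channel_send(const uint8_t* data, int64_t length);
final int Function(Pointer<Uint8>, int) _channelSend = _lib.lookupFunction<
    Int64 Function(Pointer<Uint8>, Int64),
    int Function(Pointer<Uint8>, int)>('channel_send');

/// Copies [bytes] into native memory and hands them to the C-side channel,
/// which a receiving isolate would read with a blocking native call.
int sendBytes(Uint8List bytes) {
  final buffer = calloc<Uint8>(bytes.length);
  buffer.asTypedList(bytes.length).setAll(0, bytes);
  try {
    return _channelSend(buffer, bytes.length);
  } finally {
    calloc.free(buffer);
  }
}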

@ntkme
Contributor

ntkme commented Apr 19, 2023

So if the data being exchanged is protos / json / ... that would work.

Yes it would work, but:

host process <- stdio -> dart-sass-embedded main isolate <- FFI -> child isolates running actual compilation 

In our design the main isolate multiplexes stdio for communication with the host process, so we have to decode each inbound message on the main isolate to determine which child isolate to forward it to. With FFI, if we passed the original bytes to the child isolates, we would need to decode the same message a second time in the child isolate. Thus it would be slower than passing an already decoded Dart object via a Port.

Not to mention that needing to write C code just for this purpose is kind of awkward.

@mkustermann
Member

In our design the main isolate multiplexes stdio for communication with the host process, so we have to decode each inbound message on the main isolate to determine which child isolate to forward it to. With FFI, if we passed the original bytes to the child isolates, we would need to decode the same message a second time in the child isolate. Thus it would be slower than passing an already decoded Dart object via a Port.

If the communication protocol and message types are appropriately designed, this is rather low overhead. Say one multiplexes incoming messages to other isolates based on a receiver_id; then one can define the message proto e.g. like this:

message SassMessage {
  uint64 receiver_id = 1;
  bytes payload = 2;
}

Decoding such a proto is really fast, as it will not have to decode the contents of payload itself. Copying such a payload to another isolate via C is probably much faster than sending the decoded value over a port, as the former is a Uint8List copy & proto decode on the receiver side, whereas the latter has to proto decode + transitively copy the decoded proto (SendPort.send(<message>) will make a transitive copy of <message> - isolates cannot access shared mutable state).
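
For illustration, the dispatcher's forwarding step under such a scheme could look roughly like the following sketch (SassMessage stands for the protoc-generated Dart class, and channelFor for whatever per-isolate byte transport is used - both are assumptions, not existing code):

import 'dart:typed_data';

void dispatch(Uint8List inbound) {
  // Decodes only the envelope; the payload bytes stay encoded.
  final message = SassMessage.fromBuffer(inbound);
  // Hand the still-encoded payload to the target isolate's channel;
  // the worker decodes the payload itself.
  channelFor(message.receiverId).sendBytes(message.payload);
}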

Not to mention that need of writing C code just for this purpose is kind of awkward.

The purpose would be to provide functionality that (intentionally **) isn't available in Dart core libraries. It could be a self-contained re-usable abstraction that provides synchronous communication primitives. IMHO not awkward at all - we have many users building on top of FFI + C code.

(**) As it would be problematic for our wider ecosystem, especially Flutter.

@nex3
Member

nex3 commented Apr 19, 2023

I don't think FFI is a viable solution for us. Compiling custom C code for each operating system we target would require a massive amount of additional infrastructure overhead.

@ntkme
Contributor

ntkme commented Apr 19, 2023

As it would be problematic for our wider ecosystem, especially Flutter.

A synchronous I/O API in an event-driven system can always cause problems if used incorrectly. RawSynchronousSocket for normal socket connections has existed for a long time, and it would be subject to the same concern for Flutter. All I can find is a warning in the RawSynchronousSocket documentation describing how it is intended to be used. Therefore, I think the "potential problem" is not really an issue, as long as we have a similar warning in the documentation describing how this is intended to be used.

Another potential safeguard could be to limit the use of a synchronous port to non-main isolates, so that it at least prevents users from locking up the main isolate.

@lrhn
Member Author

lrhn commented Apr 20, 2023

Preventing blocking the main isolate only makes sense for something like Flutter.
For a command-line script, it would be annoying to not be able to do things in the main isolate.
So any prevention of blocking of the main isolate should probably be opt-in by the embedder.

Using a blocking operation is always dangerous in latency-critical contexts, but we have them all over dart:io already. A File.readAsStringSync call can also take a lot of time, and sleep(Duration(seconds: 1)) is guaranteed to block for as long as you want it to. So is while (stopwatch.elapsedMilliseconds < 1000) {}. Having such features is not the problem; they're just "use with care" functionality.

@mraleph
Member

mraleph commented Apr 20, 2023

@nex3 @ntkme

Let us take a step back. I understand that asking you to include C code in a project which is otherwise a pure Dart project is a tall order (especially because we don't even have proper integration with C build systems yet). Let's forget about the C code for a bit and just talk about synchronous communication between isolates.

The solution that @mkustermann is proposing boils down to the following: using dart:ffi you can establish a synchronous communication channel between two or more isolates (depending on your needs), which would be almost as efficient as SendPort.

Here is an example of a dispatcher-worker communication channel which should work on Linux and Mac OS X. We could also port the code to Windows.

Would something like this cover your needs? If so, I would be happy to collaborate with you to help you migrate off waitFor to this lower-level type of communication channel.

On the dart:ffi side there are a few things we could do to make things better (e.g. better support for converting pointers to typed data with finalizers, and support for automatically exiting the isolate when calling native functions - so that you can run more than 15 workers without deadlocking if needed), but I don't think any of these things are actually blocking. We can try to incorporate this into your code base and see how it performs. WDYT?

@ntkme
Contributor

ntkme commented Apr 20, 2023

The problem of adding FFI to a pure Dart project is not really whether we need to write C code or not. The real problem is that now we have to deal with cross-platform support ourselves rather than depending on the Dart VM to handle platform differences. For example, as you already pointed out, pthread only works out of the box on Unix-like OSes, so we would have to port the code to Windows.

Today, we test cross-platform support with CI/CD as we don't have any complex platform-specific logic, but once we get into the territory of FFI, that changes the whole landscape - we need local access to multiple platforms for development work and testing.

On the other hand, you can argue that we could separate the FFI logic and ship it as a separate package and at least keep our main package pure Dart… However, it does not make much difference when developers run into issues and need to debug FFI-related code.

@nex3
Member

nex3 commented Apr 20, 2023

If the Dart team provided and maintained a package that supported blocking cross-isolate communication, whether it was implemented with dart:ffi or not, I think that would satisfy our needs as long as it supported all the same platforms as Dart (as well as musl Linux, which we currently support using @ntkme's custom build of the SDK). Essentially this would mean that you'd take on the cross-platform testing and bug-bashing work @ntkme describes above. I don't think we have the resources or expertise to do it ourselves, though.

That said, I don't think that would really be a better solution than providing dart:isolate support for this. As @lrhn points out, it's not an especially higher risk for latency-sensitive applications than existing blocking operations.

Even a synchronous, non-blocking API would be helpful here—a way to poll a ReceivePort for queued messages a la select(2). In that case a blocking call would look more like while (!port.hasMessage) {} which is much more obviously going to lock up the isolate in question.
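
To illustrate the shape of such a polling API, a hypothetical sketch (neither member exists on ReceivePort today):

abstract class PollableReceivePort {
  /// Whether at least one message is currently queued on this port.
  bool get hasMessage;

  /// Removes and returns the oldest queued message, or returns null
  /// immediately if the queue is empty; never blocks.
  Object? tryNext();
}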

@mraleph
Member

mraleph commented Apr 21, 2023

If the Dart team provided and maintained a package that supported blocking cross-isolate communication

I propose we strike a middle ground here: I think we can certainly release and own a package which will provide portable low-level synchronisation primitives (mutexes and condition variables) that will work and be tested across all of Dart's officially supported platforms. However, Sass will have to build a suitable channel on top of this package, in the same way as you use Atomics.* to build a blocking communication channel on Node.js.

The reason I am proposing this is that I believe there are many different ways to use such a channel (e.g. a 1-1, 1-n, or n-n relationship between the sending and receiving sides) and consequently many different ways to build it. However, building an implementation for a concrete use case (e.g. Sass probably needs a 1-1 channel) is much simpler. I am happy to help you build the implementation - but it had better live in the Sass code base.
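
As a rough sketch of a 1-1 channel built on such primitives (Mutex and Condition are assumed names for the package's primitives, and a real cross-isolate channel would have to keep the queued data in native memory, e.g. a byte buffer allocated via package:ffi, since isolates cannot share Dart objects; this sketch only shows the locking discipline):

class SyncChannel {
  final Mutex _mutex;
  final Condition _notEmpty;
  // In a real implementation this queue would live in native memory.
  final List<Object?> _queue = [];

  SyncChannel(this._mutex, this._notEmpty);

  void send(Object? message) {
    _mutex.lock();
    _queue.add(message);
    _notEmpty.signal();
    _mutex.unlock();
  }

  Object? receive() {
    _mutex.lock();
    while (_queue.isEmpty) {
      _notEmpty.wait(_mutex); // Blocks the calling isolate until signalled.
    }
    final message = _queue.removeAt(0);
    _mutex.unlock();
    return message;
  }
}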

Based on the discussions here I now believe that we have a clear migration path for Sass, which means we are unblocked to move forward with the waitFor removal. Consequently, I went ahead and filed #52121 to get the ball rolling.

@modulovalue
Contributor

@mraleph Will it be possible to use the work on Lightweight Isolates & Faster isolate communication to make the solution that you have provided in this gist more efficient?

E.g. to simulate a synchronous Isolate.run:

// dart:isolate

/// ...
/// The result is sent using [exit], which means it's sent to this
/// isolate without copying.
/// ...
static Future<R> run<R>(FutureOr<R> computation(), {String? debugName}) ...

/// ...
/// The result is sent using [exit], which means it's sent to this
/// isolate without copying.
/// ...
static R runSync<R>(FutureOr<R> computation(), {String? debugName}) ...

From a usability point of view, I think that a way to unwrap a Future into a synchronous value, even if it blocks the current isolate, is very useful. I've tried to capture some of my reasoning for why I believe that here.

@nex3
Member

nex3 commented Apr 21, 2023

I propose we strike a middle ground here: I think we can certainly release and own a package which will provide portable low-level synchronisation primitives (mutexes and condition variables) that will work and be tested across all of Dart's officially supported platforms. However, Sass will have to build a suitable channel on top of this package, in the same way as you use Atomics.* to build a blocking communication channel on Node.js.

Mutexes and condition variables aren't enough—we also need a way to transfer data. But if the package includes that and we don't need to use dart:ffi directly ourselves, that should be sufficient for our use-case.

I'll emphasize again, though, that I think having built-in blocking Isolate API support is likely to be much more user-friendly and a better fit with existing Dart metaphors.

Based on the discussions here I now believe that we have a clear migration path for SASS which means we are unblocked to move forward with waitFor removal. Consequently, I went ahead and filed #52121 to get the ball rolling.

I'm not comfortable with declaring Sass unblocked on removing waitFor until we have another solution built and battle-tested by our users. In order to deploy this potential solution, we need to know that we can fall back on our existing infrastructure until we're confident that this is a robust solution that will work for all our users.
