Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate a "send-and-exit" functionality in the VM for sending isolate messages #37835

Closed
2 of 3 tasks
mkustermann opened this issue Aug 13, 2019 · 7 comments
Closed
2 of 3 tasks
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate

Comments

@mkustermann
Copy link
Member

mkustermann commented Aug 13, 2019

We would like to have a atomic send-and-exit functionality, which can send an object to a port and exit the isolate at the same time.

Until we have a shared heap, this can be implemented via merging of heap pages from the sender into the receiver. See e.g. cl/112900 for the basic functionality.

Once the heap is shared, this will be a NOP.

Missing items:

  • Ensure no Dart code is invoked after the verification pass was exectued
  • Possibly make validation procedure check for safepoints (to avoid very long safepointing latencies)
  • Have extensive Dart tests
@mkustermann mkustermann added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Aug 13, 2019
@lrhn
Copy link
Member

lrhn commented Aug 14, 2019

We have a kill operation on isolates, so you can always kill yourself by Isolate.current.kill(priority: Isolate.immediate). That sets a precedent for killing isolates in the middle of computation. (So does Process.exit, and killing the process when the main isolate terminates).

Any "exit immediately" functionality is potentially dangerous, there might be finally blocks that are not executed. That could mean system resources that are not released properly. If we kill the entire process, then the operating system should handle things, but for individual isolates, we may have to do something ourselves.

Do we make any records of isolate resources to be cleaned up at isolate exit? If we merge heaps, the we might have to defer this clean-up until the merged isolate ends.

@aam
Copy link
Contributor

aam commented Aug 14, 2019

Do we make any records of isolate resources to be cleaned up at isolate exit? If we merge heaps, the we might have to defer this clean-up until the merged isolate ends.

Yes, Dart_WeakPersistentHandle for example. They will migrate to new isolate and will be cleaned up when this new owner, receiving isolate dies.

One problematic area @rmacnak-google pointed out is that embedders might rely on the fact that finalizers for those handles are executed on same thread that they were originally created on, same one used for the original "send-and-exit" isolate(OpenGL/graphics thread-sensitive resources of some sort, for example). They won't be happy if finalizers are called on potentially different thread associated with receiver's isolate.

@mkustermann
Copy link
Member Author

@lrhn The VM can always end stop an isolate without running finalizers due to various reasons. If we don't cleanup resources in such cases (which might be the case, e.g. for sockets), then it's an existing bug already.

If invoking the finalizers for weak persistent handles on different thread is an issue, then this is not a send-and-exit specific issue, but exists already: A garbage collection can happen on any thread (e.g. BG compiler thread doing a scavenge). The persistent handle finalizers are being invoked on the same thread that does GC, therefore we already have this "problem" (if it is one), right?

dart-bot pushed a commit that referenced this issue Jan 6, 2020
…parate table.

This is to support sharing of Field objects between isolate group's isolates.

Each isolate will have it's own field table with static instances/values.
field_table is stored on isolate, copied over to thread for easier access.

This also removes the write barrier from static field assignments; the field table is a GC root.

Bug: #37835
Bug: #36097
Change-Id: I232a86f059ac4cacc3e9f54bcc31d1d0524f9496
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/127582
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
dart-bot pushed a commit that referenced this issue Apr 4, 2020
…chable calls

Bug: #37835
Bug: #36097
Change-Id: I0198fd0328945b04e4f2254bacac25b41038e78c
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/138361
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
dart-bot pushed a commit that referenced this issue Apr 12, 2020
Speed up is achieved by sharing most of the dart code, object store
and class table between isolates in single isolate group. So
instead of bootstrapping isolate from the snapshot, isolate is
initialized by setting pointers to existing data structures already
set up for first isolate, and only few isolate-specific structures (moved
to newly introducted isolate_object_store) are created.

To allow for safe cross-isolate switchable call site, type test cache
mutations additional synchronization via RunWithStoppedMutators(that
relies on safepoints) was added.
Besides switchable call sites, no other mutation to the dart code is
done in AOT, which allows such sharing.

Bug: #37835
Bug: #36097
Change-Id: I64c86525f4ef9cb30567a49a106bfe700355942b
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/136780
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
dart-bot pushed a commit that referenced this issue Apr 17, 2020
This reverts commit 922ea3e in patchset 1, fix for assertion triggered in https://ci.chromium.org/b/8883214567628884960 in patchset 2, fix for deadlock around symbols table mutex in patchset 4.

Original commit description:

Speed up is achieved by sharing most of the dart code, object store
and class table between isolates in single isolate group. So
instead of bootstrapping isolate from the snapshot, isolate is
initialized by setting pointers to existing data structures already
set up for first isolate, and only few isolate-specific structures (moved
to newly introducted isolate_object_store) are created.

To allow for safe cross-isolate switchable call site, type test cache
mutations additional synchronization via RunWithStoppedMutators(that
relies on safepoints) was added.
Besides switchable call sites, no other mutation to the dart code is
done in AOT, which allows such sharing.

Bug: #37835
Bug: #36097
Change-Id: I655e337198214c9dfacbe76f7852b941b5a7e910
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/143462
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
dart-bot pushed a commit that referenced this issue Apr 22, 2020
sendAndExit allows for fast data passing from worker isolate back to
parent.

```
                                              | linux x64  | spawnIsolate | sendAndExit |
                                              |us per iter | over sync    | over send   |
                                              +------------+--------------+-------------+
IsolateJson.Decode50KBx1(RunTime):               43,175.000   339.83%
IsolateJson.SendAndExit_Decode50KBx1(RunTime):   22,070.000   124.83%        -48.88%
IsolateJson.SyncDecode50KBx1(RunTime):            9,816.284

IsolateJson.Decode50KBx4(RunTime):               77,630.000   104.56%
IsolateJson.SendAndExit_Decode50KBx4(RunTime):   46,307.000   22.02%         -40.35%
IsolateJson.SyncDecode50KBx4(RunTime):           37,949.528

IsolateJson.Decode100KBx1(RunTime):              71,035.000   270.42%
IsolateJson.SendAndExit_Decode100KBx1(RunTime):  43,056.000   124.52%        -39.39%
IsolateJson.SyncDecode100KBx1(RunTime):          19,176.733

IsolateJson.Decode100KBx4(RunTime):             120,915.000   54.66%
IsolateJson.SendAndExit_Decode100KBx4(RunTime):  67,101.000  -14.17%         -44.51%
IsolateJson.SyncDecode100KBx4(RunTime):          78,179.731

IsolateJson.Decode250KBx1(RunTime):             173,574.000  202.52%
IsolateJson.SendAndExit_Decode250KBx1(RunTime): 103,334.000   80.10%         -40.47%
IsolateJson.SyncDecode250KBx1(RunTime):          57,375.314

IsolateJson.Decode250KBx4(RunTime):             292,118.000   20.30%
IsolateJson.SendAndExit_Decode250KBx4(RunTime): 168,444.000  -30.63%         -42.34%
IsolateJson.SyncDecode250KBx4(RunTime):         242,831.000

IsolateJson.Decode1MBx1(RunTime):               631,578.000  166.34%
IsolateJson.SendAndExit_Decode1MBx1(RunTime):   371,127.000   56.50%         -41.24%
IsolateJson.SyncDecode1MBx1(RunTime):           237,135.778

IsolateJson.Decode1MBx4(RunTime):             1,322,789.000   36.16%
IsolateJson.SendAndExit_Decode1MBx4(RunTime):   657,179.000  -32.35%         -50.32%
IsolateJson.SyncDecode1MBx4(RunTime):           971,473.333

```

Bug: #37835
Bug: #36097
Change-Id: I386641e1431ed9f2e34fac36f562607a666ee4a8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/142823
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020
…chable calls

Bug: dart-lang#37835
Bug: dart-lang#36097
Change-Id: I0198fd0328945b04e4f2254bacac25b41038e78c
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/138361
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020
Speed up is achieved by sharing most of the dart code, object store
and class table between isolates in single isolate group. So
instead of bootstrapping isolate from the snapshot, isolate is
initialized by setting pointers to existing data structures already
set up for first isolate, and only few isolate-specific structures (moved
to newly introducted isolate_object_store) are created.

To allow for safe cross-isolate switchable call site, type test cache
mutations additional synchronization via RunWithStoppedMutators(that
relies on safepoints) was added.
Besides switchable call sites, no other mutation to the dart code is
done in AOT, which allows such sharing.

Bug: dart-lang#37835
Bug: dart-lang#36097
Change-Id: I64c86525f4ef9cb30567a49a106bfe700355942b
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/136780
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020
This reverts commit 922ea3e in patchset 1, fix for assertion triggered in https://ci.chromium.org/b/8883214567628884960 in patchset 2, fix for deadlock around symbols table mutex in patchset 4.

Original commit description:

Speed up is achieved by sharing most of the dart code, object store
and class table between isolates in single isolate group. So
instead of bootstrapping isolate from the snapshot, isolate is
initialized by setting pointers to existing data structures already
set up for first isolate, and only few isolate-specific structures (moved
to newly introducted isolate_object_store) are created.

To allow for safe cross-isolate switchable call site, type test cache
mutations additional synchronization via RunWithStoppedMutators(that
relies on safepoints) was added.
Besides switchable call sites, no other mutation to the dart code is
done in AOT, which allows such sharing.

Bug: dart-lang#37835
Bug: dart-lang#36097
Change-Id: I655e337198214c9dfacbe76f7852b941b5a7e910
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/143462
Commit-Queue: Alexander Aprelev <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
@aam
Copy link
Contributor

aam commented Feb 19, 2021

sendAndExit was added to dart:_internal https://github.com/dart-lang/sdk/blob/master/sdk/lib/_internal/vm/lib/internal_patch.dart#L171 and is being tested and benchmarked:

 runtime/tests/vm/dart_2/sendandexit_test.dart
 runtime/tests/vm/dart_2/isolates/ring_gc_sendAndExit_test.dart

Is there anything that prevents us from moving this function to dart:isolate?

@mkustermann
Copy link
Member Author

Is there anything that prevents us from moving this function to dart:isolate?

Yes: The send-and-exit functionality only makes sense with --enable-isolate-groups, so we have to make that the default first, which requires us to finish the JIT support. Afterwards we'll have to look into what things we'd like to be able to send with send-and-exit, possibly more than we currently allow, and discuss this also with library team.

@mkustermann
Copy link
Member Author

mkustermann commented Oct 18, 2021

@aam Before closing this issue, we should - as discussed in the isolate sync today - ensure the verification pass in the send-and-exit has frequent safepoint checks in it - otherwise we might end up blocking other threads that want to safepoint for very long time (x00 ms).

Also: Disallow execution of dart code after send-and-exit: After c65556c landed, no more embedder API calls can execute dart code, but if there's dart<->C<->dart<->send-and-exit we might still end up executing dart code if the C code swallows unwind errors. Maybe we can do something about this.

@mkustermann
Copy link
Member Author

Have filed a different issue now for the safepoint-checkins with a reproduction: #49050

@aam aam closed this as completed Dec 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate
Projects
None yet
Development

No branches or pull requests

3 participants