Finish kernel-side retirement of C-list entries for resolved promises #2110

FUDCo · 2020-12-21T21:48:16Z

These changes implement step 3 of the plan laid out in #2065

warner

Great work. The main change I want to make is to remove the clist entries before giving the vat control, which will also cause a few of the tests to change slightly (expectRetirement). The secondary change would be to simplify/augment the unit tests to exercise the resolution cycles more directly (using a technique I didn't think of the first time through, so we'll have to write some new code together).

warner · 2021-01-04T22:56:38Z

packages/SwingSet/src/kernel/cleanup.js

+  }
+
+  scanKernelPromise(rootKPID, rootKernelData);
+  return Array.from(seen);


This got me thinking about determinism and how this list will be ordered, and whether we ought to consider sorting the seen array. I believe Array.from(set) will use the insertion order from the set, which means the depth-first traversal order of the resolved-promise graph (filtered by promises known to the vat). This should be consistent across replicas, and doesn't reveal any information that the vat isn't about to learn anyways. But in terms of a specification (and thinking about a non-JavaScript implementation of the kernel), is the insertion-order behavior sufficiently well-established that we could replicate it in something other than javascript?

My hunch is that we don't need to sort this, but I wanted to hear your thoughts.

Oh, this is also being used to build the resolutions object. And we need to think about the order in which the vat observes the resolutions it processes: the vat needs to hear about a new promise before it hears about its resolution. If it got the resolution first (for a promise it didn't know about), .. hm, I don't know what that would do, it might work, but I think all of our existing code expects to be told about a promise in one crank, and then get the fulfillment in a subsequent one.

In the new world, a single crank gets a batch of resolutions, but the vat is still processing those resolutions in a particular order, and the same expectations apply.

This might be a reason to change resolutions from an object (in which the vpids are the keys) into an array (of [vpid, rejectionFlag, resolutionData] 3-tuples), just to make it clear that there's an expected order in which they should be processed. I don't know that I'd want to rely upon JS's insertion order being carried all the way through (JSON serialization, separate processes, potentially multiple languages, etc).

Array.from(set) is actually Array.from(iterable) and relies on Set's iteration order for the array order. The iteration order of a Set (as well as the various other collection classes and collection-like things) is specified in ECMA-262 to be insertion order, which is the order we want here because it corresponds to graph traversal order, which also means dependency order, which I think is necessary for correct functioning. If you were to sort the array as a separate step, you'd need a comparator that could compare dependency order, which I don't see how to implement short of keeping an explicit record of the depth-first traversal order, which Set already does (you could, for example, accumulate the array as you go instead of doing an explicit sort at the end, but the scanning algorithm needs to do random-access lookups, which would require maintaining the Set also for queries, but if you do that you don't need the array due to the aforementioned Set iteration semantics).

I'd note that while depth-first graph traversal order yields a dependency ordering, depth-first order is unique while dependency order is not necessarily unique since dependency order is a lattice, so if you were to implement this in another language that didn't have something like Set with JavaScript's ordering rules, you might have to do the work yourself -- so we might need to explicitly specify that the intervat representation of a notification presents the promises in some defined dependency ordering like depth-first, even though that's just what the JavaScript path of least resistance delivered us for free.

My intuition now leans towards thinking the resolutions object should be changed to an array. (I actually had the intuition it should be an array at the beginning but I let you talk me out of it because my sense that an array was better was due to a sort of cargo-cult notion that all the other things we send are arrays so this one should be too; I wasn't really thinking about the importance of ordering -- and random-access is convenient.) As long as we are in the JavaScript universe, we are OK because JavaScript objects also obey the insertion-order-is-iteration-order rule, but once we encode things to JSON it's possible that a non-JavaScript JSON implementation might re-order things, since JSON objects are explicitly unordered (or, more pedantically, the JSON spec takes no position on whether or not a JSON implementation will impute meaning to the ordering of the properties in a JSON object) whereas JSON arrays are explicitly ordered.

Aye. I guess it's fine to have the order determined by the graph being traversed, that's plenty deterministic.

I suppose the one benefit from using an Object or Map or something else key-value-ish is the enforced property that each promise-id can only be included once, whereas with an Array a misbehaving function might produce one with duplicate keys. But I think the fact that ordering matters here should push us into preferring the Array despite that minor drawback.. using an Array makes it abundantly clear to the reader that the creator of resolutions has a specific order in mind, whereas relying upon JS's (well-specified but non-obvious) insertion ordering is not nearly as discoverable. Some auditor down the road will save time if we use the Array for this, and that's enough reason to do it.

warner · 2021-01-05T02:30:56Z

packages/SwingSet/src/kernel/kernel.js

@@ -472,7 +472,16 @@ export default function buildKernel(
      kernelKeeper.incStat(statNameForNotify(p.state));
      const kd = harden(['notify', kpid, p]);
      const vd = vat.translators.kernelDeliveryToVatDelivery(kd);
-      await deliverAndLogToVat(vatID, kd, vd);
+      if (vd) {
+        await deliverAndLogToVat(vatID, kd, vd);


I think we should delete the clist entries here: after translation but before calling deliverAndLogToVat, because otherwise a vat which attempts to reference the vrefs that shall not be uttered will succeed, when that ought to be a fatal error. We don't have to trigger the GC walk of what the perhaps-now-entirely-unreferenced promises pointed to, but I think we need to at least delete the clist entries before giving control to the vat.

It might also make sense to augment kernelDeliveryToVatDelivery somehow, to return a list of kpids rather than digging the vpids out of the vatDelivery object and then translating them back into krefs. Or maybe vatKeeper.deleteCListEntry could take just one kind and look up the matching one on its own. OT3H our vat-side nondeterministic GC code might introduce unidirectional clist entries (deleting one direction but leaving the other in place until some further event happens), so maybe it's better to not introduce an assumption that both entries are always present.

Yah, the re-lookup of the kpid in the vatKeeper's vat-to-kernel mapping table kind of bugged me too, but it seemed like a lot of work to get rid of it so I didn't bother. But if it's done through a lookup like this, it strikes me that once we are properly handling the cleanup on the liveslots side, the code here simply won't work for deleting the clist entries after calling deliverAndLogToVat because after it is done, the requisite mapping table entries will be gone and the mapVatSlotToKernelSlot lookups will fail.

yeah, as discussed in the meeting we should change vatKeeper to have a method likedeleteCListsForKpids(kpids) or something, to allow the translation to take place as close as possible to the data it needs

As I've been working through this change, I discovered that the notion of deleting the clist entries after translation but before delivery will not work given the current implementation of liveslots, because during the processing of the notify, the vat generates ephemeral subscribe requests that mention promises that are going away. If the rule is "never speak of this again after this crank", then deleting the clist entries after the notify delivery (i.e., the way it works now) is fine, but if we want to have the vpids immediately become unmentionable, we'll have to devise a different way to manage these ephemeral subscribe syscalls. I'm inclined to want to do that, since these subscribes are kinda stupid, but the mechanism is not obvious.

I think we worked this out, by having liveslots keep track of which promises were introduced during deserialization and which are being resolved by the crank, and only subscribing to the remainder. So the vat should never be subscribing to vpids that are introduced-and-resolved.

warner · 2021-01-05T02:51:25Z

packages/SwingSet/src/kernel/vatTranslator.js

@@ -79,12 +79,27 @@ function makeTranslateKernelDeliveryToVatDelivery(vatID, kernelKeeper) {

  function translateNotify(kpid, kp) {


I'm vaguely unsettled that "translate" now changes the shape of the thing so completely. I wonder if it would be tidyier to build a kernelDelivery(type=notify) with a full resolutions batch (all using kpids), by crawling the Promise graph and checking kpids against the target vat's clist (for the "do we need to tell them now" question), and then give translateNotify the one job of converting all the krefs to vrefs.

It would be kind of silly, in that we'd be checking the same vat's clist twice (once from the kref side and again in the translation function), but it would maybe make things easier to read. And the first clist check would only be looking for presence-vs-absence ("is this kpid in this vat's clist?"), while the second check is the one that actually translates krefs to vrefs.

Thoughts?

That would also remove the slightly-squirrly double "do nothing" pathways in kernel.js processNotify(), where it bails out either because 1: the target vat is dead, or 2: the translator discovered that the target vat no longer cares. If we rewrote this as above, processNotify would learn that resolutions.length === 0 before calling the translator, so the translator would no longer have a null return case.

One of the things I discovered when I was writing the slogulator was that it turns out that lots of these translations change the shape of the thing being translated. At first it bugged me and I was going to "fix" them, but the phenomenon was sufficiently widespread that changing the situation proved difficult and pointless and so I gave up. So adding one more example of the phenomenon didn't seem to me to be incrementally worse. The notify which does nothing (because the thing it was going to notify already has been) is a little whacky, but I don't see that it's a problem per se that needs to be fixed.

I like the shape we came up with in the meeting, let's see how that feels once it gets implemented. I kind of wish the translators were less mutating, but of course their primary purpose is to add entries to the clist when necessary. I do like the thing we came up with where the translator might add an entry, but never deletes them on its own; rather the non-vat-specific kernel code which creates the delivery object is responsible for any deletion that needs to happen afterwards.

warner · 2021-01-05T15:57:42Z

packages/SwingSet/src/vats/comms/dispatch.js

    }
+    resolveFromKernel(promiseID, resolution);
    // XXX question: do we need to call retirePromiseIDIfEasy (or some special


I think this will be ok for now: if the kernel retires the vpid as it resolves the promise into the comms vat, but the comms vat doesn't forget the vpid the way it's supposed to, and then later somebody on the far end of the wire re-mentions that same promise, the comms vat will re-introduce it to the kernel. That will re-export the promise.

The main problem I can imagine is if the comms vat doesn't realize that the kernel expects a resolution for the new copy of the old promise. We'll be doing a pass through the comms vat before this whole task is done, and I think we'll fix any problems during that pass.

warner · 2021-01-05T15:59:09Z

packages/SwingSet/src/vats/comms/dispatch.js

+    // get a bogus dispatch.notify. Currently we throw an error, which is
+    // currently ignored but might prompt a vat shutdown in the future.
+
+    // TODO: The following goofiness, namely taking apart the capdata object


during our upcoming "make the comms protocol support batched resolutions too" task (or however we do it), this goofiness might be replaced with something less goofy

warner · 2021-01-05T17:00:01Z

packages/SwingSet/test/test-vpid-kernel.js

@@ -595,7 +578,7 @@ async function doTest4567(t, which, mode) {
    t.is(clistKernelToVat(kernel, vatA, p1kernel), undefined);
    t.is(clistVatToKernel(kernel, vatA, p1VatA), undefined);
  } else {
-    t.is(inCList(kernel, vatA, p1kernel, p1VatA), true);
+    t.is(inCList(kernel, vatA, p1kernel, p1VatA), false);


This seems weird.. if the item should be missing from the kernel clist in both expectRetirement and !expectRetirement cases, then let's move this clause out of the if block, rather than making the same test in both arms. But how is it that inCList() is expected to return false at the same time that clistKernelToVat/clistVatToKernel are expected to return values? I don't see how that could ever pass, which makes me think that we're no longer exercising the !expectRetirement case at all (which is what we want in the long run: we should retire every resolved promise, so sooner or later we should remove support for testing the non-retiring case).

Right, expectRetirement is now unconditionally true, so this second clause isn't being exercised, so the mismatch is not causing a failure.

Let's remove the expectRetirement flag entirely, and delete else arm.

The expectRetirement flag is a vestigial leftover from when we were first implementing promise retirement stuff, so that we had the option to run with and without promise retirement (without breaking the tests) until we were satisfied that it was solid. Now that we've firmly embraced it, I think it can be safely discarded.

should we discard it now? or do you have a followup PR to do that?

warner · 2021-01-05T17:01:03Z

packages/SwingSet/test/test-vpid-kernel.js

@@ -575,7 +558,7 @@ async function doTest4567(t, which, mode) {
  };
  onDispatchCallback = function odc1(d) {
    t.deepEqual(d, resolutionOf(p1VatA, mode, targetsA));
-    t.is(inCList(kernel, vatA, p1kernel, p1VatA), !expectRetirement);
+    t.is(inCList(kernel, vatA, p1kernel, p1VatA), expectRetirement);


This seems weird, and I suspect that expectRetirement is no longer being exercised correctly. If we expect the item to be retired, why would we assert that it's still in the clist?

Oh, right, this is related to my recommendation that we remove the clist entry before giving control to the vat. Since the current branch is not doing that, this callback (which runs while the vat has control) will observe the clist entry to be present (independent of expectRetirement). Then, a few lines later, after the dispatch has completed, we test the clist again, and expect the entry to be removed.

To match the behavior of the code, this line should change to use true instead of expectRetirement.

If we can remove the clist entry before giving control to the vat, then this test should change to false instead of expectRetirement.

warner · 2021-01-05T17:42:14Z

packages/SwingSet/test/test-vpid-liveslots.js

+    data: capargs([slot0arg], [pbA]),
+  });
+  t.deepEqual(log, []);
+


I'm guessing this is incomplete.. let's talk this afternoon about what needs to be exercised here.

same question, did we write a more complete test for this? I remember we walked through some code, just want to make sure that effort didn't get lost.

warner · 2021-01-05T17:55:28Z

packages/SwingSet/test/test-vpid-kernel.js

@@ -608,3 +591,240 @@ for (const caseNum of [4, 5, 6, 7]) {
    });
  }
 }
+
+test(`kernel vpid handling crossing resolutions`, async t => {


I think we can simplify this to use just a single vat, and build promise data up in the kernel directly (rather than using a separate vat to construct it). Let's walk through it today.

did we write up a simpler test to exercise this?

warner

Looks good, a few minor comment suggestions. I would like to make sure we aren't losing the new/simpler tests we walked through, though.

packages/SwingSet/src/kernel/vatManager/deliver.js

warner · 2021-01-18T09:23:38Z

packages/SwingSet/test/test-vpid-kernel.js

@@ -595,7 +578,7 @@ async function doTest4567(t, which, mode) {
    t.is(clistKernelToVat(kernel, vatA, p1kernel), undefined);
    t.is(clistVatToKernel(kernel, vatA, p1VatA), undefined);
  } else {
-    t.is(inCList(kernel, vatA, p1kernel, p1VatA), true);
+    t.is(inCList(kernel, vatA, p1kernel, p1VatA), false);


should we discard it now? or do you have a followup PR to do that?

warner · 2021-01-18T09:24:26Z

packages/SwingSet/test/test-vpid-kernel.js

@@ -608,3 +591,240 @@ for (const caseNum of [4, 5, 6, 7]) {
    });
  }
 }
+
+test(`kernel vpid handling crossing resolutions`, async t => {


did we write up a simpler test to exercise this?

warner · 2021-01-18T09:25:36Z

packages/SwingSet/test/test-vpid-liveslots.js

+    data: capargs([slot0arg], [pbA]),
+  });
+  t.deepEqual(log, []);
+


same question, did we write a more complete test for this? I remember we walked through some code, just want to make sure that effort didn't get lost.

FUDCo · 2021-01-22T09:11:42Z

Closes #2065

FUDCo added enhancement New feature or request SwingSet package: SwingSet labels Dec 21, 2020

FUDCo requested a review from warner December 21, 2020 21:48

FUDCo self-assigned this Dec 21, 2020

FUDCo force-pushed the promise-gc-phase2-step3 branch from 8901c6b to 49cd39d Compare December 22, 2020 23:40

FUDCo marked this pull request as draft December 22, 2020 23:41

FUDCo force-pushed the promise-gc-phase2-step3 branch from 49cd39d to f6c17af Compare December 22, 2020 23:42

FUDCo marked this pull request as ready for review December 24, 2020 08:15

FUDCo marked this pull request as draft December 24, 2020 08:15

FUDCo force-pushed the promise-gc-phase2-step3 branch from 6bfde2e to 5394eeb Compare December 24, 2020 08:16

warner requested changes Jan 5, 2021

View reviewed changes

This was referenced Jan 8, 2021

Garbage collect resolved promises #1124

Closed

marshal: add context argument to valToSlot/slotToVal #2171

Open

FUDCo force-pushed the promise-gc-phase2-step3 branch 2 times, most recently from a031034 to f3d0c76 Compare January 12, 2021 00:01

FUDCo marked this pull request as ready for review January 12, 2021 23:13

FUDCo requested a review from warner January 12, 2021 23:15

warner approved these changes Jan 18, 2021

View reviewed changes

FUDCo added 2 commits January 21, 2021 17:07

feat: retire C-list entries for resolved promises

872aff7

test: add tests for cross referencing promises across vats

714dae4

FUDCo force-pushed the promise-gc-phase2-step3 branch from f3d0c76 to 29e1012 Compare January 22, 2021 01:29

feat: refactor notification and subscription

0627792

FUDCo force-pushed the promise-gc-phase2-step3 branch from 29e1012 to 0627792 Compare January 22, 2021 01:37

FUDCo merged commit dd5f7f7 into master Jan 22, 2021

FUDCo deleted the promise-gc-phase2-step3 branch January 22, 2021 01:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Finish kernel-side retirement of C-list entries for resolved promises #2110

Finish kernel-side retirement of C-list entries for resolved promises #2110

FUDCo commented Dec 21, 2020

warner left a comment

warner Jan 4, 2021

warner Jan 5, 2021

FUDCo Jan 5, 2021 •

edited

Loading

warner Jan 6, 2021

warner Jan 5, 2021

FUDCo Jan 5, 2021

warner Jan 6, 2021

FUDCo Jan 6, 2021 •

edited

Loading

warner Jan 18, 2021

warner Jan 5, 2021

warner Jan 5, 2021

FUDCo Jan 5, 2021

warner Jan 6, 2021

warner Jan 5, 2021

warner Jan 5, 2021

warner Jan 5, 2021

warner Jan 5, 2021

FUDCo Jan 5, 2021

warner Jan 18, 2021

warner Jan 5, 2021

warner Jan 5, 2021

warner Jan 5, 2021

warner Jan 18, 2021

warner Jan 5, 2021

warner Jan 18, 2021

warner left a comment

warner Jan 18, 2021

warner Jan 18, 2021

warner Jan 18, 2021

FUDCo commented Jan 22, 2021

		@@ -79,12 +79,27 @@ function makeTranslateKernelDeliveryToVatDelivery(vatID, kernelKeeper) {

		function translateNotify(kpid, kp) {

Finish kernel-side retirement of C-list entries for resolved promises #2110

Finish kernel-side retirement of C-list entries for resolved promises #2110

Conversation

FUDCo commented Dec 21, 2020

warner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

FUDCo Jan 5, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

FUDCo Jan 6, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

warner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

FUDCo commented Jan 22, 2021

FUDCo Jan 5, 2021 •

edited

Loading

FUDCo Jan 6, 2021 •

edited

Loading