Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collection manager clearInternal instantiates keys during deletion #5053

Closed
warner opened this issue Apr 9, 2022 · 0 comments · Fixed by #5641
Closed

collection manager clearInternal instantiates keys during deletion #5053

warner opened this issue Apr 9, 2022 · 0 comments · Fixed by #5641
Assignees
Labels
bug Something isn't working SwingSet package: SwingSet
Milestone

Comments

@warner
Copy link
Member

warner commented Apr 9, 2022

While working on stopVat(), I found that my unit test was failing to release all of the durable objects that it was supposed to. I tracked it down to a non-ideal behavior in the virtual collection manager.

My test's setup included a durable collection (Map aka "Store") with a durable object as a key:

function setup() {
  const dc3 = makeScalarBigMapStore('dc3', { durable: true });
  const dur23 = makeDurableFoo(..);
  dc3.init(dur23, value);
  const rem1 = Far('rem1', { get: () => dc3 });
  return rem1;
}

During stopVat(), we go through several phases of deleting references that aren't supposed to survive the upgrade, one of which walks slotToVal to basically find all the Presences and Representatives that were held by userspace RAM, and we pretend that their finalizers have been fired. This releases all the memory pins on virtual/durable objects and on imports. Then we do a big bringOurYourDead that should free all of the thus-released objects. During this pass, liveslots notices the virtual collection being dropped, and uses deleteCollection() on it. That uses clearInternal to delete all its entries. clearInternal uses keys() to iterate over all the keys, and calls deleteInternal(key) on each one:

    function clearInternal(isDeleting, keyPatt, valuePatt) {
      let doMoreGC = false;
      for (const k of keys(keyPatt, valuePatt)) {
        doMoreGC = doMoreGC || deleteInternal(k);
      }
      if (!hasWeakKeys && !isDeleting) {
        syscall.vatstoreSet(prefix('|entryCount'), '0');
      }
      return doMoreGC;
    }

keys() needs to instantiate (unserialize) each key object, because the normal userspace usage of it should return these objects (for strong maps, at least). Likewise delete() needs to take a key object (deleteInternal is just delete without the .size -= 1 update). keys() needs to extract the encoded key value from the kvStore key, and unserialize that into a userspace-visible key object. delete() takes the userspace-visible key object and encodes it into a kvStore key.

So the act of deleting the collection causes the (re)creation of a bunch of Representatives for any virtual objects that were used as keys.

In my stopVat case, this instantiation of dur23 occurs after we've already scanned slotToVal and manually triggered the finalizer for each. The cleanup code is left with a dur23 Representative in RAM, after the last bringOutYourDead has been called, so it never gets reaped, and the data for dur23 is not deleted as it should be. This was complicated by the virtual-object cache, which despite a size = 0 still holds on to the last Representative that was created. My test builds some dummy ones to help flush the cache, but userspace does not (must not) get control after stopVat begins, so there's no opportunity for my test to make a dummy object after the dur23 Representative was re-instantiated. Even if I managed an additional bringOutYourDead, that vref would be pinned by the RAM pillar of the Representative.

The three problems of this approach to clearInternal are:

  • if the virtual Map key was the only reference to a Remotable, it should be collected in the same BOYD as the collection itself, but we won't call gcAndFinalize, so we'll hold on to the Remotable (and anything it holds) for too long
  • speed: we're unserializing every key object only to drop it again a moment later
  • re-instantiating key objects during deletion make it hard for stopVat to delete everything

Faster Approach

clearInternal does not really need to instantiate the keys that it's about to delete: it only does that because keys() is a conveniently-available mechanism to iterate over the keys, and delete is a conveniently-available mechanism to delete an entry (which requires an instantiated key).

The better approach (albeit not as tidy) would use an internal flavor of keys that only returns the encoded kvStore key, and an internal flavor of delete which accepts an encoded key instead of a key object. This would avoid instantiating the object, and would be faster as well.

The overall refactoring would be something like:

  • dbKeys() -> iterator of encoded key strings
  • deleteInternal(dbKey): decrefs value slots, key slot
  • delete(key): computes dbKey, calls deleteInternal, updateEntryCount(-1)
  • clear: uses dbKeys and deleteInternal

Alternate stopVat approaches

I've explored ways to work around this re-instantiation behavior in stopVat(), but I'm not optimistic.

deleteCollection and clearInternal and deleteInternal all have a doMoreGC flag to report that they just dropped a Remotable, because that means a subsequent gcAndFinalize might yield some more drops (things held by their RAM pillar). Our scanForDeadObjects keeps looping around gcAndFinalize and processing the deadset until this flag comes back false (and possiblyDeadSet is empty).

But they don't know that instantiating key objects might also necessitate another round of bringOutYourDead: the flag is returned false unless specifically a Remotable was dropped.

So one fix might be to change deleteCollection to always return true. A less busy solution would be for deleteInternal to always return true (that would avoid the churn for empty collections; not very interesting). Even less busy would be if clearInternal examined k (the key object) and set doMoreGC if it was a Remotable or a Representative: that would fix the first problem (the missing gcAndFinalize loop), but not the other two.

The cache retention problem means that the Representative might be retained in RAM anyways, and stopVat really needs to clear out everything. My finalizeEverything tool does that (the one that scans slotToVal and triggers all the finalizers, to shake loose the Representatives for the key objects), but I need to know when to call it, especially how many times to call it. I currently call it once, followed by bringOutYourDead. But if bringOutYourDead causes deletion, and deletion creates new RAM objects, then I must call finalizeEverything again, and then bringOutYourDead again, and the only way to know when it is safe to stop is to watch for the possiblyDeadSet to be empty. stopVat() lives outside scanForDeadObjects, not inside, so it's not really in a good position to manage this.

@warner warner added bug Something isn't working SwingSet package: SwingSet labels Apr 9, 2022
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

refs #1848
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

refs #1848
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

refs #1848
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
@warner warner self-assigned this Apr 13, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Apr 15, 2022
@Tartuffo Tartuffo assigned FUDCo and unassigned warner Jun 9, 2022
@mergify mergify bot closed this as completed in #5641 Jul 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants