Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

efficient DB enumeration of virtual collections for stopVat-time deletion #5058

Open
warner opened this issue Apr 9, 2022 · 4 comments
Open
Labels
enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet

Comments

@warner
Copy link
Member

warner commented Apr 9, 2022

What is the Problem Being Solved?

As part of #1848 (in particular #5056) I need stopVat() to enumerate and delete all virtual collections. I couldn't find a good way to locate them all from just the DB, so I introduced an in-RAM Set with one vref per collection. stopVat() walks this set, checks each to see whether it's virtual or durable, and calls the existing deleteCollection() for the virtual ones. This has the nice property of decrementing refcounts and thus dropping imports and durables that were only referenced by the late collections. Also, the use of a Set makes it easy for deleteCollection() to ignore duplicate deletes, such as what happens when the following scanForDeadObjects sees zero refcounts for the collections and attempts to delete them a second time.

That set was a decent short-term fix, but a zillion virtual/durable collections causes O(N) RAM usage, which violates our RAM budget goals.

To fix this, we need a way to enumerate all the virtual collections from an efficient DB query. Each virtual collection has a collectionID (an integer, shared among collections of all kinds, some of which are virtual, others are durable). Each collection has a batch of keys like vc.${collectionID}.|${metadataKey}, which holds the collection size, a label, a nextOrdinal counter for the numbered-because-they're-references keys, and the vref-to-ordinal mapping for those keys. Those are joined by a O(collectionsize) batch of keys like vc.${collectionID}.${encodedKey} for the actual entries. In addition, each virtual object (of which a tiny subset are collections) has a refcount key in vom.${baseref}.

One way to enumerate all collections would be a vatstoreGetAfter() starting at vc., extract the collectionID from the first key it finds, then skip ahead to vc.${collectionID+1}. That's O(N) in the number of live collections, but feels like a horrible hack. Another would be to count up from collectionID = 1 to the current next-allocation value and probe each for a well-known metadata key like vc.${collectionID}.|label, however that's O(N) in the number of historically allocated collections (i.e. if we've created and deleted a lot of collections, we have to walk through all of the deleted ones too).

Neither approach helps us figure out the kind of the collection, which is what we need to determine whether it's virtual or durable. The collection DB keys don't record the kind or the durability status. Somewhere out there is a vref for the collection, in the form o+${kindIndex}/${collectionID}, and we can look up the Kind index in a table to find out about it's status (there are only a small number of Kinds, and we keep information about all of them in RAM). But we don't track a list of all instances of a given kind.

Given the current DB structure, the best approach I can think of is to start with a phase that filters out all the virtual Kinds from the table in RAM, then searches the vom.${baseref} section for refcount records that start with the o+${kindIndex}/ prefix for each one. That would give us a list of collections (of that Kind) which are referenced by any form of virtualized data. However the collection object might be held only by RAM, in which case there won't be a refcount record for it, so we'd have to somehow merge this information with the export-status records (es.${baseref}) as well as walking slotToVal to find the in-RAM pillars.

Description of the Design

So what I'm looking for is a DB structure that makes it cheap to find (and delete) all the virtual-object and virtual-collection records, without assistance from an in-RAM table whose cardinality equals the number of collections.

One simple starting point would be a section of valueless DB keys which only exist to help us find the collections, maybe a vcc.${collectionID} -> kindID for each one. We'd add the key upon creation of the collection, and delete it if/when the collection is deleted. To delete all the virtual collections, we'd walk that prefix (getting both virtual and durable collections), look up the kindID on each to see which type it is, and then delete just the virtuals.

We could avoid the cost of looking at durables by splitting the table into two pieces: vc-virtual.${collectionID} and vc-durable.${collectionID}. The kindID value is less interesting then, but probably useful for something else later.

But in the long run we want the kernel to be able to delete this data (efficiently) without help from the vat. To support that, I think we want to split the entire vatstore into virtual and durable subsets. Either we double the vatstore API to have a virtual get/set/has/delete/getAfter, or we add a durable flag to the existing calls. The kernel would use a different prefix to separate out the two categories, maybe ${vatID}.vs.d.${key} for the durables and ${vatID}.vs.v.${key} for the virtuals. The kernel could then delete all non-durable data with a single prefix. And if/when we finally manage to move to SQLite for the kernelDB, these could go into entirely separate tables, and a single super-efficient SQL operation would wipe out the whole thing. (Of course, if the kernel deletes the virtuals like this, it must do a mark-and-sweep on the durables to figure out which ones were fatally dereferenced by the virtuals, but we need that anyways to clear out everything properly).

Security Considerations

none I can think of

Test Plan

unit tests

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Apr 9, 2022
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 9, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
@warner
Copy link
Member Author

warner commented Apr 9, 2022

cc @FUDCo any ideas?

@FUDCo
Copy link
Contributor

FUDCo commented Apr 9, 2022

I don't think you need a data structure. The O(N) iteration you're afraid of is exactly what you need, because you have to iterate over all those keys to delete them anyway. You don't have to probe for the gaps in the collection ID space. If you start iterating from vc.1. but the first actual collection that exists is collectionID 427, then the immediate return from your first call to vatstoreGetAfter will have the key vc.427.${whatever}. You can then find the next collection by resuming iteration from vc.428..

With respect to finding out the kind and durability status of a collection, it would be very easy to add a vc.${collectionID}.|metadata key to each, and arguably we should have that anyway.

@FUDCo
Copy link
Contributor

FUDCo commented Apr 9, 2022

Actually, the iteration is slightly trickier because you will have to take into account that the DB iterates in lexicographic order rather than numeric order, but we don't actually care about the specific order the collections get cleared out in as long as we don't miss any.

warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
warner added a commit that referenced this issue Apr 12, 2022
This deletes most non-durable data during upgrade. stopVat() delegates
to a new function `releaseOldState()`, which makes an incomplete
effort to drop everything.

The portions which are complete are:

* find all locally-decided promises and rejects them
* find all exported Remotables and virtual objects, and abandons them
* simulate finalizers for all in-RAM Presences and Representatives
* use collectionManager to delete all virtual collections
* perform a bringOutYourDead to clean up resulting dead references

After that, `deleteVirtualObjectsWithoutDecref` walks the vatstore and
deletes the data from all virtual objects, without attempting to
decref the things they pointed to. This fails to release durables and
imports which were referenced by those virtual objects (e.g. cycles
that escaped the earlier purge).

Code is written, but not yet complete, to decref those objects
properly. A later update to this file will activate that (and update
the tests to confirm it works).

The new unit test constructs a large object graph and examines it
afterwards to make sure everything was deleted appropriately. The test
knows about the limitations of `deleteVirtualObjectsWithoutDecref`, as
well as bug #5053 which causes some other objects to be retained
incorrectly.

The collectionManager was changed to keep an in-RAM set of the vrefs
for all collections, both virtual and durable. We need the virtuals to
implement `deleteAllVirtualCollections` because there's no efficient
way to enumerate them from the vatstore entries, and the code is a lot
simpler if I just track all of them. We also need the Set to tolerate
duplicate deletion attempts: `deleteAllVirtualCollections` runs first,
but just afterwards a `bringOutYourDead` might notice a zero refcount
on a virtual collection and attempt to delete it a second time. We
cannot keep this Set in RAM: if we have a very large number of
collections, it violates our RAM budget, so we need to change our DB
structure to accomodate this need (#5058).

refs #1848
@warner
Copy link
Member Author

warner commented Apr 13, 2022

You can then find the next collection by resuming iteration from vc.428..

That's what I meant by the "horrible hack". The lexicographic ordering is weird but doable. The hackish part is that we aren't using vatstoreGetAfter to iterate through a dense cluster of keys (which is what it tries to optimize for), rather we have to interrupt the iteration after getting just a single result, and the start again from some different prefix so we skip over all the other collection entry keys in the middle. The loop that's finding collections is not the same loop that's deleting their keys, since we're delegate that inner loop to a clearInternal sort of function. Our loop wants to get exactly one key from each collection, doesn't matter which, just to learn the collectionID. The key it read might or might not get deleted during clearInternal (it might be a metadata key).

But yeah, I concur that we could figure out the list of collectionIDs by walking the existing keys. And also that we need a new key to get from the collectionID to the KindID/durability info. As we discussed offline, since we need a new key anyways, we could add it to a different section of the keyspace and enable this less hackish enumeration I'm thinking of.

@warner warner added the liveslots requires vat-upgrade to deploy changes label Jan 24, 2023
warner added a commit that referenced this issue Apr 5, 2023
This removes `deleteAllVirtualCollections`, and the in-RAM
`allCollectionObjIDs` Set of all collectionIDs which supported
it. Once upon a time, `stopVat` called `deleteAllVirtualCollections`
to delete the vatstore data for all non-durable virtual
collections. We removed responsibility for deleting virtual data from
the vat (and stopped calling `stopVat` altogether) in the interests of
reducing upgrade risk, and improving upgrade performance, despite the
fact that this will cause a DB storage leak during upgrade (we figure
we can find a way to fix that later, and disk is cheap).

Now that `deleteAllVirtualCollections` is gone, we don't need to spend
the RAM on each collection, which violated our high-cardinality design
rules anyways.

refs #5058
mergify bot pushed a commit that referenced this issue Apr 11, 2023
This removes `deleteAllVirtualCollections`, and the in-RAM
`allCollectionObjIDs` Set of all collectionIDs which supported
it. Once upon a time, `stopVat` called `deleteAllVirtualCollections`
to delete the vatstore data for all non-durable virtual
collections. We removed responsibility for deleting virtual data from
the vat (and stopped calling `stopVat` altogether) in the interests of
reducing upgrade risk, and improving upgrade performance, despite the
fact that this will cause a DB storage leak during upgrade (we figure
we can find a way to fix that later, and disk is cheap).

Now that `deleteAllVirtualCollections` is gone, we don't need to spend
the RAM on each collection, which violated our high-cardinality design
rules anyways.

refs #5058
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet
Projects
None yet
Development

No branches or pull requests

3 participants