Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

virtual collections: range queries, sort options, indices #2004

Closed
warner opened this issue Nov 10, 2020 · 9 comments
Closed

virtual collections: range queries, sort options, indices #2004

warner opened this issue Nov 10, 2020 · 9 comments
Assignees
Labels
enhancement New feature or request SwingSet package: SwingSet
Milestone

Comments

@warner
Copy link
Member

warner commented Nov 10, 2020

What is the Problem Being Solved?

In #1831 (comment) we made a plan to implement "range queries" and other collection-oriented things around virtual objects.

Description of the Design

Security Considerations

Test Plan

Potential Use Cases

ERTP proper

issuer.js had used a Set to protected against aliased payments being combined. A WeakStore with values undefined can be used instead, but a Set that understands virtual objects would be ideal.

Auction contract

In the Second Price Auction contract, we store the seats of the all the bidders until the end of the contract, at which time we find the highest and second highest bidder.

Barter Exchange

In the Barter Exchange contract, we store unmatched seats until there is a matching seat to trade with.

Simple Exchange

In the Simple Exchange contract, we store the unmatched seats as buyOrders or sellOrders until there is a matching seat to trade with

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Nov 10, 2020
@erights erights self-assigned this Feb 19, 2021
@warner
Copy link
Member Author

warner commented Feb 19, 2021

@FUDCo and I were talking about the MongoDB API and how it might apply to this. Their database is a collection of "documents" (JSON blobs), and the query language is a matching-predicate / sorting-comparison function encoded into a JSON-shaped object. There is some correspondence between the property hierarchy of the match/sort object and the properties being matched on the document, and the values in the match/sort object can express various equality/inequality requirements and ordering. Encoding this in an object limits the scope of the language compared to SQL (or CouchDB, which lets you write arbitrarily-complex map/reduce programs in a confined form of JS), which probably reduces the burdens on the backend.

@erights
Copy link
Member

erights commented Feb 19, 2021

@FUDCo

Encoding this in an object limits the scope of the language compared to ...

I would like to understand that as a limited language. Do you have a link?

@FUDCo
Copy link
Contributor

FUDCo commented Feb 19, 2021

Encoding this in an object limits the scope of the language compared to ...

I would like to understand that as a limited language. Do you have a link?

I'm not clear what @warner means by this. However, aside from going into the full exercise of trying to understand/explain all of MongoDB, I think a relevant starting point might be https://docs.mongodb.com/manual/tutorial/query-documents/

@warner
Copy link
Member Author

warner commented Feb 19, 2021

That looks like a good document.. I see an example of:

const cursor = db.collection('inventory').find({
  $or: [{ status: 'A' }, { qty: { $lt: 30 } }]
});

where they're using properties with special names ($or, $lt) to encode the "operators" of the query language.

Hm, now I wonder what it would look like if we just dropped in a special pseudo-vat that ran MongoDB, with a serialization mechanism that allowed kernel object references in the values, and had the client vats access it through regular messages.

@katelynsills
Copy link
Contributor

Added another use case to the issue description. issuer.js had used a Set to protect against the same payment being included twice:

if (payments.length > 1) {
const paymentSet = new Set();
payments.forEach(payment => {
if (paymentSet.has(payment)) {
throw new Error('same payment seen twice');
}
paymentSet.add(payment);
});
A set that understands virtual objects would be nice to have.

@FUDCo
Copy link
Contributor

FUDCo commented Dec 2, 2021

As implementation has progressed, it has reached the point where it's possible to construct a rough road map (and one that should become increasingly credible under maintenance as we go) from the present state of the code to something able to satisfy our MVP requirements (which are basically (1) support for high-cardinality collections (i.e., collections too big to fit in RAM) with sufficient query facilities to satisfy the needs of the AMM and treasury and (2) externalized data storage that can survive having its vat terminated and then resurrected with new code, to support contract code upgrade and data schema evolution).

Here is my crude plan as it currently stands:

  • Augment key schema validation logic to understand complex schemas (e.g., M.or(M.string(), M.number()))
  • Implement the various store types (e.g., Map vs. Set, Ephemeral vs. Persistent vs. Durable, Weak vs. Strong keys, etc.) on top of the underlying generic collection mechanism
  • Remove makeCollection from the vat global and replace it with the various store type makers.
  • Revise tests from using makeCollection to using the individual store types, and enhance tests to include store type-specific feature tests
  • Clean up residual crud leftover from MarkM's first pass at rank ordering composite keys
  • Change remotable keys so collection iteration reflects insertion order of remotables rather than vref order
  • Implement refcounting for remotables used as keys or stored inside values
  • Refactor common vref GC/refcount code so it can be shared by both collections and virtual objects
  • Implement GC for weak stores (basically, apply the already implemented weak collection GC machinery in virtualObjectManager.js to stores as well)
  • API for distinguishing ephemeral vs. virtual vs. durable
  • Implement exclusion of non-durable references from durable collections
    imports
  • Reify stores themselves as first class remotables so that references to them can be stored in collections, passed in messages, etc.
  • Replace legacy makeVirtualScalarWeakMap with makeScalarWeakBigMapStore in the various places that used it
  • Remove the various vat store makers and related identifiers from the vat global and replace with compartmented imports (or whatever the mechanism is that we finally settle on for making these APIs available to vat code) and update consuming code to match
  • Support vatStore syscalls (or quite possibly syscalls in general) during buildRootObject (tolerate vatstore syscalls during vat startup #2910)
  • Reify virtual object kinds so that kinds can be stored and so that a kind's data can be associated with different behavior at different times (this in support of upgrade) (decide+implement API for durable virtual objects/collections: makeDurableKind? #4495)
  • Implement the "baggage" mechanism to fully support termination, upgrade, and restart of a vat ("baggage": prepare for vat upgrade, deliver durable objects to successor  #4325)

(warner): We'll push these two out to a later ticket:

  • Possible optimization: replace covers with lists of covers, to reduce excessive database entry scanning in support of complex queries
  • Add support for composite keys (i.e., arrays, records, maps, and sets used as keys). This will most likely require migrating from the current string-keyed key-value database to a more structured database such as sqlite, along with a concomitant (and likely massive) code overhaul to replace the current cover-based query mechanism with something more sophisticated that makes direct use of the underlying database affordances. When we get to this point, this task will almost certainly unpack into a whole new project plan of its own, with its own issue or issues to track it, but I'm noting it here for completeness and continuity.

@warner
Copy link
Member Author

warner commented Jan 19, 2022

Let's limit this ticket to getting the virtual/durable collection API landed. I've added #4325 to cover the "baggage" mechanism, and we have #3062 for the Kind-reassociation API.

Before we close this ticket, make (and link to) a new one for the last two bullet points: "possible optimizations" and "support for composite keys".

@warner warner added this to the Mainnet: Phase 1 - RUN Protocol milestone Jan 19, 2022
@Tartuffo Tartuffo added MN-1 and removed MN-1 labels Feb 2, 2022
@Tartuffo Tartuffo removed this from the Mainnet: Phase 1 - RUN Protocol milestone Feb 8, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
@Tartuffo Tartuffo modified the milestones: Mainnet 1, RUN Protocol RC0 Apr 5, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Apr 6, 2022
@FUDCo
Copy link
Contributor

FUDCo commented Apr 26, 2022

@warner Given that the various subtasks we were tracking here have been closed (aside from a couple that explicitly flagged as for future issues to be filed), can we mark this as closed?

@FUDCo
Copy link
Contributor

FUDCo commented Apr 29, 2022

Closing, as the sub-tasks are basically all done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet
Projects
None yet
Development

No branches or pull requests

5 participants