Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

increase swingset DEFAULT_VIRTUAL_OBJECT_CACHE_SIZE #2644

Open
warner opened this issue Mar 15, 2021 · 6 comments
Open

increase swingset DEFAULT_VIRTUAL_OBJECT_CACHE_SIZE #2644

warner opened this issue Mar 15, 2021 · 6 comments
Assignees
Labels
enhancement New feature or request SwingSet package: SwingSet vaults_triage DO NOT USE

Comments

@warner
Copy link
Member

warner commented Mar 15, 2021

What is the Problem Being Solved?

To shake out bugs, liveSlots.js currently defines the virtual object cache size to 3:

const DEFAULT_VIRTUAL_OBJECT_CACHE_SIZE = 3; // XXX ridiculously small value to force churn for testing

At some point we need to raise that.

When?

We need to exercise this feature enough to feel like we've had a chance to see any lingering bugs, like with #2526.

We want to increase this some time before mainnet makes things harder to upgrade.

To What?

How much RAM should we dedicate to the virtual object cache?

We should figure out how to run some tests with a large (1M) number of Purses, set the cache size to various values, and see how that affects the RAM usage, and also performance. We should measure the number of DB reads/writes, and the wallclock time it takes to run some Purse-interacting benchmark.

My hunch is that a cache size of somewhere between 1k and 10k is plausible, but I'd rather trust the data than my instincts.

cc @FUDCo

@warner
Copy link
Member Author

warner commented Jan 26, 2022

To make this decision, we need:

  • store Zoe state in durable/virtual objects/collections #4383 (zoe uses virtual objects)
  • ? (contracts use virtual objects)
  • then we need performance data from a chain running some meaningful amount of load, specifically:
    • how many syscalls were made by zoe+contract vats
    • the RSS (heap) size for those vats
    • overall wallclock time spent by those vats
    • overall CPU time consumed by those vats, or the chain overall

If the cache size is too low, I'd expect to see a lot of syscalls (paging the vobjs in and out), increased wallclock time (syscalls use blocking RPC over the kernel-worker pipe), and maybe increased CPU time. If the cache size is too high, I'd expect to see the RSS (heap) size of the zoe+contract vats become large, and we have a hard limit that we need to keep well below.

Also note that DEFAULT_VIRTUAL_OBJECT_CACHE_SIZE counts objects, but doesn't know how much data is associated with each one. If a particular vat uses very large objects, we'd want a smaller cache size, but I don't think that's very likely. I think we're more likely to see a very large number of small objects.

@Tartuffo Tartuffo removed the MN-1 label Feb 7, 2022
@warner
Copy link
Member Author

warner commented Feb 9, 2022

I'll assign this to @FUDCo because he wrote the cache, but it will take a village to get the data he needs to make a decision.

@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
@Tartuffo Tartuffo modified the milestones: Mainnet 1, RUN Protocol RC0 Apr 5, 2022
@Tartuffo Tartuffo assigned mhofman and arirubinstein and unassigned FUDCo Apr 14, 2022
@Tartuffo
Copy link
Contributor

@warner @mhofman @arirubinstein should this be done in loadgen, or in @arirubinstein's test bed?

@mhofman
Copy link
Member

mhofman commented Apr 14, 2022

We could probably run the a given revision through the loadgen, vary the cache size and observe impact on performance & memory consumption. Main issue is visualizing this in the case of the loadgen, as that's still subpar.

@warner
Copy link
Member Author

warner commented Aug 19, 2022

We like having a small value for testing, because problems happen faster and it's easier to correlate userspace behavior with syscalls. For the BOYD frequency, we left swingset's default at 1, with an expectation that host applications will set it to something sensible. We can do the same here: swingset retains the default at 1 or 0 or whatever is best for development purposes, but cosmic-swingset sets it to something larger.

So the task remains to choose a number for cosmic-swingset to use, but the change will be made in cosmic-swingset, not in swingset.

@Tartuffo Tartuffo removed this from the Mainnet 1 RC0 milestone Sep 21, 2022
@ivanlei ivanlei added the vaults_triage DO NOT USE label Jan 17, 2023
@warner
Copy link
Member Author

warner commented Jan 24, 2023

Note that if #6693 removes the cache entirely, or switches to holding everything until end-of-crank (and then flushing at end-of-crank), this ticket becomes obsolete.

@ivanlei ivanlei removed the MUST-HAVE label Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet vaults_triage DO NOT USE
Projects
None yet
Development

No branches or pull requests

6 participants