Performance impact of BOYD #4160
Comments
Maybe a good ticket for our new hire.
FYI @arirubinstein
We could probably get a lot of benefit from some parameter tuning; it's probably worth doing that for MN-1, with further fine-tuning controls for MN-1.1.
@FUDCo and I figure that we should set this to something like 1 BOYD per 1000 deliveries for now. The risk is allowing one vat (which doesn't get constant traffic) to hold onto objects that keep other vats from dropping things. Chip points out that this is only really a problem if it impedes operations. We think we could build a follower node that periodically makes a separate copy of the kernel state, detaches that copy from the chain, sends a BOYD to all vats, and measures the change in the kernel object table size. This would let us keep track of how much garbage is "floating" on the chain, waiting for a BOYD, and that might tell us either (1) it's not that much and we don't need to worry about it, or (2) we should do something at the next convenient upgrade point.
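A minimal sketch of that measurement loop, purely illustrative: swingset exposes no such public API today, so all three helpers are taken as parameters rather than real calls.

```js
// Sketch of the follower-node measurement described above. The three
// helpers are parameters because swingset has no such public API; a
// real implementation would supply them from kernel internals.
async function measureFloatingGarbage({
  snapshotKernelState, // copies kernel state, detached from the chain
  sendBOYDToAllVats, // delivers dispatch.bringOutYourDead to every vat
  countKernelObjects, // reports the size of the kernel object table
}) {
  const copy = await snapshotKernelState();
  const before = countKernelObjects(copy);
  await sendBOYDToAllVats(copy);
  const after = countKernelObjects(copy);
  // Objects that disappear were only retained for lack of a BOYD; this
  // difference is the "floating garbage" we'd track over time.
  return before - after;
}
```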
While we were debugging performance problems last year, we introduced `managerType: 'xs-worker-no-gc'`, which is a variant of `xs-worker` that disables the forced GC we were doing at the end of every delivery. Since then, we've improved performance in several ways:

* XS bugs were retaining objects, which have since been fixed
* we only do a forced GC during special `dispatch.bringOutYourDead` cranks
* we deliver BOYD less often (according to a configurable schedule; see #4160 for our decisions, but we're thinking once every 100 to 1000 deliveries)

The `xs-worker-no-gc` manager type controlled a flag named `gcEveryCrank`, which replaced the GC primitive available to liveslots with a dummy version. This commit removes both `xs-worker-no-gc` and `gcEveryCrank`.

closes #5600
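For reference, a minimal sketch of what the configurable schedule might look like in a swingset config. The `defaultReapInterval` and per-vat `reapInterval` knobs are assumptions here; verify the names against the current swingset config schema before relying on them.

```js
// Sketch of a swingset config setting the BOYD schedule. The option
// names (defaultReapInterval, reapInterval) are assumptions, not a
// confirmed API; the paths are illustrative.
const config = {
  defaultManagerType: 'xs-worker',
  defaultReapInterval: 1000, // deliver BOYD once every 1000 deliveries
  vats: {
    demo: {
      sourceSpec: './vat-demo.js', // illustrative vat source
      creationOptions: { reapInterval: 100 }, // busier vat: reap more often
    },
  },
};
```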
We still need to choose the number and test it out.
@arirubinstein has done some benchmarking and has a proposed number based on it. There are two places the value can be set: the default in the code itself (currently set to 1), and the config file for the swingset you are actually running. We need to decide where to make the change; my preference is to change the default.
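To illustrate the trade-off, a hedged sketch of the precedence between the two settings (the option name is hypothetical): a value in the config file overrides the in-code default, so changing only the default affects every deployment that doesn't set it explicitly.

```js
// Hypothetical illustration of how the two settings interact; the real
// option name in swingset may differ.
const DEFAULT_REAP_INTERVAL = 1; // current in-code default: BOYD on every delivery

function getReapInterval(configFile) {
  // The config file wins when present; otherwise fall back to the default.
  return configFile.defaultReapInterval ?? DEFAULT_REAP_INTERVAL;
}

// e.g. getReapInterval({}) === 1, but
//      getReapInterval({ defaultReapInterval: 1000 }) === 1000
```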
@FUDCo Brian and I just discussed this; can you please start working on it?
Describe the bug
The bring-out-your-dead PR (#3801) caused a jump in swingset time. There also seems to be more time spent in swingset during a block over the life of the chain, which is usually indicative of a memory leak.
To Reproduce
Steps to reproduce the behavior:
```
--stage.duration=10 --stage.loadgen.vault.interval=12 --stage.loadgen.amm.interval=12 --stage.loadgen.amm.wait=6 --stage.loadgen.vault.limit=10 --stage.loadgen.amm.limit=10
```
Expected behavior
Not such a significant increase in swingset time, and no further increase over the life of the chain.
Platform Environment
9a47516a349185409d3c320b585312178c8a08b8
Additional context
The PR doesn't change the GC schedule, so the impact is likely from the different accounting logic that BOYD introduces.
Screenshots
NB: Stage 4 is missing due to an unrelated chain stall on restart