Optimise concurrency for scale #905

Closed
zaneb opened this issue Jun 1, 2021 · 1 comment · Fixed by #906
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@zaneb
Member

zaneb commented Jun 1, 2021

Currently the default value for the $BMO_CONCURRENCY variable is 3. We chose this value because (a) it wasn't 1 (so in testing we would not end up relying on only 1 BMH being reconciled at a time), and (b) it seemed like as good a number as any. With the benefit of more data, we can say a bit more about the optimal value.

We currently limit the number of nodes that can be provisioned at one time ($PROVISIONING_LIMIT) to 20, to limit the load on Ironic. Note that enforcement of the provisioning limit cannot be made thread-safe, so it isn't a hard limit; we should therefore endeavour to keep $BMO_CONCURRENCY << $PROVISIONING_LIMIT to avoid smashing right through it. Most of the Ironic provisioning work happens asynchronously from the perspective of the baremetal-operator; we only schedule a reconcile to check on it every 10s.

However, before a new Host can start provisioning, it must pass through a number of states, including some that do essentially nothing (e.g. matching a profile, or inspecting when inspection is disabled). If a large number of Hosts are added simultaneously, they will tend to move through this process in lockstep, since each Host, once it has been advanced, goes to the back of the queue before it gets processed again. Ideally we want at least 20 Hosts to be available for provisioning as quickly as possible.

I wrote a little simulation to experiment with different settings. If we assume 1000 new Hosts (why not) added at once and 20 reconcile steps (wild guess) each taking around 100ms (pretty accurate when there is no work to do), then it would take more than 10 minutes for the first 100 nodes to become available for provisioning. Real-world testing with 1000 nodes created simultaneously shows that indeed, very few hosts are provisioned in the first 10-20 minutes. This can be improved by increasing the number of concurrent reconciles allowed.
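The original simulation script isn't attached here, but a minimal Go sketch of the model described above looks roughly like this: a queue of Hosts, a fixed pool of concurrent reconciles, and fixed-length reconcile steps. The constants are the assumptions stated in the previous paragraph; all names are illustrative, not actual baremetal-operator code.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
	"time"
)

// Assumptions from the text: 1000 Hosts added at once, 20 reconcile steps
// each, ~100ms per step, and the old default of 3 concurrent reconciles.
const (
	numHosts    = 1000
	numSteps    = 20
	stepTime    = 100 * time.Millisecond
	concurrency = 3
	target      = 100 // report when this many Hosts have finished all steps
)

type host struct {
	ready time.Duration // virtual time at which the Host is back in the queue
	steps int           // reconcile steps completed so far
}

// hostQueue is a min-heap ordered by the time each Host becomes ready.
type hostQueue []host

func (q hostQueue) Len() int            { return len(q) }
func (q hostQueue) Less(i, j int) bool  { return q[i].ready < q[j].ready }
func (q hostQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *hostQueue) Push(x interface{}) { *q = append(*q, x.(host)) }
func (q *hostQueue) Pop() interface{} {
	old := *q
	h := old[len(old)-1]
	*q = old[:len(old)-1]
	return h
}

func main() {
	// All Hosts enter the queue at t=0.
	q := make(hostQueue, numHosts)
	heap.Init(&q)

	// Virtual time at which each worker becomes free.
	workers := make([]time.Duration, concurrency)

	done := 0
	for done < target {
		h := heap.Pop(&q).(host)

		// Hand the Host to the worker that frees up first.
		sort.Slice(workers, func(i, j int) bool { return workers[i] < workers[j] })
		start := workers[0]
		if h.ready > start {
			start = h.ready
		}
		end := start + stepTime
		workers[0] = end

		h.steps++
		if h.steps == numSteps {
			done++
			if done == target {
				fmt.Printf("first %d hosts ready for provisioning after %v\n", target, end)
			}
			continue
		}
		h.ready = end // requeue immediately; add a random delay here to model jitter
		heap.Push(&q, h)
	}
}
```

With these inputs the simulation reports roughly 10–11 minutes for the first 100 Hosts, consistent with the real-world observation above.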

Another option is to add a small random delay when requeuing, instead of requeuing immediately. As each reconcile step is worked through, the Hosts should then tend to spread out in time, so that some reach the provisioning state sooner than others. Using the same simulation, the optimal amount of jitter to add is:

1.9 * <Total number of hosts> * <Average reconcile time> / <Number of concurrent reconciles>

(I haven't determined the mathematical significance of the value 1.9, but it's accurate to 2 significant figures.)
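For reference, a hedged sketch of what jittered requeuing could look like in a controller-runtime reconciler, assuming the formula above sets the upper bound of a uniform random delay. The constants and helper names are illustrative, not taken from the baremetal-operator code.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// Assumed inputs to the jitter formula; in practice these would come from
// configuration or be estimated at runtime.
const (
	totalHosts       = 1000
	avgReconcileTime = 100 * time.Millisecond
	concurrency      = 3
)

// maxJitter applies the empirical formula from this issue:
// 1.9 * <total hosts> * <average reconcile time> / <concurrent reconciles>.
func maxJitter() time.Duration {
	return time.Duration(1.9 * float64(totalHosts) * float64(avgReconcileTime) / float64(concurrency))
}

// requeueWithJitter asks controller-runtime to reconcile again after a
// random delay in [0, maxJitter) instead of requeuing immediately.
func requeueWithJitter() ctrl.Result {
	return ctrl.Result{RequeueAfter: time.Duration(rand.Int63n(int64(maxJitter())))}
}

func main() {
	fmt.Println("requeue after:", requeueWithJitter().RequeueAfter)
}
```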

Plotting the effect of both interventions:

[Plot "thread_time": time for the first 100 hosts to complete 20 reconcile steps, at various concurrency settings, with and without requeue jitter]

Adding jitter reduces the time for the first 100 hosts to go through 20 reconcile steps by about 10-15%, but increasing the concurrency has a much, much larger effect. A default value of 8 for $BMO_CONCURRENCY would seem to capture most of the returns at this scale.

zaneb added a commit to zaneb/baremetal-operator that referenced this issue Jun 1, 2021
Instead of defaulting the $BMO_CONCURRENCY value (the maximum number of
Hosts to reconcile concurrently) to a hard-coded value of 3, set it
instead to the number of CPU threads available, but constrained to a
range between 2 and 8.

We never default to 1 so that we don't inadvertently rely on
single-threadedness in tests. 8 seems to be a reasonable value for a
large scale deployment, while still being substantially below the
default $PROVISIONING_LIMIT of 20.

There remains no restriction on the value that can be passed in the
environment variable.

Fixes metal3-io#905
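For context, a minimal sketch of the defaulting behaviour described in that commit message, assuming the value is derived from runtime.NumCPU() and clamped to [2, 8]; the function names are illustrative rather than the actual baremetal-operator code.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// defaultConcurrency returns the number of CPU threads, clamped to [2, 8].
func defaultConcurrency() int {
	n := runtime.NumCPU()
	if n < 2 {
		return 2
	}
	if n > 8 {
		return 8
	}
	return n
}

// concurrencyFromEnv honours an explicit $BMO_CONCURRENCY value without
// restriction, falling back to the clamped default otherwise.
func concurrencyFromEnv() int {
	if v, err := strconv.Atoi(os.Getenv("BMO_CONCURRENCY")); err == nil {
		return v
	}
	return defaultConcurrency()
}

func main() {
	fmt.Println("max concurrent reconciles:", concurrencyFromEnv())
}
```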
@furkatgofurov7
Member

/kind feature

@metal3-io-bot added the kind/feature label Jun 2, 2021
zaneb added a commit to zaneb/baremetal-operator that referenced this issue Jun 7, 2021
honza pushed a commit to honza/baremetal-operator that referenced this issue Jun 9, 2021
(cherry picked from commit febccb3)
Signed-off-by: Honza Pokorny <[email protected]>
levsha pushed a commit to levsha/baremetal-operator that referenced this issue Sep 1, 2021