kvserver: use Background() in computeChecksumPostApply goroutine #75448

Merged (1 commit) on Jan 25, 2022

@tbg tbg commented Jan 24, 2022

On the leaseholder, the `ctx` passed to `computeChecksumPostApply` is that
of the proposal. As of #71806, this context is canceled right after the
corresponding proposal is signaled (and the client goroutine returns
from `sendWithRangeID`). This effectively prevents most consistency
checks from succeeding. Previously they were not affected by
higher-level cancellation, because the consistency check is triggered
from a local queue that talks directly to the replica, i.e. the context
had something like a minutes-long timeout.

This caused disastrous behavior in the `clearrange` suite of roachtests.
That test imports a large table. After the import, most ranges have
estimated stats (the ctx cancellation prevents the consistency
checks, which trigger stats adjustments as a byproduct), and those stats
claim that the ranges are very small. Before recent PR #74674, `ClearRange` on
such ranges would use individual point deletions instead of the much
more efficient Pebble range deletions, effectively writing a lot of data
and running the nodes out of disk.

Failures of `clearrange` with #74674 were also observed, but they did
not involve out-of-disk situations, so they are possibly an alternative
failure mode (which may still be related to the newly introduced
context cancellation).

Touches #68303.

Release note: None

@tbg tbg requested a review from a team as a code owner January 24, 2022 16:46

tbg commented Jan 24, 2022

bors r=erikgrinaker
bors single on

TFTR!

craig bot commented Jan 24, 2022

Build failed (retrying...):

craig bot commented Jan 25, 2022

Build failed:

tbg commented Jan 25, 2022

bors r=erikgrinaker

craig bot commented Jan 25, 2022

Build succeeded:

@craig craig bot merged commit 8eaf8d2 into cockroachdb:master Jan 25, 2022
tbg added a commit to tbg/cockroach that referenced this pull request Feb 22, 2022
We've seen in the events leading up to cockroachdb#75448 that a build-up of
consistency check computations on a node can severely impact node
performance. This commit attempts to address the main source of
that, while re-working the code for easier maintainability.

The way the consistency checker works is by replicating a command through
Raft that, on each Replica, triggers an async checksum computation,
the result of which the caller collects via `CollectChecksum` requests
addressed to each `Replica`.
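
A toy model of this trigger-then-collect flow, as a sketch only: the `replica` type, its methods, and the byte-sum "checksum" are all hypothetical and stand in for the real Raft command and `CollectChecksum` RPC.

```go
package main

import (
	"fmt"
	"sync"
)

// replica is a hypothetical stand-in for a Replica that tracks inflight
// checksum computations by check ID.
type replica struct {
	mu       sync.Mutex
	inflight map[string]chan uint64
}

func newReplica() *replica {
	return &replica{inflight: map[string]chan uint64{}}
}

// trigger models applying the replicated command: it starts an async
// computation and records where its result will arrive.
func (r *replica) trigger(id string, data []byte) {
	ch := make(chan uint64, 1)
	r.mu.Lock()
	r.inflight[id] = ch
	r.mu.Unlock()
	go func() {
		var sum uint64 // placeholder checksum: a plain byte sum
		for _, b := range data {
			sum += uint64(b)
		}
		ch <- sum
	}()
}

// collect models a CollectChecksum request: it waits for the result of
// the given check and stops tracking it.
func (r *replica) collect(id string) (uint64, bool) {
	r.mu.Lock()
	ch, ok := r.inflight[id]
	delete(r.inflight, id)
	r.mu.Unlock()
	if !ok {
		return 0, false
	}
	return <-ch, true
}

// numInflight reports how many computations are still tracked.
func (r *replica) numInflight() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.inflight)
}

func main() {
	r := newReplica()
	r.trigger("check-1", []byte{1, 2, 3})
	r.trigger("check-2", []byte{4, 5, 6}) // never collected: stays tracked
	sum, _ := r.collect("check-1")
	fmt.Println(sum, r.numInflight()) // prints "6 1"
}
```

In this model, a check that is triggered but never collected stays tracked indefinitely.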

If for any reason, the caller does *not* wait to collect the checksums
but instead moves on to run another consistency check (perhaps on
another Range), these inflight computations can build up over time.

This was the main issue in cockroachdb#75448; we were accidentally canceling the
context on the leaseholder "right away", failing the consistency check
(but leaving it running on all other replicas), and moving on to the
next Range.
As a result, some (but with spread-out leaseholders, ultimately all)
Replicas ended up with dozens of consistency check computations,
starving I/O and CPU. We "addressed" this by avoiding this errant ctx
cancellation (cockroachdb#75448, and longer-term cockroachdb#75656), but this isn't a holistic
fix yet.

In this commit, we make three main changes:

- give the inflight consistency check computations a clean API, which
  makes it much easier to understand "how it works".
- when returning from `CollectChecksum` (either on success or error,
  notably including context cancellation), cancel the corresponding
  consistency check. This solves the problem, *assuming* that
  `CollectChecksum` is reliably issued to each Replica.
- reliably issue `CollectChecksum` to each Replica on which a computation
  may have been triggered. When the caller's context is canceled, still
  do the call with a one-off Context with a one-second timeout, which
  should be good enough to make it to the Replica and short-circuit the call.

Release note: None