Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: de-flake TestProtectedTimestamps #75436

Merged

Commits on Jan 24, 2022

  1. kvserver: de-flake TestProtectedTimestamps

    Fixed cockroachdb#75020. This test makes use of an "upsert-until-backpressure"
    primitive, where it continually writes blobs to a scratch table
    configured with a lower max range size (1<<18 from the default of
    512<<20) until it experiences replica backpressure. Before cockroachdb#73876 (when
    using the system config span), the splitting off of the scratch table's
    ranges happened near instantly after the creation of the table itself.
    This changed slightly with the span configs infrastructure, where
    there's more of asynchrony built in and ranges may appear after a while
    longer.
    
    The test previously started upserting as soon as it created the table,
    being able to implicitly rely on the tight synchrony of already having
    the table's ranges carved out. This comes when deciding to record a
    replica's largest previous max range size -- something we only do if the
    replica's current size is larger than the new limit (see
    (*Replica).SetSpanCnfig). When we weren't upserting until the range
    applied the latest config, this was not possible. With span configs and
    its additional asynchrony, this assumption no longer held. It was
    possible then for the range to accept writes larger than the newest max
    range size, which unintentionally (for this test at least) triggers an
    escape hatch where we avoid backpressure when lowering a range's max
    size below its current size (cockroachdb#46863).
    
    The failure is easy to repro. This test runs in ~8s, and with the
    following we were reliably seeing timeouts where the test was waiting
    for backpressure to kick in (but it never did because of the escape
    hatch).
    
        dev test pkg/kv/kvserver \
          -f TestProtectedTimestamps -v --show-logs \
          --timeout 300s --ignore-cache --count 10
    
    De-flaking this test is simple -- we just wait for the table's replicas
    to apply the latest config before issuing writes to it.
    
    Release note: None
    irfansharif committed Jan 24, 2022
    Configuration menu
    Copy the full SHA
    20161f9 View commit details
    Browse the repository at this point in the history