kvprober: implement "shadow write" probes #67112
Here you go:
Just some boilerplate, but not much in terms of technical challenges. I think we don't even need to add a new command, though; it feels as though adding a new key like the ones at Lines 398 to 402 in a7472e3 (with suffix "prbe") would ~get the job done. It could then also be used as the key read by the read prober. You can write that key through the existing KV API, so very little plumbing is needed.
That would be awesome. In addition to being easier to implement, that way we actually write to pebble, too. Re: "range-local", I am reading: Line 118 in a7472e3
Gonna need to read that a few times more... How do you guarantee the range-local key doesn't conflict with some other key, e.g. a key where we store SQL data?
I found Line 93 in 4d987b3, which seems to answer my question. Reading.
This is basically a key that is guaranteed to not have any function yet. It's in a "parallel plane" relative to the SQL data or anything else the tenant might write. (I think it isn't even allowed to).
Ack. Thanks. Will take a closer look soon. This seems like a great direction.
OK, so pattern matching a bit... I add a new prefix here, maybe "prb" or "probe": Line 213 in 4d987b3
Then I add a function like this, maybe Line 398 in 4d987b3
Then I call it from Line 1099 in 4d987b3
And I pass the returned key into methods like the existing probe code paths. Do I have that right? Now things get very fuzzy for me... gonna ask Qs that may be obvious to help me learn. I presume Line 510 in 4d987b3 is the relevant spot?
I also presume that at a storage level, the key is written with the range-local prefix as per below?
I am confused how range merges & splits work. A change in a range's start & end key means a change in the key at which its range descriptor is stored, since part of the key is the range start key, right? Do we move these KV pairs at range merge & split time somehow? Would we need to do that with the prober KV pairs, or not? Let's say there are two ranges: A = ["a", "c") and B = ["c", "e").
These ranges get merged into a range A that starts at key "a" and ends at key "e". Planning ran before the merge happened, returning range B with start key "c". So we try probing range B's key even though B no longer exists. What happens? I guess it is also possible that the read side of the prober would read a key before it has been written by the write side of the prober, but reading an empty key is not an error, so that is okay.
👋🏽
Four chars please, which is why I suggested "prbe".
Yes, this is all done transparently. For example, see cockroach/pkg/kv/kvserver/replica_command.go, Lines 2624 to 2648 in 8238ce0
The start key of a range is immutable while the range exists (splits never change the start key, and merges extend the end key, i.e. the right-hand side stops existing). So it's possible that you would probe r1, then r1 subsumes r2, but you still probe r2 (which then writes an extra key in the middle of r1 that wouldn't be anyone's responsibility to clean up). This is mildly annoying, but something you can do here is to not actually write anything! There are two options: write the key in a txn, read it back, and then roll the txn back; or write the key and delete it again in the same txn and commit, leaving no live data behind.
I think I prefer the second option. The first one has this mysterious read (which you need so that the txn blocks on replicating the previous write; otherwise it'll sneakily not wait for the write to complete before you abort the txn, being none the wiser that it didn't go through). The second one avoids that bit of complexity and actually commits a txn, which is nice. I'd have to check whether it actually leaves a tombstone at the MVCC layer, but it doesn't really matter; it definitely blocks on replication.
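(For illustration, a minimal sketch of what the second option could look like against the public `kv` client API; `writeProbe` and the probe value are made-up names, not the eventual kvprober code.)

```go
import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/kv"
	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

// writeProbe commits a txn that puts and then deletes the probe key, so the
// write codepaths (propose, replicate, apply, txn commit) are exercised
// without leaving live data behind.
func writeProbe(ctx context.Context, db *kv.DB, probeKey roachpb.Key) error {
	return db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
		b := txn.NewBatch()
		b.Put(probeKey, []byte("probe")) // leaves an intent; commit blocks on replication
		b.Del(probeKey)                  // deleted in the same txn, so no live data survives
		return txn.CommitInBatch(ctx, b)
	})
}
```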
Hi, Tobias!
Very fun. Yay to transactions.
Ack, yes, I think this is what I was thinking about. Would GC clean up this key eventually, or does GC not operate on these range-local key pairs? Not saying it's a good idea to rely on GC; more wondering if GC operates on this part of the keyspace at all.
Would be nice if a write was made to storage IMO (that is, would be nice if a tombstone was left). A bit more coverage that way? Or maybe, given the details, not so important?
GC never cleans up live values; this value would be live, so GC would never touch it (it is otherwise a regular value to GC, i.e. no special casing). It would be around "forever". So, good to avoid this scenario via the txn option.
Not so important I think. What matters is a full trip through the replication layer (leaving an intent, which also does all of the writing to storage that exists), reading that back, and making it through committing the txn record. Option 2 achieves all of the above.
Oh and btw, this is also a full read probe already, i.e. unless you somehow want to read probe much more aggressively than write probe, I think this could easily be the only kind of probe and that would be totally ok.
Nice! All makes sense. This is awesome. Thanks for the help as always.
I'm thinking more about this. My gut says it's nice to have separate measurements of read & write availability, but I am not entirely figuring out why I feel that way. Perhaps part of it is that two probe types may provide useful data during an incident. If read is green but write is red, that means a different set of potential prod issues than if read & write are both red. Similarly, when writing postmortems, it may be nice to be able to speak about both read & write availability. Perhaps the possibility of sending a higher rate of reads than writes per second is a piece of the puzzle also. Lastly, I think we also want a scan-at-historical-timestamp prober, as that will provide a good production vet of pebble especially. So there will be multiple things we want to probe for, so kvprober should bake in reads and writes. Any thoughts, Tobias?
These are reasonable points, except this one:
This will exercise follower reads, but has no particular coverage of pebble. To probe pebble, a) you want to know what node you're talking to, which kvprober doesn't (it operates above the abstraction of "range"), and b) you want to do something better than ask pebble for that one key! You'd pick a random SST from the manifest and seek somewhere into it, or something like that. Also, re: "there will be multiple things we want to probe for so kvprober should bake in reads and writes", wouldn't it be the opposite? We don't want an ever-proliferating set of cluster settings and knobs, do we? Or at least we wouldn't want to commit to that too early.
Will open a separate issue with both you & storage on it, but I'm thinking the scan prober would scan the whole range at a historical timestamp and export metrics re: the bandwidth that live data (at the historical timestamp) was scanned at. I'd expect a big enough dip in that bandwidth to indicate a production issue, and the root cause would often be pebble (e.g. an "inverted" LSM) or else infra issues (e.g. an issue with disk leading to lower perf than expected). Does that seem wrong? I hear you saying we can get a better production vet of pebble by probing based on knowledge of internal details of pebble ("pick an SST", a concept not exposed by either the KV or SQL layers). Agree? Disagree? Half disagree / half agree?
Ya, you're right. My thinking got a little twisted there. We only want the complexity of multiple probe types when it's worth the costs. |
> Will open a separate issue with both you & storage on it, but I'm thinking the scan prober would scan the whole range at a historical timestamp and export metrics re: the bandwidth that live data (at the historical timestamp) was scanned at. I'd expect a big enough dip in that bandwidth indicates a production issue, and the root cause would often be pebble (e.g. an "inverted" LSM) or else infra issues (e.g. disk bandwidth issues). Does that seem wrong?
It doesn't seem right. If you're going to probe pebble you want to probe pebble. With a historical read through KV you will be going through KV, and so that's what you're probing (you will also be much more likely to interfere with user traffic, etc). The point is less about internals vs non-internals, it's about direct access to the subsystem you want to extract information from. I'm all for probing pebble, but I don't think going through KV for that is right, similar to how you decided against probing KV from SQL by issuing random SELECTs. The straw man canonical solution is adding an endpoint here
https://github.com/cockroachdb/cockroach/blob/40029d161a803b0ee7d105ec737c0a205571a0b9/pkg/kv/kvserver/storage_services.proto#L30-L32
and the implementation does whatever the storage folks think a good probe is, against that store's `storage.Engine`:
https://github.com/cockroachdb/cockroach/blob/fe57f71323d06512bab5efe7c80a98163982e4b2/pkg/storage/engine.go#L681-L782
A separate issue for this makes sense, but I imagine that this probe would try to sync an invisible write via `eng.NewBatch().LogData([]byte("pebble-probe"))` (and committing that batch), and there is probably also a read flavor to it, though with just the interface, you won't find anything interesting to read. The LSM is really the "range descriptor" at the pebble level, so by analogy it does make a ton of sense to pick a few random SSTs and try to access them.
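(To make that concrete, a rough sketch of such a per-store probe against `storage.Engine`; the function name is made up, and whether a sync-committed `LogData`-only batch is the right probe is exactly the question for the storage folks.)

```go
import "github.com/cockroachdb/cockroach/pkg/storage"

// probeEngine commits a batch whose only content is a LogData record: it
// exercises the WAL and an fsync but leaves no visible keys behind.
func probeEngine(eng storage.Engine) error {
	b := eng.NewBatch()
	defer b.Close()
	if err := b.LogData([]byte("pebble-probe")); err != nil {
		return err
	}
	return b.Commit(true /* sync */)
}
```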
I'm fairly lukewarm on the "scan the range and compute throughput" idea; we want to avoid adding work to the system as a result of probes (in particular, accessing large sets of data will impact stats, user traffic, compactions, everything basically), and it's hard to form expectations about what a good throughput is since that depends heavily on the system. If a probe starts looking too much like it's trying to be an alerting rule, something is wrong.
Ack. Will digest this comment a bit, think more in shower :), & start more discussion in another place. Thanks, Tobias.
I am working on this now.
Just put up a PR that adds a range-local key dedicated to blackbox probing: #68645. I have a hacked POC of the "shadow write" functionality, and it appears to be working:
I will turn this into a proper PR now. I will leave read probes intact as per the above discussion about value of having both. |
68645: keys/kvprober: introduce a range-local key for probing, use from kvprober r=tbg a=joshimhoff

This work sets the stage for extending `kvprober` to do writes, as is discussed in detail with @tbg at #67112.

**keys: add a range-local key for probing**

This commit introduces a range-local key for probing. The key will only be used by probing components like kvprober. This means no contention with user traffic or other CRDB components. This key also provides a safe place to write to in order to test write availability. A kvprober that does writes is coming soon.

**kvprober: probe the range-local key dedicated to probing**

Before this commit, kvprober probed the start key of a range. This worked okay, as kvprober only did reads, and contention issues leading to false-positive pages haven't happened in practice. But contention issues are possible, as there may be data located at the start key of the range. With this commit, kvprober probes the range-local key dedicated to probing. No contention issues are possible, as that key is only for probing. This key is also needed for write probes, which are coming soon.
69035: kvprober: extend kvprober to test KV's write codepaths r=joshimhoff a=joshimhoff

Design discussed in detail at #67112.

**kvprober: extend kvprober to test KV's write codepaths**

Before this commit, kvprober did point reads of range-local keys dedicated to probing. kvprober did not exercise KV's write codepaths. With this commit, kvprober exercises KV's write codepaths. kvprober has two probe loops: one that does point reads & another that commits a txn that puts and deletes a key, leaving no live data but exercising the write codepaths.
Is your feature request related to a problem? Please describe.
We have a `kvprober` that sends point read requests to "random" ranges. We should extend that prober to test the availability of a range at a write level. We can call this a "shadow write".

Describe the solution you'd like
Strawman proposal:
- Add a `Probe`/`ShadowWrite` command to KV and make it available via the `kvclient` public API.
- Extend `kvprober` to make `Probe`/`ShadowWrite` requests to "random" ranges.

The test of `kv` is decent. The `Probe`/`ShadowWrite` command needs to get proposed, agreed upon, applied, etc. (Am I using these words correctly?) A write to the raft log will happen, so availability of the disk is checked.

The test of `pebble` is minimal, as no actual write happens at `Probe` command apply time. Note though that we could change this in future CRDB versions. One can imagine writing to pebble but in a way that doesn't lead to user-visible side effects, in order to improve the realism of the probe (to match the actual CRDB write codepath more closely).

CC @tbg @andreimatei @knz @bdarnell @jreut @logston for review of the strawman proposal. I hope for a naming bikeshed.
Also, KV folks: how hard of a time do you think I will have implementing this? It's hard for me to scope the "add `Probe`/`ShadowWrite` command" part. My sense from talking with Ben a while back is that it's not technically hard, really, but lots of boilerplate; also, a new command hasn't been added in a while, so it may be tricky to figure out all the places to make changes.

Describe alternatives you've considered
These aren't really alternatives, though. Blackbox approaches like this one are complemented by whitebox approaches.
Additional context
#61074
https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvprober/kvprober.go
Epic CC-4054