Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multitenant: Investigate tests which fail when run using the default test tenant #76378

Open
ajstorm opened this issue Feb 10, 2022 · 11 comments
Labels
A-multitenancy Related to multi-tenancy C-investigation Further steps needed to qualify. C-label will change. db-cy-23 T-multitenant Issues owned by the multi-tenant virtual team

Comments

@ajstorm
Copy link
Collaborator

ajstorm commented Feb 10, 2022

A change was made with #76582 to enable all tests which leverage testServer/testCluster to run with some probability in multi-tenant mode. In doing so, several tests were found to fail or hang when run under multi-tenant mode. This tracking issue enumerates all of the problematic test cases so that they can be investigated in the future.

Jira issue: CRDB-13093

Remaining tests that specify the parameter TODOTestTenantDisabled

By product area:

Disaster recovery

  • 49 pkg/ccl/backupccl
  • 14 pkg/ccl/streamingccl
  • 2 pkg/ccl/jobsccl

KV

  • 9 pkg/ccl/spanconfigccl
  • 5 pkg/kv/kvserver
  • 2 pkg/spanconfig/spanconfigkvaccessor
  • 1 pkg/kv/kvclient/rangefeed/rangefeedcache

Mixed bag: Obs Inf - DB Server - Multi-tenant

  • pkg/server - SQL inspection APIs
  • pkg/server - debug API tests
  • pkg/server - server controller tests
  • 32 pkg/ccl/serverccl
  • 1 pkg/ts
  • 1 pkg/upgrade/upgrademanager
  • 1 pkg/util/tracing
  • 1 pkg/server/systemconfigwatcher
  • 1 pkg/server/structlogging

Multi-tenant

  • 9 pkg/ccl/multitenantccl
  • 4 pkg/kv/kvclient/kvtenant
  • 2 pkg/configprofiles
  • 2 pkg/base
  • 1 pkg/ccl/kvccl/kvtenantccl/upgradeinterlockccl
  • 1 pkg/multitenant/tenantcapabilities
  • 1 pkg/ccl/testutilsccl

SQL / Migrations

  • 18 pkg/sql/importer
  • 5 pkg/sql
  • 5 pkg/ccl/partitionccl
  • 5 pkg/ccl/multiregionccl
  • 4 pkg/ccl/testccl
  • 4 pkg/sql/schemachanger
  • 4 pkg/bench
  • 3 pkg/cli/democluster
  • 2 pkg/ccl/workloadccl
  • 2 pkg/ccl/importerccl
  • 1 pkg/sql/ttl
  • 1 pkg/sql/tests
  • 1 pkg/sql/sqltestutils
  • 1 pkg/sql/sqlinstance
  • 1 pkg/sql/copy
  • 1 pkg/sql/colenc
  • 1 pkg/sql/catalog

CDC

  • 10 pkg/ccl/changefeedccl
  • 9 pkg/ccl/changefeedccl that are disabled via feedTestNoTenants option

Serverless

  • 11 pkg/ccl/sqlproxyccl

Remaining tests that were hidden due to the CCL import (#108761)

Disaster recovery / KV

  • 34 kv/bulk

Disaster recovery

  • 61 jobs

CDC

  • 8 ccl/changefeedccl/cdceval
  • 7 ccl/changefeedccl/cdcevent
  • 3 kv/kvclient/rangefeed
  • 1 kv/kvclient/rangefeed/rangefeedcache

KV

  • 81 kv/kvserver
  • 27 kv/kvclient/kvcoord
  • 22 kv
  • 8 util/rangedesc
  • 8 spanconfig/spanconfigkvsubscriber
  • 7 spanconfig/spanconfigkvaccessor
  • 3 kv/kvnemesis
  • 2 spanconfig/spanconfigmanager
  • 2 kv/kvserver/batcheval
  • 2 kv/kvserver/intentresolver
  • 1 spanconfig/spanconfigsqlwatcher
  • 1 kv/kvclient
  • 1 kv/kvserver/loqrecovery
  • 1 kv/kvserver/reports

SQL Queries

  • 9 kv/kvclient/kvstreamer
  • 8 ccl/multitenantccl/tenantcostserver
  • 7 sql/row
  • 3 sql/stmtdiagnostics
  • 3 sql/physicalplan
  • 2 sql/sqlliveness/slstorage
  • 2 sql/sqlinstance/instancestorage
  • 1 sql/stats
  • 1 sql/scheduledlogging
  • 1 sql/rowexec
  • 1 sql/colflow
  • 1 sql/colexec/colbuilder

SQL Foundations

  • 81 sql
  • 10 sql/catalog/descs
  • 9 sql/catalog/resolver
  • 4 workload/insights
  • 4 workload/bank
  • 2 sql/ttl/ttljob
  • 2 sql/sem/builtins
  • 2 sql/catalog/lease
  • 1 sql/catalog/tabledesc
  • 1 sql/catalog/schematelemetry
  • 1 sql/gcjob_test

SQL (mixed)

  • 8 sql/logictest/tests/local-vec-off
  • 8 sql/logictest/tests/local-legacy-schema-changer
  • 7 sql/logictest/tests/fakedist
  • 7 sql/logictest/tests/fakedist-disk
  • 5 sql/logictest/tests/fakedist-vec-off
  • 5 sql/logictest/tests/5node-disk
  • 4 sql/logictest/tests/local-mixed-22.2-23.1
  • 4 sql/logictest

Multi-tenant

  • 4 multitenant/tenantcapabilities/tenantcapabilitieswatcher
  • 2 server/settingswatcher
  • 2 server/tenantsettingswatcher
  • 1 server/systemconfigwatcher
  • 1 server/autoconfig

Obs Infra

  • 24 server
  • 3 obsservice/obslib/ingest
  • 2 util/log/logcrash

Clust Obs

  • 10 server/application_api
  • 1 server/storage_api

Misc

  • 40 upgrade/upgrades
  • 1 security
@ajstorm ajstorm added C-investigation Further steps needed to qualify. C-label will change. A-multitenancy Related to multi-tenancy labels Feb 10, 2022
ajstorm added a commit to ajstorm/cockroach that referenced this issue Feb 14, 2022
Previously, testServer would run by default in single-tenant mode.  This PR
changes testServer to run by default in multi-tenant mode, and runs all
statements through the default test tenant. Since we want the bulk of our
testing to remain in single-tenant mode, we only probabilistically run with
tenants (with probability 0.25). This commit also contains a few pieces of
cleanup which were necessary to make the testing changes (and were difficult to
split into a separate commit):

  - Change the "Existing" flag on tenant start to "DisableCreateTenant" and
    make the tenant start code more tolerant of existing tenants by checking
    to see if the tenant exists before trying to create it. The
    DisableCreateTenant flag is required for the cases where we want a test to
    fail due to the lack of a created tenant.
  - Changed the multi-region backup tests to use node 0 instead of node 1, to
    allow them to work within a tenant. Tenants aren't able to access remote
    nodes of nodelocal stoarge.
  - Cleaned up a couple of cases where our error handling was passing nil
    structs to be printed out.
  - Had to special case the setting of kv.range_merge.queue_enabled because
    it is only available to the system SQL server. As a result, all tests which
    check for its setting had to be modified to no longer check for it (since
    it will only be set in the case where we have no SQL server).

Additionally, we've created a tracking issue for all tests which currently
don't work under the default test tenant (cockroachdb#76378).

Release note: None
ajstorm added a commit to ajstorm/cockroach that referenced this issue Feb 14, 2022
Previously, testServer would run by default in single-tenant mode.  This PR
changes testServer to run by default in multi-tenant mode, and runs all
statements through the default test tenant. Since we want the bulk of our
testing to remain in single-tenant mode, we only probabilistically run with
tenants (with probability 0.25). This commit also contains a few pieces of
cleanup which were necessary to make the testing changes (and were difficult to
split into a separate commit):

  - Change the "Existing" flag on tenant start to "DisableCreateTenant" and
    make the tenant start code more tolerant of existing tenants by checking
    to see if the tenant exists before trying to create it. The
    DisableCreateTenant flag is required for the cases where we want a test to
    fail due to the lack of a created tenant.
  - Changed the multi-region backup tests to use node 0 instead of node 1, to
    allow them to work within a tenant. Tenants aren't able to access remote
    nodes of nodelocal stoarge.
  - Cleaned up a couple of cases where our error handling was passing nil
    structs to be printed out.
  - Had to special case the setting of kv.range_merge.queue_enabled because
    it is only available to the system SQL server. As a result, all tests which
    check for its setting had to be modified to no longer check for it (since
    it will only be set in the case where we have no SQL server).

Additionally, we've created a tracking issue for all tests which currently
don't work under the default test tenant (cockroachdb#76378).

Release note: None
ajstorm added a commit to ajstorm/cockroach that referenced this issue Feb 15, 2022
Previously, testServer would run by default in single-tenant mode.  This PR
changes testServer to run by default in multi-tenant mode, and runs all
statements through the default test tenant. Since we want the bulk of our
testing to remain in single-tenant mode, we only probabilistically run with
tenants (with probability 0.25). This commit also contains a few pieces of
cleanup which were necessary to make the testing changes (and were difficult to
split into a separate commit):

  - Change the "Existing" flag on tenant start to "DisableCreateTenant" and
    make the tenant start code more tolerant of existing tenants by checking
    to see if the tenant exists before trying to create it. The
    DisableCreateTenant flag is required for the cases where we want a test to
    fail due to the lack of a created tenant.
  - Changed the multi-region backup tests to use node 0 instead of node 1, to
    allow them to work within a tenant. Tenants aren't able to access remote
    nodes of nodelocal stoarge.
  - Cleaned up a couple of cases where our error handling was passing nil
    structs to be printed out.
  - Had to special case the setting of kv.range_merge.queue_enabled because
    it is only available to the system SQL server. As a result, all tests which
    check for its setting had to be modified to no longer check for it (since
    it will only be set in the case where we have no SQL server).

Additionally, we've created a tracking issue for all tests which currently
don't work under the default test tenant (cockroachdb#76378).

Release note: None
craig bot pushed a commit that referenced this issue Aug 2, 2023
104228: dev: replace `stress` with `--runs_per_test` r=rail a=rickystewart

`--runs_per_test` does the same thing that `stress` does (runs a test
many times to weed out flakes), but better, in that:

1. Better Bazel support -- we had to hack Bazel support into `stress`
   to keep it working
2. Bazel gives us statistics and control over how the tests are
   scheduled in a way that is standardized as opposed to the ad-hoc
   arguments one can pass to `stress`
3. Bazel can collect logs and artifacts for different test runs and
   expose them to the user in a way that `stress` cannot
4. Drop-in support for remote execution when we have access to it,
   obviating the need for `roachprod-stress`

For now, the default implementation of `dev test --stress` is to
run the test 1,000 times, stopping the build if any test fails.
This is meant to replicate how people use `stress` (i.e., just start it
running and stop if there are any failures). The 1,000 figure is
arbitrary and meant to just be a "very high number" of times to run unit
tests. The default figure of 1,000 can be adjusted with `--count`. Also
update documentation with some new recipes and add extra logging to
warn about the change in behavior.

Closes #102879

Epic: none
Release note: None

107765: sql: remove calls to CreateTestServerParams r=yuzefovich a=yuzefovich

This PR removes calls to `tests.CreateTestServerParams` outside of `sql_test` tests as well as does some miscellaneous cleanups along the way. The rationale for doing this is that vast majority of callers didn't actually need the setup performed by the helper but inherited the disabling of test tenant. It seems more prudent for each test to be explicit about the kind of testing knobs and settings it needs.

Informs: #76378.
Epic: CRDB-18499.

Release note: None

107889: concurrency: enable joint claims on the request scan path r=nvanbenschoten a=arulajmani

This patch addresses a TODO that was blocking joint claims on a request's scan path and adds a test to show joint claims work. In particular, previously the lock mode of a queued request was hardcoded to use locking strength `lock.Intent`. We now use the correct lock mode by snapshotting an immutable copy of the request's lock strength on to the queuedGuard when inserting a it into a lock's wait queue.

Informs #102275

Release note: None

108052: changefeedccl: Release allocation when skipping events r=miretskiy a=miretskiy

The changefeed (or KV feed to be precise) may skip some events when "scan boundary" is reached.
Scan boundary is a timestamp when certain event occurs -- usually a schema change.  But, it may also occur
when the `end_time` option is set.

The KV feed ignores events that have MVCC timestamp greater or equal to the scan boundary event.

Unfortunately, due to a long outstanding bug, the memory allocation associated with the event would not be released when KV feed decides to skip the event.

Because of this, allocated memory was "leaked" and not reclaimed. If enough additional events arrive, those leaked events may account for all of the memory budget, thus leading to inability for additional events to be added.

This bug impacts any changefeeds running with the `end_time` option set.  It might also impact changefeeds that observe normal schema change event, though this situation is highly unlikely(the same transaction that perform schema change had to have modified sufficient number of rows in the table to fill up all of the memory budget).

Fixes #108040

Release note (enterprise change): Fix a potential "deadlock" when running changefeed with `end_time` option set.

108066: dev: set `COCKROACH_DEV_LICENSE` for `compose` r=rail a=rickystewart

This test won't run without the license key set.

Epic: CRDB-17171
Release note: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
craig bot pushed a commit that referenced this issue Aug 14, 2023
108628: ui: user cluster settings from redux r=maryliag a=maryliag

Part Of #108373

https://www.loom.com/share/e8b2bc222db848f7a55d442c23c31fe6

Use the value of the cluster setting
`sql.index_recommendation.drop_unused_duration` from redux, instead of adding as part of the select.
With this change, now users with VIEWACTIVITY or
VIEWACTIVITYREDACTED can see index recommendations on the console, without the need the view cluster settings permission.
This commit fixes on api from Database pages (Database Details and Table Details).

Release note: None

108636: tests: rename `StartNewTestCluster` to `StartCluster` r=yuzefovich a=knz

Part of #107986.
Epic: CRDB-18499

This follows for symmetry with `StartServer`.

This is a simple searc-replace substitution:

`serverutils.StartNewTestCluster` -> `serverutils.StartCluster`

Release note: None

108721: sqlccl: remove base.TODOTestTenantDisabled r=yuzefovich a=yuzefovich

In all of the touched tests we control the tenants explicitly. Also add some log scopes.

Informs: #76378.
Epic: CRDB-18499.

Release note: None

108734: ui: fix filter font size r=maryliag a=maryliag

On CC Console, the font size of the filter was
using a wrong value inherited from another class.
This commit makes the value explicit to be consistent.

Before
<img width="623" alt="Screenshot 2023-08-14 at 3 08 24 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/59bc6306-642e-4fbb-947f-500dd335ec10">


After
<img width="1187" alt="Screenshot 2023-08-14 at 3 07 39 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/27034065-97a5-40a5-863d-231debc6faad">


Epic: none
Release note: None

Co-authored-by: maryliag <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
craig bot pushed a commit that referenced this issue Aug 15, 2023
108739: roachtest/costfuzz: create failure.log file on demand r=yuzefovich a=yuzefovich

This commit makes it so that `failure.log` files are created lazily on demand in `costfuzz` and `unoptimized_query_oracle` roachtests. It seems nicer this way since for successful runs previously we'd see empty failure log files which were a bit confusing.

Epic: None

Release note: None

108744: build: update aws-cli r=pavelkalinnikov,srosenberg,herkolategan,rail a=renatolabs

The version being previously used was quite old, and didn't support setting the `Throughput` parameter on `gp3` volumes.

See: #108629 (comment).

Epic: none

Release note: None

108748: CODEOWNERS: split sql-queries-prs from sql-queries r=knz,rytaft a=michae2

This allows us to add or remove people from the PR rotation while keeping them on the team.

Epic: None
Release note: None

108752: ui: fix errors thrown by 0.toNumber() r=sjbarag a=sjbarag

When generating JS protobuf bindings, protobufjs's `pbjs` CLI accepts a --strict-long (or the more modern --force-long) flag with the following description:

  > Enfores the use of 'Long' for s-/u-/int64 and s-/fixed64 fields.

This uses the `long` package on NPM[^1], which it expects to be present when one of those clients is used at runtime. If `long` isn't available when `pbjs` is executed, it falls back to generating number-based fields where possible. Long is available in all consumers of `crdb-protobuf-client`, and `protobufjs` declares `long` as one of its few build-time and run-time dependencies, so nothing looked alarming.

For each field with a type that might be represented by a Long (a heuristic influenced by --strict-long), that field is marked as "long" if and only if the protobufjs-internal util.Long is non-null[^2]. That util.Long value is set via the use of the ``@protobufjs/inquire`` package instead of a standard `require`[^3] though, since the former silently returns `null` instead of throwing an error[^4]. This, again, is not alarming on its own because `protobufjs` depends directly on `long`.

The complication comes in with rules_js' use of the pnpm visibility rules, which are very strict when it comes to JS package dependencies. Put simply: packages only have access to the dependencies they directly declare, not transitive dependencies ("phantom dependencies")[^5]. Since ``@protobufjs/inquire`` is a separate package with no direct dependencies of its own[^6], Bazel lays out a node_modules tree where ``@protobufjs/inquire`` has no access to the `long` package. Since inaccessible packages are indistinguishable from nonexistent packages and `inquire` is intended to be silent, it returns `null`. The `pbjs` CLI then assumes Long isn't installed and falls back to number-based fields where a Long isn't strictly required.

When combined with protobuf's zero-value, this caused several cases of '0.toNumber is not a function' JS errors at runtime[^7]. We requested a Long, we support Longs, but a client was silently generated without respecting --strict-long.

Like with most other phantom dependencies, use pnpm.packageOverrides to declare that `@protobufjs/inquire` does in fact depend on Long.

[^1]: https://www.npmjs.com/package/long
[^2]: https://github.com/protobufjs/protobuf.js/blob/918ff014efe19f3eb43195ae3d71f7aeb3fcdd73/src/field.js#L153-L157
[^3]: https://github.com/protobufjs/protobuf.js/blob/918ff014efe19f3eb43195ae3d71f7aeb3fcdd73/src/util/minimal.js#L163-L167
[^4]: https://github.com/protobufjs/protobuf.js/blob/918ff014efe19f3eb43195ae3d71f7aeb3fcdd73/lib/inquire/index.js#L4-L17
[^5]: https://docs.aspect.build/rules/aspect_rules_js/docs/pnpm/#hoisting
[^6]: https://github.com/protobufjs/protobuf.js/blob/918ff014efe19f3eb43195ae3d71f7aeb3fcdd73/lib/inquire/package.json
[^7]: #106318

Fixes: #106318
Release note: Observability pages no longer crash when they encounter zeros (e.g. a session with no memory allocated).

108764: sql/physicalplan: make more tests compatible with secondary tenants  r=maryliag,yuzefovich a=knz

Informs #76378.
Links to #108763.
Epic: CRDB-18499

108765: sql: make TestScatterResponse work with secondary tenants r=yuzefovich a=knz

First two commits from #108764
Informs #76378.
Epic: CRDB-18499

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Sean Barag <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this issue Aug 17, 2023
…108925

107863: sql: add bitwise operation for varbit with variable length r=rharding6373 a=cty123

As raised in #107821, the existing bitwise operations in CRDB follow the same standard as Postgresql that require the two operands of varbit type to have the same length. The restriction makes it easier for the implementation but harder for users. As of today, we need to apply casting and left padding to make sure the operands have the same length. Instead, it might be useful to have CRDB built-in functions that can handle this. Here I have implemented 2 basic bitwise operation functions that are tailored for varbit typed data of arbitrary length.

108525: sql: TestRandomSyntaxFunctions: skip function call and fix CREATE STATS use of span after finish r=fqazi a=fqazi

Previously, this test could depending on the concurrency,
could spawn a large number of schema telemetry jobs via
(ctdb_internal.create_sql_schema_telemetry_job).
This could lead to contention that will eventually
cause this test to time out. To address this, this
patch limits calling the telemetry job creation function.

Fixes: #108153

Release note: None

108807: server,log: report tenant names in logging output r=stevendanna,abargainier a=knz

Epic: CRDB-27982
Fixes #103406.

Prior to this change, the tenant details reported in logging output
were limited to just the tenant ID. This is because the original
tenant server initialization code only had access to the tenant
ID (provided as CLI argument).

Since recently, the tenant name is also becoming known during tenant
server initialization. This is currently done via the tenant
connector.

This commit expands on this foundation by injecting the tenant name
into the logging output as soon as it is available.

For example:

```
I230815 19:31:07.290757 922 sql/temporary_schema.go:554 ⋮ [T4,Vblah,n1] 148  found 0 temporary schemas
                                                              ^^^^^ here
```

or using JSON:
```
{"channel_numeric":0,"channel":"DEV",...,"tenant_id":4,"tenant_name":"blah",...}
                                                       ^^^^^^^^^^^^^^^^^^^^ here
```


Note: we are choosing to report the tenant name *in addition* to the
tenant ID to preserve compatibilit with automation (eg. logspy) which
needs to filter entries based on ID.


Release note (cluster virtualization): The name of the virtual
cluster (tenant), when known, is now reported in logging events.
Refer to the documentation of individual logging format for details.

108823: row: remove no longer nil txn check r=yuzefovich a=yuzefovich

I believe this non-nil txn check is no longer needed as of 8f7f2f4 (which fixed the only known place where we forgot to set the txn field on the FlowCtx).

Epic: None

Release note: None

108858: *: bump some core dependencies r=knz a=rickystewart

Unfortunately we're still getting `nogo` crashes. Try bumping these
dependencies even further.

Informs #99988.

Epic: none
Release note: None

108866: storage: Write sstables in v4 format for shared ingestions r=RaduBerinde a=itsbilal

We require sstables to be written with obsolete bits for shared sstable ingestion to work correctly. This change moves to writing sstables with the new format if the cluster version is high enough.

Unblocks #107394.

Epic: none

Release note: None

108886: kvstreamer, row: adjust all tests to work with the test tenant r=yuzefovich a=yuzefovich

Epic: CRDB-18499
Informs #76378

Release note: None

108924: logictest: avoid FATAL during cleanup r=rafiss a=rafiss

This prevents the test from failing if there is an error closing a connection at the end of the test. There is no strong need for us to assert that closing the connection worked.

fixes #108833
Release note: None

108925: streamingest: speed-up c2c cutover r=lidorcarmel a=lidorcarmel

Each c2c processor checks the job info table every 30 seconds to see whether a cutover has started. Normally a cutover should take about 10 seconds. This pr changes the default to 10 seconds, to reduce the total time for cutting over.

Epic: none

Release note: None

Co-authored-by: cty123 <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
craig bot pushed a commit that referenced this issue Aug 25, 2023
108597: metrics: assign histogram metric type on histogram construction r=ericharmeling a=ericharmeling

This commit assigns prometheusgo.MetricType_HISTOGRAM to the
Metadata.MetricType on histogram construction.

Before this change, GetMetadata() was returning the
Metadata.MetricType zero value (prometheusgo.MetricType_COUNTER)
for all histograms that did not explicitly specify the
prometheusgo.MetricType_HISTOGRAM for Metadata.MetricType in
their Metadata definitions. This prevented checks on histogram
Metadata.MetricType from properly evaluating the metrics as
histograms.

Fixes #106448.
Fixes #107701.

Releaes note: None

109345: changefeedccl: refactor kvfeed startup in changeaggregator processor  r=miretskiy a=jayshrivastava

changefeedccl: refactor kvfeed startup in changeaggregator processor
This change cleans up the code used to start up the kv feed
in change aggregator processors. This change removes uncessessary code,
adds a better API, and makes code easier to reason about.

Informs: #96953
Release note: None
Epic: None

109386: sql: adjust many tests to work with test tenant r=yuzefovich a=yuzefovich

Epic: CRDB-18499
Informs #76378

Release note: None

109476: dev: make `dev test --count 1` invalidate cached test results r=dt a=rickystewart

This matches the behavior of `go test`.

Epic: none
Release note: None

109506: changefeedccl: ensure rangefeed setting is enabled in tests r=miretskiy a=jayshrivastava

Previously, many tests which create rangefeeds would not explicitly set the `kv.rangefeed.enabled` setting to be true. These tests would still work because, by default, rangefeeds are enabled via span configs. However, it was observed that span configs are not immediately applied when range splits occur. This would cause the testing rangefeed reader to encounter errors and/or timeout on very rare occasions. See #109306 (comment) for more info.

This change updates these tests to set the `kv.rangefeed.enabled` cluster setting to be true, which removes the dependency on span configs.

Closes: #109306
Epic: None
Release note: None

109511: concurrency: use generic lists in the lock table r=nvanbenschoten a=arulajmani


Now that cda4fa2 has landed, we can make use of generic lists in a few places in the lock table.

Epic: none
Release note: None

Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Jayant Shrivastava <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
craig bot pushed a commit that referenced this issue Aug 27, 2023
109329: *: make more tests compatible with cluster virtualization r=yuzefovich a=knz

Epic: CRDB-18499
Informs #76378.
Prerequisite to #108762.

(I manually tested this works with #108762)

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@knz knz added T-multitenant Issues owned by the multi-tenant virtual team and removed T-testeng TestEng Team C-test-failure Broken test (automatically or manually discovered). labels Sep 13, 2023
craig bot pushed a commit that referenced this issue Sep 14, 2023
110518: row: adjust TestRowFetcherMVCCMetadata to work with secondary tenants r=yuzefovich a=yuzefovich

The problem was that we previously iterated over the whole key space (beyond the tenant's boundaries) which then we couldn't correctly decode using the row fetcher.

Fixes: #109396.
Informs: #76378.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
stevendanna added a commit to stevendanna/cockroach that referenced this issue Oct 11, 2023
Almost all streamingccl manage tenants explicitly. In a few cases the
test technically works in a secondary tenant, but I haven't gone out
of my way to find more that might.

Informs cockroachdb#76378

Release note: None
craig bot pushed a commit that referenced this issue Oct 16, 2023
112165: streamingccl: remove uses of base.TODOTestTenantDisabled r=msbutler a=stevendanna

Almost all streamingccl manage tenants explicitly. In a few cases the test technically works in a secondary tenant, but I haven't gone out of my way to find more that might.

Informs #76378

Release note: None

Co-authored-by: Steven Danna <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy C-investigation Further steps needed to qualify. C-label will change. db-cy-23 T-multitenant Issues owned by the multi-tenant virtual team
Projects
None yet
Development

No branches or pull requests

6 participants