Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multitenant: allow secondary tenants to split/scatter by default #98546

Merged
merged 3 commits into from
Mar 15, 2023

Conversation

arulajmani
Copy link
Collaborator

@arulajmani arulajmani commented Mar 13, 2023

AdminSplit and AdminScatter requests are subject to capability checks.
Previously, these capabilities were codified in the "enabled" form. As
such, by default, secondary tenants did not have the ability to perform
these operations. This is in violation of what secondary tenants could
do prior to 23.1, at a time before capabilities existed. Moreover,
RESTORE/IMPORT rely on performing these operations for performance.
This made disallowing these operations by default a performance
regression.

This patch flips the phrasing of how these capabilities are stored on
the proto to use the "disable" verbiage. As such, secondary tenants are
able to perform splits and scatters by default. However, no change is
made to the public interface -- users above the tenantcapabilitiespb
package continue to interact with these capabilities as they were
before, oblivious to how these things are stored on disk.

There's a few testing changes here:

  • As part of this change, we also clean up a testing knob that was used
    by various backup, CDC, and logictests to override capability checks in
    the authorizer. This isn't required with the new default behaviour.
  • We also add some missing E2E tests for the CanAdminUnsplit capability
    which were missing when it was introduced.

Fixes #96736

Release note: None

@arulajmani arulajmani requested review from knz and ecwall March 13, 2023 23:51
@arulajmani arulajmani requested review from a team as code owners March 13, 2023 23:51
@arulajmani arulajmani requested review from benbardin and rharding6373 and removed request for a team March 13, 2023 23:51
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simplification here.

However, I'd like to request one more step. It's good that we inverted the meaning in the protobuf, but IMHO the public interface should remain unchanged. See my comments below.

Reviewed 36 of 36 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @benbardin, @ecwall, and @rharding6373)


pkg/multitenant/tenantcapabilities/capabilities.go line 165 at r1 (raw file):

	_ CapabilityID = iota

	// DisableAdminScatter disallows a secondary tenant from being able to scatter

I recommend you undo the changes to this file. The original code is easier to read and understand.


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 85 at r1 (raw file):

			continue
		}
		switch request.Method() {

You can revert this change too. Let's keep all the checks under one conditional.


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 152 at r1 (raw file):

	// The following are authorized via specific capabilities.
	kvpb.AdminScatter: tenantcapabilities.DisableAdminScatter,
	kvpb.AdminSplit:   tenantcapabilities.DisableAdminSplit,

ditto


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 56 at r1 (raw file):

	switch capabilityID {
	case tenantcapabilities.DisableAdminScatter:
		return boolCap{&t.DisableAdminScatter}
case tenantcapabilities.CanAdminScatter:
   return invertedBoolCap{&t.DisableAdminScatter}

pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 58 at r1 (raw file):

		return boolCap{&t.DisableAdminScatter}
	case tenantcapabilities.DisableAdminSplit:
		return boolCap{&t.DisableAdminSplit}

ditto


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 76 at r1 (raw file):

	switch capabilityID {
	case tenantcapabilities.DisableAdminScatter:
		return t.DisableAdminScatter
case tenantcapabilities.CanAdminScatter:
   return !t.DisableAdminScatter

ditto below


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.proto line 34 at r1 (raw file):

  // DisableAdminSplit, if set to true, revokes the tenants ability to
  // successfully perform `AdminSplit` requests.

I recommend you group the two disable fields close to each other in the proto definition then add a comment that explains why they are different from the rest. We must teach people coming after us that any new cap should be added at the end with a default-not-granted definition.

@knz knz added the A-multitenancy Related to multi-tenancy label Mar 14, 2023
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I quite like how this ended up looking after going with the invertedBoolField thing. RFAL.

Dismissed @knz from 5 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, @knz, and @rharding6373)


pkg/multitenant/tenantcapabilities/capabilities.go line 165 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I recommend you undo the changes to this file. The original code is easier to read and understand.

Done, but I kept some parts of the commentary.


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 85 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

You can revert this change too. Let's keep all the checks under one conditional.

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 152 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

ditto

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 56 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…
case tenantcapabilities.CanAdminScatter:
   return invertedBoolCap{&t.DisableAdminScatter}

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 58 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

ditto

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.go line 76 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…
case tenantcapabilities.CanAdminScatter:
   return !t.DisableAdminScatter

ditto below

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitiespb/capabilities.proto line 34 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I recommend you group the two disable fields close to each other in the proto definition then add a comment that explains why they are different from the rest. We must teach people coming after us that any new cap should be added at the end with a default-not-granted definition.

Done, and added commentary explaining the motivation behind the "disabled" verbiage.

Copy link
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @benbardin, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 456 at r2 (raw file):

				result: [][]string{{"\xf0\x89\x89", "/1", maxTimestamp}},
			},
			secondary: tenantExpected{

These tenantExpected blocks should not be removed since they are verifying output for certain tenant setups.


pkg/sql/multitenant_admin_function_test.go line 466 at r2 (raw file):

			},
			queryClusterSetting: sql.SecondaryTenantSplitAtEnabled,
			queryCapability:     tenantcapabilities.CanAdminSplit,

Change queryCapability to type capability struct { id CapabilityID, value bool } and set value: false for the split and scatter tests.


pkg/sql/multitenant_admin_function_test.go line 501 at r2 (raw file):

			},
			setupClusterSetting: sql.SecondaryTenantSplitAtEnabled,
			setupCapability:     tenantcapabilities.CanAdminSplit,

The setupCapability field can be removed from type testCase struct now because it was simulating split being enabled by default.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 456 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

These tenantExpected blocks should not be removed since they are verifying output for certain tenant setups.

All these test cases that have been removed have functionally been moved to the datadriven tests in pkg/ccl/multitenantccl/tenantcapabilitiesccl. Now that we have that framework, it's a much better place to test stuff like this.


pkg/sql/multitenant_admin_function_test.go line 466 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

Change queryCapability to type capability struct { id CapabilityID, value bool } and set value: false for the split and scatter tests.

Why?

Copy link
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @benbardin, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 456 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

All these test cases that have been removed have functionally been moved to the datadriven tests in pkg/ccl/multitenantccl/tenantcapabilitiesccl. Now that we have that framework, it's a much better place to test stuff like this.

Can we discuss this change first? It looks like we are replacing data driven testing with a DSL that requires a lot of extra code.


pkg/sql/multitenant_admin_function_test.go line 466 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Why?

So that you can make the tests pass without removing test coverage here.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 456 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

Can we discuss this change first? It looks like we are replacing data driven testing with a DSL that requires a lot of extra code.

I tried making changes to this test for a little bit before I gave up because of how cumbersome it was. As this code churns and stuff changes, the maintainability burden of this thing is high. I've personally felt it on 2 different occasions, so I finally decided to bite the bullet and move some of the things that were in my way to the datadriven tests. Those are much earier to read and easier to change/write as well.

I'm happy to get @knz to sign off on these changes if you're not happy with them. I do want to get this thing merged soon given it's a release blocker and overall this change is positive.


pkg/sql/multitenant_admin_function_test.go line 466 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

So that you can make the tests pass without removing test coverage here.

I see. Given we haven't lost any test coverage because things have merely moved, I think this is fine.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed elsewhere - either don't change the existing test framework structure (and follow established patterns) OR ensure this is done in a separate commit for clarity.

Reviewed 18 of 18 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @benbardin, @ecwall, and @rharding6373)


-- commits line 27 at r2:
nit cumbersome

@arulajmani arulajmani force-pushed the flip-admin-split-cap branch 3 times, most recently from 2217d22 to aa06d6f Compare March 14, 2023 15:03
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled out the test movement changes into a separate commit, like we spoke about in other places.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 501 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

The setupCapability field can be removed from type testCase struct now because it was simulating split being enabled by default.

We still make use of it, so it can't be removed.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, @knz, and @rharding6373)


pkg/sql/multitenant_admin_function_test.go line 501 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

We still make use of it, so it can't be removed.

My comment no longer applies after the changes to the Authorizer. To add more words there, earlier, we couldn't remove it because we still needed for capabilities to propagate before making these changes -- setupCapability was providing us that mechanism.

With the Authorizer changes, I tacked on a 3rd commit to clean this stuff up. Thanks for the pointer.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like the 2nd commit. It's very good.

I am going to spend a minute more looking at the 1st and last commit. While I do this, please also contribute your thinking (with a concrete example) to #98483.

Reviewed 33 of 33 files at r3, 34 of 35 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @ecwall, and @rharding6373)

@knz knz mentioned this pull request Mar 14, 2023
@knz
Copy link
Contributor

knz commented Mar 14, 2023

I would like us to move this PR forward with the following change merged into it arulajmani#5 (will need a squash)

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani, @benbardin, @ecwall, and @knz)


pkg/ccl/multitenantccl/tenantcapabilitiesccl/testdata/can_admin_unsplit line 24 at r3 (raw file):

ALTER TABLE t UNSPLIT AT VALUES (0)
----
pq: could not UNSPLIT AT (0): key /Tenant/10/Table/104/1/0 is not the start of a range

Could you also add a test that does a successful UNSPLIT?

knz and others added 3 commits March 14, 2023 19:57
Now that we have a datadriven framework available to exercise tenant
capabilities, we can convert some tests in
`TestMultiTenantAdminFunction` to make use of this. This patch
adds SPLIT/SCATTER related tests to make use of the datadriven
test framework.

As is, the construction is cumbersome to work with. The move is done
with the next commit in mind, where we will change the default behavior
of split/scatter capabilities.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
AdminSplit and AdminScatter requests are subject to capability checks.
Previously, these capabilities were codified in the "enabled" form. As
such, by default, secondary tenants did not have the ability to perform
these operations. This is in violation of what secondary tenants could
do prior to 23.1, at a time before capabilities existed. Moreover,
RESTORE/IMPORT rely on performing these operations for performance.
This made disallowing these operations by default a performance
regression.

This patch flips the phrasing of how these capabilities are stored on
the proto to use the "disable" verbiage. As such, secondary tenants are
able to perform splits and scatters by default. However, no change is
made to the public interface -- users above the `tenantcapabilitiespb`
package continue to interact with these capabilities as they were
before, oblivious to how these things are stored on disk.

As part of this patch, we also clean up a testing knob that was used by
various backup, CDC, and logictests to override capability checks in the
Authorizer. We no longer need this with the new defaults.

Fixes cockroachdb#96736

Release note: None
@knz
Copy link
Contributor

knz commented Mar 14, 2023

We will now confirm this passes CI then we can merge the PR. Thanks!

@knz
Copy link
Contributor

knz commented Mar 14, 2023

thanks to all for sharing the work on this!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 15, 2023

Build succeeded:

@craig craig bot merged commit 1bb2934 into cockroachdb:master Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
5 participants