Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc: pgwire-compatible query cancellation #75870

Merged
merged 1 commit into from
Feb 9, 2022

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Feb 2, 2022

refs #41335

Release note: None

@rafiss rafiss requested review from knz, ajwerner, petermattis and a team February 2, 2022 16:59
@rafiss rafiss requested a review from a team as a code owner February 2, 2022 16:59
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @petermattis, and @rafiss)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 16 at r1 (raw file):

## Summary

The Postgres query cancel protocol is hard to implement since it only uses 64

I think the summary needs to start with a sentence that says what the query cancel protocol is and why it's important.


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 19 at r1 (raw file):

bits of data as an identifier, and is sent over a separate (unauthenticated)
connection, different from the SQL connection. For dedicated clusters, these 64
bits of data need to identify a node and query to cancel. We can use 32 bits to

node and session. The query is implicit (current query).

Ben says we should think about the query, but I disagree. This is explained here:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20220122_cancel_rfc.md#target-the-query-in-the-request-not-the-session


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 91 at r1 (raw file):

#### SQL Node Changes

The connExecutor will be updated to generate a random 32-bit integer (secretID)

Technically, we only need a secretID per connExec that's associated with a client SQL session, not for those used inside InternalExecutor.


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 97 at r1 (raw file):

BackendKeyData that is sent back to the client.

The status server will have a new endpoint named PGWireCancelQuery, analogous to

I'd call it CancelQueryByKey, no need to say "pg" in there.


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 108 at r1 (raw file):

cancellation requests. If a cancel request fails, which is likely to only happen
if an attacker is spamming requests, it will be penalized by holding onto the
semaphore for an extra second. If we set the concurrency of the semaphore to 256

You have not explained what this semaphore is yet at this point.

Also, as explained in the PR review already, I think the semaphore should be at the ingress, on the first sql server that receives the cancel request over pgwire, not in the target handler of the RPC; otherwise an attacker can flood the pod-to-pod network with bogus requests.

See https://github.com/cockroachdb/cockroach/pull/75211/files#diff-761942b79b747d28c579a01c4862c3a8300605c8be65b2a6e52709760849abc9R32


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):

When a proxy sees a CancelRequest, it first extracts the IP address component of
the message. If needed, it will forward the request to the proxy with that
address using an RPC that is only for proxy-proxy communication. This RPC

This proxy-to-proxy communication is new. The RFC should highlight this, and contain a sub-section that explains how proxies will authenticate themselves to each other.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed this RFC at it matches up with my understanding of what was discussed elsewhere. I'll leave the detailed review to others.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 94 at r1 (raw file):

when it is initialized, and register it with the server’s
[SessionRegistry](https://github.com/cockroachdb/cockroach/blob/a434c8418c36dbeb64e73588bcd4dd5b248c3238/pkg/sql/conn_executor.go#L1692).
The 32-bit SQLInstanceID and the random 32 bit secretID will be used as the

In a multi-tenant setting, SQLInstanceID should never be more than 15 bits. We need this to be true for unique_rowid to remain sound. We have an outstanding problem whereby we use the node ID as the instance ID indiscriminately. We potentially ought to change the dedicated servers to utilize the same leasing protocol used in multi-tenant settings for unique_rowid() and potentially this as well. If we do so, we can have a bunch more bits for the secret, which will make the argument more compelling. I suggest we leave that as future work, because it's a lot to stake on this project right now, but it doesn't seem too far off.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 16 at r1 (raw file):

Previously, knz (kena) wrote…

I think the summary needs to start with a sentence that says what the query cancel protocol is and why it's important.

good point


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 19 at r1 (raw file):

Previously, knz (kena) wrote…

node and session. The query is implicit (current query).

Ben says we should think about the query, but I disagree. This is explained here:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20220122_cancel_rfc.md#target-the-query-in-the-request-not-the-session

nice catch


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 91 at r1 (raw file):

Previously, knz (kena) wrote…

Technically, we only need a secretID per connExec that's associated with a client SQL session, not for those used inside InternalExecutor.

true


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 94 at r1 (raw file):

Previously, ajwerner wrote…

In a multi-tenant setting, SQLInstanceID should never be more than 15 bits. We need this to be true for unique_rowid to remain sound. We have an outstanding problem whereby we use the node ID as the instance ID indiscriminately. We potentially ought to change the dedicated servers to utilize the same leasing protocol used in multi-tenant settings for unique_rowid() and potentially this as well. If we do so, we can have a bunch more bits for the secret, which will make the argument more compelling. I suggest we leave that as future work, because it's a lot to stake on this project right now, but it doesn't seem too far off.

i think we can still use more bits in basically all cases, without needing to do that future work. Thanks to @knz for the idea back in his initial review of my PR from 6 months ago

I'm including something like this in the next revision of my original PR

const (
	lower32BitsMask uint64 = 0x00000000FFFFFFFF
	lower52BitsMask uint64 = 0x000FFFFFFFFFFFFF
	leadingBitMask         = 1 << 63
)

func MakeBackendKeyData(rng *rand.Rand, sqlInstanceID base.SQLInstanceID) BackendKeyData {
	ret := rng.Uint64()
	if sqlInstanceID < 1<<11 {
		// Only keep the lower 52 bits
		ret = ret & lower52BitsMask
		// Set the upper 12 bits based on the sqlInstanceID.
		ret = ret | (uint64(sqlInstanceID) << 52)
	} else {
		// Only keep the lower 32 bits.
		ret = ret & lower32BitsMask
		// Set the leading bit.
		ret = ret | leadingBitMask
		// Set the other upper 31 bits based on the sqlInstanceID.
		ret = ret | (uint64(sqlInstanceID) << 32)
	}
	return BackendKeyData(ret)
}

docs/RFCS/20220202_pgwire_compatible_cancel.md, line 97 at r1 (raw file):

Previously, knz (kena) wrote…

I'd call it CancelQueryByKey, no need to say "pg" in there.

donce


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 108 at r1 (raw file):

Previously, knz (kena) wrote…

You have not explained what this semaphore is yet at this point.

Also, as explained in the PR review already, I think the semaphore should be at the ingress, on the first sql server that receives the cancel request over pgwire, not in the target handler of the RPC; otherwise an attacker can flood the pod-to-pod network with bogus requests.

See https://github.com/cockroachdb/cockroach/pull/75211/files#diff-761942b79b747d28c579a01c4862c3a8300605c8be65b2a6e52709760849abc9R32

definitely agree about adding the semaphore at the ingress. i'm already incorporating that in the next revision of my original PR, which i'll publish soon

will update RFC


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):

Previously, knz (kena) wrote…

This proxy-to-proxy communication is new. The RFC should highlight this, and contain a sub-section that explains how proxies will authenticate themselves to each other.

will add a proxy-proxy communication highlight.

i'm not sure how proxies will authenticate themselves to each other. would it be sufficient to say that they don't need to if they are all in the same private subnet? (proxies aren't exposed to the internet, since they are behind a cloud load balancer)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rafiss)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):

would it be sufficient to say that they don't need to if they are all in the same private subnet?

heh, no. Defense in depth, remember? If private subnets were sufficient, we wouldn't care about using TLS throughout the CC infrastructure. And yet we do.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r1 (raw file):

Previously, knz (kena) wrote…

would it be sufficient to say that they don't need to if they are all in the same private subnet?

heh, no. Defense in depth, remember? If private subnets were sufficient, we wouldn't care about using TLS throughout the CC infrastructure. And yet we do.

got it, added more details now

@rafiss rafiss requested a review from a team February 8, 2022 22:42
Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

things here make sense, must've been a lot of discussions :)

i think a few diagrams here would make it easier to digest.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rafiss)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 11 at r3 (raw file):

The proposal here is entirely dependent on many thoughtful discussions and prior
work with Jordan Lewis, knz, Andrew Werner, Andy Kimball, Peter Mattis, Ben

haha that's not a full name :')


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 48 at r3 (raw file):

Nearly all Postgres drivers support the Postgres query cancellation protocol.
For example, the PGJDBC
[setQueryTimeout](https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/jdbc/PgStatement.html#setQueryTimeout-int-)

nit: if you want to avoid these long lines you could use the markdown thing where you put links down at the bottom.


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 101 at r3 (raw file):

set to the SQLInstanceID. Note that SQLInstanceIDs are always positive, so it's
safe to use the leading bit in this way. This BackendKeyData is sent back to the
client.

it would be great if we could have a visualisation of how the 64 bit space is used (but i am a very visual person so up to you)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):

Since this endpoint is unauthenticated, before performing any business logic, it
will use a semaphore to guard the number of concurrent cancellation requests. If

are we penalised if the query just succeeded before the CANCEL QUERY request?
not saying it's too important, but i can imagine a crazy-ish case where say there's a blocked query, we issue 3000 cancels to all existing queries, blocked query gets cancelled, everything else succeeds but we still gotta wait 3000/concurrency seconds. very hypothetical and not important, just curious.


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r3 (raw file):

to the server.

When a proxy sees a BackendKeyData, it will generate a new random 32-bit secret

it would be great to get a sequence diagram of the protocol here

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: with nit about clarifying semaphore semantics.

I'll let oliver finish iterating with you.

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz, @otan, and @rafiss)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):

Previously, otan (Oliver Tan) wrote…

are we penalised if the query just succeeded before the CANCEL QUERY request?
not saying it's too important, but i can imagine a crazy-ish case where say there's a blocked query, we issue 3000 cancels to all existing queries, blocked query gets cancelled, everything else succeeds but we still gotta wait 3000/concurrency seconds. very hypothetical and not important, just curious.

good point. I see in the code it's a TryAcquire, not Acquire, but that's worth explaining here.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @otan)


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 11 at r3 (raw file):

Previously, otan (Oliver Tan) wrote…

haha that's not a full name :')

:D


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 48 at r3 (raw file):

Previously, otan (Oliver Tan) wrote…

nit: if you want to avoid these long lines you could use the markdown thing where you put links down at the bottom.

eh i'm not a huge fan of that actually


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 101 at r3 (raw file):

Previously, otan (Oliver Tan) wrote…

it would be great if we could have a visualisation of how the 64 bit space is used (but i am a very visual person so up to you)

nice idea


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 113 at r3 (raw file):

Previously, knz (kena) wrote…

good point. I see in the code it's a TryAcquire, not Acquire, but that's worth explaining here.

adding more details


docs/RFCS/20220202_pgwire_compatible_cancel.md, line 138 at r3 (raw file):

Previously, otan (Oliver Tan) wrote…

it would be great to get a sequence diagram of the protocol here

will try to add something. let's see how my text diagram skills are

@rafiss
Copy link
Collaborator Author

rafiss commented Feb 9, 2022

thanks for the reviews all. I've added a few diagrams as suggested.

bors r=otan,knz

@craig
Copy link
Contributor

craig bot commented Feb 9, 2022

Build succeeded:

@craig craig bot merged commit c7446c5 into cockroachdb:master Feb 9, 2022
@rafiss rafiss deleted the rfc-pgwire-cancel branch February 10, 2022 03:33
craig bot pushed a commit that referenced this pull request Feb 10, 2022
67501: sql/pgwire: implement pgwire query cancellation  r=knz a=rafiss

closes #41335

See individual commits for more details.

This adds a BackendKeyData struct which identifies a session and SQLInstance
running that session. Clients send it to cancel a request, and it's handled by a new
CancelQueryByBackendKeyData endpoint in the status server. The cancellation
gets forwarded to the correct node, and there is a per-node rate limit to
prevent a DoS attack.

Also, refer to the RFC: #75870

This PR only adds support for non-multitenant clusters. Refer to the RFC for additional
work needed for serverless/multitenant support.


Co-authored-by: Rafi Shamim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants