Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: add timeouts to node status and node ls #20308

Merged
merged 7 commits into from
Nov 30, 2017

Conversation

tbg
Copy link
Member

@tbg tbg commented Nov 28, 2017

GRPC in its recent versions has become quite noisy and in particular does
not play well with context cancellation.

Mute it in all client commands except when actually starting a server.

Refactor the cli package to avoid the previously common mixing of stoppers
and contexts. Nobody really needs the stopper usually (and if so, they
could make their own); it is simpler to just have a context and cancel it
when done.

Touches #16489 (by making it more straightforward to use timeouts).

Fixes #20308.

@tbg tbg requested a review from a team as a code owner November 28, 2017 19:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg force-pushed the cli/grpc-cleanup branch 2 times, most recently from 1ad7f8f to 57d229b Compare November 28, 2017 20:48
@tbg tbg changed the title cli: remove spurious GRPC warnings cli: add timeouts to node status and node ls Nov 28, 2017
@nvanbenschoten
Copy link
Member

Very nice cleanup getting rid of the stopper mess!


Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 7 of 7 files at r3, 6 of 6 files at r4.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/cli/error.go, line 67 at r4 (raw file):

		case *net.OpError:
			return connDropped()
		default:

nit: no need for the default cases here and below. I'd remove them from all three switches and move the return err out.


pkg/cli/init.go, line 60 at r3 (raw file):

}

func getInitClient() (serverpb.InitClient, *stop.Stopper, error) {

why delete this one but leave getStatusClient?


pkg/cli/start.go, line 949 at r3 (raw file):

	}
	stopper.AddCloser(stop.CloserFn(func() {
		_ = conn.Close()

How was this handled before? Did we just never close the connection?


pkg/cli/start.go, line 969 at r3 (raw file):

}

func stopperContext(stopper *stop.Stopper) context.Context {

Do we still need this function?


pkg/util/grpcutil/log.go, line 33 at r2 (raw file):

)

var discardLogger = grpclog.NewLoggerV2(ioutil.Discard, ioutil.Discard, ioutil.Discard)

I'd expect that we would at least want to print errors to stderr outside of starting a cluster. When you say that gRPC has become noisy, do you mean that it's printing spurious errors or warnings?


Comments from Reviewable

@tbg
Copy link
Member Author

tbg commented Nov 29, 2017

Review status: 1 of 17 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


pkg/cli/error.go, line 67 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: no need for the default cases here and below. I'd remove them from all three switches and move the return err out.

Done.


pkg/cli/init.go, line 60 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

why delete this one but leave getStatusClient?

It had two callers, but just inlined it there. Removed.


pkg/cli/start.go, line 949 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How was this handled before? Did we just never close the connection?

That's what it seems like to me.


pkg/cli/start.go, line 969 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we still need this function?

Removed the remaining callers and put this method to sleep.


pkg/util/grpcutil/log.go, line 33 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'd expect that we would at least want to print errors to stderr outside of starting a cluster. When you say that gRPC has become noisy, do you mean that it's printing spurious errors or warnings?

Yeah, it prints a bunch of warnings. Some of this is because using context cancellation tears down everything eagerly, but I don't think it's just that. How do you like the latest iteration?


Comments from Reviewable

@nvanbenschoten
Copy link
Member

:lgtm:


Reviewed 1 of 2 files at r6, 6 of 7 files at r7, 6 of 6 files at r8, 2 of 2 files at r9, 5 of 5 files at r10.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


pkg/util/grpcutil/log.go, line 57 at r10 (raw file):

type logger log.Severity

func (severity *logger) Info(args ...interface{}) {

Do these receivers need to be pointers?


Comments from Reviewable

GRPC in its recent versions has become quite noisy and in particular does not play well
with context cancellation.

Mute it in all client commands except when actually starting a server.
Refactor the cli package to avoid the previously common mixing of stoppers and
contexts. Nobody really needs the stopper usually (and if so, they could make
their own); it is simpler to just have a context and cancel it when done.

Touches cockroachdb#16489 (by making it more straightforward to use timeouts).
Defaults to no timeout.

Both of these commands internally do a KV scan, which
may never finish on a sufficiently broken cluster.

Release note (cli change): the `node status` and `node ls` commands now support a timeout.

Updates cockroachdb#16489.
This avoids muddling up client output with noisy GRPC connection chatter,
but lease some amount of logging in place for hard errors (which I have
never seen in practice).
I refactored this to rely solely on context cancellation in
a previous commit. Unfortunately, we run cli tools multiple
times in the same process during tests, and we reset globals
that are picked up by the RPCContext used by open connections,
which results in data races.

Return a closure from getClientGRPCConn which waits for the
Stopper to stop to circumvent this.
@tbg
Copy link
Member Author

tbg commented Nov 30, 2017

Had to unfortunately reintroduce waiting for the stopper, see latest commit.


Review status: 1 of 18 files reviewed at latest revision, 2 unresolved discussions.


pkg/util/grpcutil/log.go, line 57 at r10 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do these receivers need to be pointers?

Yeah, see the comment above (synthetic stack frames).


Comments from Reviewable

@tbg tbg merged commit 602df28 into cockroachdb:master Nov 30, 2017
@tbg tbg deleted the cli/grpc-cleanup branch November 30, 2017 15:31
@tbg
Copy link
Member Author

tbg commented Nov 30, 2017

TFTR!

knz added a commit to knz/cockroach that referenced this pull request Oct 20, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 28, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 28, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 31, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
craig bot pushed a commit that referenced this pull request Oct 31, 2022
90176: cli/start: unify code between `cockroach start` and `cockroach mt start-sql` r=andreimatei a=knz

Fixes #89974.
Fixes #90831.
Fixes #90524.

This PR merges the server initialization code between `cockroach start` and `cockroach mt start-sql`.

In doing so, it brings `cockroach mt start-sql` closer to what we expect from proper CockroachDB server processes:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from #4754)

- it adds a tracing span for the startup code. (from #8712!!)

- it properly supports `--listening-url-file`. (from #15468)

- it properly does sanitization of `--external-io-dir`. (from #19725)

- it sets the proper log severity level for gRPC. (from #20308)

- it reports the command-line and env config to logs. (from #21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from #42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from #44043)

- it recovers support for `DISABLE_STARTING_BACKGROUND_JOBS`. (from #44786)

- it sets `GOMAXPROCS` from current cgroup limits. (from #57390)

- it stops the server early if the storage disk is full. (from #66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from #73876)

See the individual commit for details.



90660: sql: add contention_events to cluster_execution_insights r=j82w a=j82w

The original contention column will remain
to make query operations faster. The events
are being put into a json column because it's
possible there could be multiple blocking events
for a single statement. The json column avoids the complexity of adding another table and keeping it
in sync with the insights table.

The table can be joined with index_columns and tables to get the database name, table name, and index name that contention occurred on. This does not contain the blocking statement information, and the blocking fingerprint id.

closes: #88561

Release note (sql change): Adds contention_events
to cluster_execution_insights. This is used
to see which transaction is blocking the specific
statement.

90719: opgen: added a bool field in struct opgen.transition r=Xiang-Gu a=Xiang-Gu

This PR adds a bool field in struct opgen.transition that indicates whether it results from a `equiv(xx)` transition spec in the opgen file. It will be useful for a test where we need to find the inital status on a adding/dropping path. Without such a change, it can be problematic if we have a `equiv(xx)` spec as the first transition. E.g.
```
  ToAbsent(
    PUBLIC,
    equiv(VALIDATED),
    to(WRITE_ONLY),
    to(ABSENT),
  )
```
Without this change, the inital status will confusingly be `VALIDATED`, and the next status will be `PUBLIC`.
With this change, the initial status will be `PUBLIC`, and the next status will be `WRITE_ONLY`.

We also added some comments when we make transitions from the specs.

Epic: None
Release note: None

90865: sql: use bare name string of new pk  to compare with pk name when altering primary key r=chengxiong-ruan a=chengxiong-ruan

Fixes #90836

Release note (sql change): previously, the `DROP CONSTRAINT, ADD CONSTRAINT` in one trick to have a new primary key without moving old primary key to be a secondary index didn't work if the primary key name is a reserved SQL keyword. A `constraint already exists` error was returned. This patch fixed the bug, the trick now also works with primary key named as reserved keywords.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants