Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli,storage: add emergency ballast #66893

Merged
merged 1 commit into from
Aug 19, 2021
Merged

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Jun 25, 2021

Add an automatically created, on-by-default emergency ballast file. This
new ballast defaults to the minimum of 1% total disk capacity or 1GiB.
The size of the ballast may be configured via the --store flag with a
ballast-size field, accepting the same value formats as the size
field.

The ballast is automatically created when either available disk space is
at least four times the ballast size, or when available disk space after
creating the ballast is at least 10 GiB. Creation of the ballast happens
either when the engine is opened or during the periodic Capacity
calculations driven by the kvserver.Store.

During node start, if available disk space is less than or equal to half
the ballast size, exit immediately with a new Disk Full (10) exit code.

See #66493.

Release note (ops change): Add an automatically created, on by default
emergency ballast file. This new ballast defaults to the minimum of 1%
total disk capacity or 1GiB. The size of the ballast may be configured
via the --store flag with a ballast-size field, accepting the same
value formats as the size field. Also, add a new Disk Full (10) exit
code that indicates that the node exited because disk space on at least
one store is exhausted. On node start, if any store has less than half
the ballast's size bytes available, the node immediately exits with the
Disk Full (10) exit code. The operator may manually remove the
configured ballast (assuming they haven't already) to allow the node to
start, and they can take action to remedy the disk space exhaustion. The
ballast will automatically be recreated when available disk space is 4x
the ballast size, or at least 10 GiB is available after the ballast is
created.

@jbowens jbowens requested a review from a team as a code owner June 25, 2021 18:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 278105f to 5b4e386 Compare June 25, 2021 18:56
@jbowens jbowens requested review from joshimhoff and tbg June 25, 2021 19:48
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this up Jackson! This looks good to me (I did have some questions and UX comments inline), but I didn't do a nitty-gritty line-by-line review. I don't think Josh will do that either; mind getting a reviewer from storage as well for that?

Reviewed 7 of 7 files at r1, 2 of 2 files at r2, 10 of 10 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens and @joshimhoff)


pkg/base/store_spec.go, line 329 at r3 (raw file):

		case "ballast-size":
			var minBytesAllowed int64
			var minPercent float64 = 1

aren't 0.1% etc also allowed?


pkg/cli/start.go, line 341 at r3 (raw file):

			return err
		} else if isDiskFull {
			err := errors.Errorf("store %s: out of disk space", spec.Path)

nit: you could collect all of the out-of-disk specs across stores, in case multiple are out of space. You could also hint people at the existence of the ballast file or, in the future, at docs on the topic.


pkg/storage/ballast.go, line 36 at r3 (raw file):

	desiredSizeBytes := BallastSizeBytes(spec, diskUsage)

	ballastPath := base.EmergencyBallastFile(fs.PathJoin, spec.Path)

are these all stable under different ways of starting the same node? For example, I might start it with --store=./data one day, and with --store=/foo/bar/data the next.

We should also note on EmergencyBallastFile that the constants within cannot be changed, or existing ballast files will start getting included in disk usage & new ballast files will be created. More things that would be useful:

  1. give devs a way to opt-out of ballast files via an env var. I don't want a 1GB file to be created every time I ./cockroach start-single- node.
  2. similarly, need to make sure that cockroach demo doesn't ever do ballast files (I think that's always in-mem, so probably already true)
  3. it might be worth putting a .txt file next to EmergencyBallastFile that has some information about ballast and how to disable it if undesirable.

pkg/storage/ballast_test.go, line 218 at r3 (raw file):

		path, cleanup := setup(t, 4096)
		defer cleanup()
		resized, err := maybeEstablishBallast(vfs.Default, path, 4096, vfs.DiskUsage{

From your experience, are disk capacities static? For example, if a disk reports a capacity of 10TB + rand(0,100) bytes, will that lead to the ballast file being recreated ~every time (which is probably mildly expensive, or very expensive when the file actually has to be written wholesale)?

Copy link
Collaborator

@joshimhoff joshimhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also thanks for putting me on the PR. It's really nice to work with y'all directly on solving various problems that have an operational angle / affect oncall life in a big way. That is, DEVOPS (in the deep platonic sense (no buzzwords allowed)).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens)


pkg/storage/ballast.go, line 25 at r3 (raw file):

// IsDiskFull examines the store indicated by spec, determining whether the
// store's underlying disk is out of disk space. A disk is considered to be
// full if available capacity is less than half of the store's ballast size.

Is one of the goals of this PR that after a single crash, all subsequent crashes will have the out of disk exit code, as CRDB will determine that the disk is full by calling this function? What is our confidence that in an out of disk scenario, we will exit with the new code repeatedly? In asking this Q, I am thinking about the possibly of an attempt to write to some log file failing before this check is made, as has happened in CC land (as we use datadir disk for logging currently). Can you say why our confidence is what it is for my learning also? I can imagine deleting the ballast automatically on startup when needed, but continuing to crash as the available disk is ~equal to the ballast, but perhaps that is not necessary as we have the confidence we need already, just for reasons I don't follow yet.

Also, what about the first crash? This will happen in the <=21.1 way, that is, somewhere a write to disk will fail leading to a crash, and we can't say much about the exit code, as it depends where that happens? If that is right, do we generally expect a 100% full disk, that is, no available capacity, rather than a disk that has around 1/2 ballast worth of capacity? In a way this relates to my first Q.

@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 0a8dc02 to 95a9a3a Compare June 28, 2021 15:59
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind getting a reviewer from storage as well for that?

For sure. I think maybe we should see whether the exit code approach is workable within CC alerting, but then I'll tag storage folks for review.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @joshimhoff, and @tbg)


pkg/base/store_spec.go, line 329 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

aren't 0.1% etc also allowed?

Good call, lifted the minimum requirement altogether.


pkg/storage/ballast.go, line 25 at r3 (raw file):

Is one of the goals of this PR that after a single crash, all subsequent crashes will have the out of disk exit code, as CRDB will determine that the disk is full by calling this function? What is our confidence that in an out of disk scenario, we will exit with the new code repeatedly? In asking this Q, I am thinking about the possibly of an attempt to write to some log file failing before this check is made, as has happened in CC land (as we use datadir disk for logging currently). Can you say why our confidence is what it is for my learning also?

Yeah, we want all subsequent exits to have the disk full exit code. I'm pretty confident that we will always exit with the new code, so long as some other process isn't causing disk usage to fluctuate rapidly. The call to this function happens very early in the node start path, even before we initialize logging. Before we call IsDiskFull, we parse command flags, maybe re-execute the process in the background if requested, set a umask and set up signal handlers. That's it.

I can imagine deleting the ballast automatically on startup when needed, but continuing to crash as the available disk is ~equal to the ballast, but perhaps that is not necessary as we have the confidence we need already, just for reasons I don't follow yet.

In a non-CC setting, we want the user to be able to start up with the newly created ballast headroom and try to recover the node/cluster. Without a separate file or flag to indicate that intent, we need the user to delete the ballast so that the increased available capacity can be the signal.

Also, what about the first crash? This will happen in the <=21.1 way, that is, somewhere a write to disk will fail leading to a crash, and we can't say much about the exit code, as it depends where that happens?

I expect the vast majority of first crashes to exit with the Disk Full error code, because of a wrapper around the virtual file system used by Pebble and most filesystem writes within CockroachDB. There are definitely some codepaths that directly use the standard library os package and an ENOSPC in those codepaths potentially could fatal the process with a different exit code. As far as I know, these codepaths are just things like:

  • TLS certificate rotation
  • writing to external/nodelocal storage (eg, when writing backups to the node's local disk for collecting later; not relevant to CC)
  • writing CPU and heap profiles, goroutine dumps, debug.zips
  • unfortunately, resizing the ballast itself, for now

If that is right, do we generally expect a 100% full disk, that is, no available capacity, rather than a disk that has around 1/2 ballast worth of capacity? In a way this relates to my first Q.

We generally expect an available capacity that is just shy of 0. There's some amount of variation in the low-level block allocation requests made by the filesystem, but as far as I know, not a lot. I think there may be more variation on copy-on-write filesystems like btrfs or zfs, but we don't need to worry about that in CC.


pkg/storage/ballast.go, line 36 at r3 (raw file):

are these all stable under different ways of starting the same node? For example, I might start it with --store=./data one day, and with --store=/foo/bar/data the next.

Yeah, we convert to the store path to an absolute path very early when we construct the StoreSpec. We don't have a guarantee that the store directory will exist though, so I updated this to account for that.

  1. give devs a way to opt-out of ballast files via an env var. I don't want a 1GB file to be created every time I ./cockroach start-single- node.

Good point. I'll follow up with this in a bit.

  1. similarly, need to make sure that cockroach demo doesn't ever do ballast files (I think that's always in-mem, so probably already true)

I'll double check this.

  1. it might be worth putting a .txt file next to EmergencyBallastFile that has some information about ballast and how to disable it if undesirable.

Good call.


pkg/storage/ballast_test.go, line 218 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

From your experience, are disk capacities static? For example, if a disk reports a capacity of 10TB + rand(0,100) bytes, will that lead to the ballast file being recreated ~every time (which is probably mildly expensive, or very expensive when the file actually has to be written wholesale)?

This is a good question — I do expect it to be static, but it's worth verifying that experimentally.

@joshimhoff joshimhoff requested a review from tbg June 28, 2021 16:18
Copy link
Collaborator

@joshimhoff joshimhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens and @tbg)


pkg/storage/ballast.go, line 25 at r3 (raw file):
Ack! Thanks for explaining. I might suggest that some of above thinking should be in comments in the code, perhaps in start.go? I also wonder if we should / can system test the consistent exit code when out of disk, so we don't lose the property down the line?

I will check this out today / tomorrow in the AM; I think it will work; I just want to be paranoid about it.

For sure. I think maybe we should see whether the exit code approach is workable within CC alerting

@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 12fc9b2 to 015c171 Compare June 28, 2021 19:13
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @joshimhoff and @tbg)


pkg/cli/start.go, line 341 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

nit: you could collect all of the out-of-disk specs across stores, in case multiple are out of space. You could also hint people at the existence of the ballast file or, in the future, at docs on the topic.

good call, done.


pkg/storage/ballast.go, line 25 at r3 (raw file):
added some comments!

I will check this out today / tomorrow in the AM; I think it will work; I just want to be paranoid about it.

awesome, thank you!

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. just a few nits

Reviewed 7 of 7 files at r1, 2 of 2 files at r2, 3 of 10 files at r3, 2 of 6 files at r4, 6 of 6 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @joshimhoff, and @tbg)


pkg/base/store_spec.go, line 82 at r5 (raw file):

// SizeSpec if it is correctly parsed.
func NewSizeSpec(
	field, value string, bytesRange *intInterval, percentRange *floatInterval,

maybe make the field argument a redact.SafeString


pkg/base/store_spec.go, line 96 at r5 (raw file):

		size.Percent *= percentFactor
		if err != nil {
			return SizeSpec{}, fmt.Errorf("could not parse %s size (%s) %s", field, value, err)

then use errors.Newf here and below


pkg/base/store_spec.go, line 154 at r5 (raw file):

// pflag's value interface.
func (ss *SizeSpec) Set(value string) error {
	spec, err := NewSizeSpec("", value, nil, nil)

change that empty string to "specified" so that the error messages, if any, check out with the flag UX.


pkg/base/store_spec.go, line 194 at r5 (raw file):

// String returns a fully parsable version of the store spec.
func (ss StoreSpec) String() string {

All this would benefit from being turned into a SafeFormat.


pkg/cli/start.go, line 1037 at r5 (raw file):

	}
	ballastPathsStr := strings.Join(ballastPaths, "\n")
	err = errors.WithDetailf(err, `Deleting or truncating the ballast file(s) at

nit: might be a hint more than a detail


pkg/storage/ballast.go, line 25 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

added some comments!

I will check this out today / tomorrow in the AM; I think it will work; I just want to be paranoid about it.

awesome, thank you!

Can you also explain the above in the release note thanks. Release notes are the communication channel by which we send information to docs.

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @joshimhoff, @knz, and @tbg)


pkg/base/store_spec.go, line 82 at r5 (raw file):

Previously, knz (kena) wrote…

maybe make the field argument a redact.SafeString

Done.


pkg/base/store_spec.go, line 96 at r5 (raw file):

Previously, knz (kena) wrote…

then use errors.Newf here and below

Done.


pkg/base/store_spec.go, line 154 at r5 (raw file):

Previously, knz (kena) wrote…

change that empty string to "specified" so that the error messages, if any, check out with the flag UX.

Done.


pkg/base/store_spec.go, line 194 at r5 (raw file):

Previously, knz (kena) wrote…

All this would benefit from being turned into a SafeFormat.

Added a TODO


pkg/cli/start.go, line 1037 at r5 (raw file):

Previously, knz (kena) wrote…

nit: might be a hint more than a detail

Done.


pkg/storage/ballast.go, line 25 at r3 (raw file):

Previously, knz (kena) wrote…

Can you also explain the above in the release note thanks. Release notes are the communication channel by which we send information to docs.

Done.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 10 files at r3, 1 of 6 files at r4, 2 of 3 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick, @itsbilal, @jbowens, @joshimhoff, @knz, and @tbg)


pkg/storage/ballast.go, line 28 at r7 (raw file):

// may want to include `COCKROACH_AUTO_BALLAST=false` in their environment to
// prevent the automatic creation of large ballast files on their local
// filesystem.

Have we already taught our on-prem users to create a ballast file using https://www.cockroachlabs.com/docs/stable/cockroach-debug-ballast.html?
If yes, I assume that the default here would create another one since the existing one could have an arbitrary path that we don't know about.
I can see why we would want to default to true, and possibly all we need is sufficient release documentation -- what I worry about is their being 2 ballast files and no one noticing.


pkg/storage/ballast.go, line 64 at r7 (raw file):

	}

	// If the ballast is larger than desired, truncate it now in case the

what if we need to increase it in size? Is there a reason why that would only happen via the Capacity periodic logic and not here?


pkg/storage/ballast.go, line 89 at r7 (raw file):

// BallastSizeBytes returns the desired size of the emergency ballast,
// calculated from the provided store spec and disk usage. If the store spec
// contains an explicit ballast size (either in bytes or as a perecentage of

percentage


pkg/storage/ballast.go, line 141 at r7 (raw file):

		// If the user configured a really large ballast, we might not ever
		// have >= 4x the required amount available. Also allow extending the
		// ballast if we will have 10 GiB available after the extension.

What is the motivation for this 10GiB logic? Should we log an error instead of potentially taking up too much space? I can see why that would not be satisfactory since no one may notice. A code comment could help here.


pkg/storage/pebble.go, line 592 at r7 (raw file):

	// store's capacity is queried periodically. Only try if the store is not
	// an in-memory store. vfs.MemFS will error if you retrieve disk usage.
	if cfg.Dir != "" /* if not in-memory */ {

Is this a reliable way of detecting "not in-memory"?


pkg/storage/pebble.go, line 1027 at r7 (raw file):

	fsuTotal := int64(du.TotalBytes)
	fsuAvail := int64(du.AvailBytes)

Can you add a code comment about Capacity being called periodically.

@tbg tbg removed their request for review July 1, 2021 07:56
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick, @itsbilal, @joshimhoff, @knz, @sumeerbhola, and @tbg)


pkg/storage/ballast.go, line 28 at r7 (raw file):

Previously, sumeerbhola wrote…

Have we already taught our on-prem users to create a ballast file using https://www.cockroachlabs.com/docs/stable/cockroach-debug-ballast.html?
If yes, I assume that the default here would create another one since the existing one could have an arbitrary path that we don't know about.
I can see why we would want to default to true, and possibly all we need is sufficient release documentation -- what I worry about is their being 2 ballast files and no one noticing.

It is featured somewhat prominently on the production checklist docs. I pinged customer success to see if anyone has a sense of how widespread manually-created ballasts are. This is definitely something that needs to be well-documented in the release docs. I'm not sure what else we can do outside that.


pkg/storage/ballast.go, line 64 at r7 (raw file):

Previously, sumeerbhola wrote…

what if we need to increase it in size? Is there a reason why that would only happen via the Capacity periodic logic and not here?

There's just the complication that the data directory might not exist yet. I think it is preferable to establish the ballast here too if available capacity allows. I updated maybeEstablishBallast, adding a MkdirAll of the data directory before attempting to create/grow the ballast.


pkg/storage/ballast.go, line 89 at r7 (raw file):

Previously, sumeerbhola wrote…

percentage

Done.


pkg/storage/ballast.go, line 141 at r7 (raw file):

Previously, sumeerbhola wrote…

What is the motivation for this 10GiB logic? Should we log an error instead of potentially taking up too much space? I can see why that would not be satisfactory since no one may notice. A code comment could help here.

The 10 GiB logic exists for users that want the safety of a large ballast. Larger ballast sizes (eg, 5%, 10%) are not unreasonably large, but it's possible that available capacity will never exceed 4x the ballast sizes (eg, 20%, 40%) after recovery. That's especially true on larger clusters where you can use a higher percentage of each individual disk's capacity because the cluster as a whole still has plenty of capacity for rehoming replicas if a node is lost. Added a little more of this explanation as a code comment.


pkg/storage/pebble.go, line 592 at r7 (raw file):

Previously, sumeerbhola wrote…

Is this a reliable way of detecting "not in-memory"?

Yeah, it's the check Capacity makes, and the check (*Pebble).InMem() makes.


pkg/storage/pebble.go, line 1027 at r7 (raw file):

Previously, sumeerbhola wrote…

Can you add a code comment about Capacity being called periodically.

Done.

@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 28d8af4 to 9ae7f2b Compare July 16, 2021 20:40
@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 881da2d to cfa19be Compare August 4, 2021 16:51
@jbowens jbowens requested a review from a team as a code owner August 4, 2021 16:51
@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 8149787 to 7746b7b Compare August 12, 2021 18:30
@jbowens jbowens requested a review from a team as a code owner August 12, 2021 18:30
@jbowens jbowens force-pushed the jackson/ballast branch 2 times, most recently from 951addb to 783ed0b Compare August 13, 2021 21:56
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 10 files at r3, 1 of 6 files at r5, 9 of 14 files at r8.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @bananabrick, @itsbilal, @jbowens, @joshimhoff, @knz, @sumeerbhola, and @tbg)


pkg/cli/start.go, line 1040 at r8 (raw file):

			continue
		}
		ballastPaths = append(ballastPaths, base.EmergencyBallastFile(vfs.Default.PathJoin, spec.Path))

how about confirming that this file actually exists and classifying into two categories of errors: can or cannot be rectified by deleting the ballast file. That may be more helpful to an operator who is trying to deal with an emergency situation.


pkg/storage/ballast.go, line 58 at r8 (raw file):

	// Try to resize the ballast now, if necessary. This is necessary to
	// truncate the ballast if it a new, lower ballast size was provided, and

... if a new ...

@jbowens jbowens force-pushed the jackson/ballast branch 4 times, most recently from 6e679be to de924da Compare August 19, 2021 13:44
Add an automatically created, on-by-default emergency ballast file. This
new ballast defaults to the minimum of 1% total disk capacity or 1GiB.
The size of the ballast may be configured via the `--store` flag with a
`ballast-size` field, accepting the same value formats as the `size`
field.

The ballast is automatically created when either available disk space is
at least four times the ballast size or when available disk space after
creating the ballast is at least 10 GiB. Creation of the ballast happens
either when the engine is opened or during the periodic Capacity
calculations driven by the `kvserver.Store`.

During node start, if available disk space is less than or equal to half
the ballast size, exit immediately with a new Disk Full (10) exit code.

See cockroachdb#66493.

Release note (ops change): Add an automatically created, on by default
emergency ballast file. This new ballast defaults to the minimum of 1%
total disk capacity or 1GiB.  The size of the ballast may be configured
via the `--store` flag with a `ballast-size` field, accepting the same
value formats as the `size` field. Also, add a new Disk Full (10) exit
code that indicates that the node exited because disk space on at least
one store is exhausted. On node start, if any store has less than half
the ballast's size bytes available, the node immediately exits with the
Disk Full (10) exit code. The operator may manually remove the
configured ballast (assuming they haven't already) to allow the node to
start, and they can take action to remedy the disk space exhaustion. The
ballast will automatically be recreated when available disk space is 4x
the ballast size, or at least 10 GiB is available after the ballast is
created.
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @bananabrick, @itsbilal, @joshimhoff, @knz, @sumeerbhola, and @tbg)


pkg/cli/start.go, line 1040 at r8 (raw file):

Previously, sumeerbhola wrote…

how about confirming that this file actually exists and classifying into two categories of errors: can or cannot be rectified by deleting the ballast file. That may be more helpful to an operator who is trying to deal with an emergency situation.

Done.


pkg/storage/ballast.go, line 58 at r8 (raw file):

Previously, sumeerbhola wrote…

... if a new ...

Done.

@craig
Copy link
Contributor

craig bot commented Aug 19, 2021

Build succeeded:

@craig craig bot merged commit 24036ac into cockroachdb:master Aug 19, 2021
@jbowens jbowens deleted the jackson/ballast branch August 19, 2021 21:21
knz added a commit to knz/cockroach that referenced this pull request Oct 20, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 28, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 28, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Oct 31, 2022
Prior to this patch, the code in `runStartSQL` was using
a different sequence of steps than in `runStart`, even
for those steps that are beneficial to run in both cases.

This commit fixes that. In particular it adds the following
missing bits to `cockroach mt start-sql`:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from cockroachdb#4754)

- it adds a tracing span for the startup code. (from cockroachdb#8712)

- it properly supports `--listening-url-file`. (from cockroachdb#15468)

- it properly does sanitization of `--external-io-dir`. (from cockroachdb#19725)

- it sets the proper log severity level for gRPC. (from cockroachdb#20308)

- it reports the command-line and env config to logs. (from cockroachdb#21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from cockroachdb#42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from cockroachdb#44043)

- it sets `GOMAXPROCS` from current cgroup limits. (from cockroachdb#57390)

- it stops the server early if the storage disk is full. (from cockroachdb#66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from cockroachdb#73876)

To review this commit, it is useful to open the files
`start.go` and `mt_start_sql.go` side-by-side, which
clarifies how much closer they have become to each other.
Looking at the `mt_start_sql.go` changes without
that context likely makes the review more difficult.

A later commit will actually merge the two code paths together so
there is just one thing to maintain.

Release note: None
craig bot pushed a commit that referenced this pull request Oct 31, 2022
90176: cli/start: unify code between `cockroach start` and `cockroach mt start-sql` r=andreimatei a=knz

Fixes #89974.
Fixes #90831.
Fixes #90524.

This PR merges the server initialization code between `cockroach start` and `cockroach mt start-sql`.

In doing so, it brings `cockroach mt start-sql` closer to what we expect from proper CockroachDB server processes:

- it fixes support for the (test-only)
  `COCKROACH_EXPERIMENTAL_LINEARIZABLE` env var. (from #4754)

- it adds a tracing span for the startup code. (from #8712!!)

- it properly supports `--listening-url-file`. (from #15468)

- it properly does sanitization of `--external-io-dir`. (from #19725)

- it sets the proper log severity level for gRPC. (from #20308)

- it reports the command-line and env config to logs. (from #21344)

- it prevents startup if there is a `_CRITICAL_ALERT.txt` file
  in the store directory. (from #42401)

- sets the umask for newly created file to remove "other" permission
  bits. This was a security team request originally. (from #44043)

- it recovers support for `DISABLE_STARTING_BACKGROUND_JOBS`. (from #44786)

- it sets `GOMAXPROCS` from current cgroup limits. (from #57390)

- it stops the server early if the storage disk is full. (from #66893)

- it fixes support for the `COCKROACH_DISABLE_SPAN_CONFIGS` config env
  var. (from #73876)

See the individual commit for details.



90660: sql: add contention_events to cluster_execution_insights r=j82w a=j82w

The original contention column will remain
to make query operations faster. The events
are being put into a json column because it's
possible there could be multiple blocking events
for a single statement. The json column avoids the complexity of adding another table and keeping it
in sync with the insights table.

The table can be joined with index_columns and tables to get the database name, table name, and index name that contention occurred on. This does not contain the blocking statement information, and the blocking fingerprint id.

closes: #88561

Release note (sql change): Adds contention_events
to cluster_execution_insights. This is used
to see which transaction is blocking the specific
statement.

90719: opgen: added a bool field in struct opgen.transition r=Xiang-Gu a=Xiang-Gu

This PR adds a bool field in struct opgen.transition that indicates whether it results from a `equiv(xx)` transition spec in the opgen file. It will be useful for a test where we need to find the inital status on a adding/dropping path. Without such a change, it can be problematic if we have a `equiv(xx)` spec as the first transition. E.g.
```
  ToAbsent(
    PUBLIC,
    equiv(VALIDATED),
    to(WRITE_ONLY),
    to(ABSENT),
  )
```
Without this change, the inital status will confusingly be `VALIDATED`, and the next status will be `PUBLIC`.
With this change, the initial status will be `PUBLIC`, and the next status will be `WRITE_ONLY`.

We also added some comments when we make transitions from the specs.

Epic: None
Release note: None

90865: sql: use bare name string of new pk  to compare with pk name when altering primary key r=chengxiong-ruan a=chengxiong-ruan

Fixes #90836

Release note (sql change): previously, the `DROP CONSTRAINT, ADD CONSTRAINT` in one trick to have a new primary key without moving old primary key to be a secondary index didn't work if the primary key name is a reserved SQL keyword. A `constraint already exists` error was returned. This patch fixed the bug, the trick now also works with primary key named as reserved keywords.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants