Skip to content

Commit

Permalink
Timeout Fixes and VTOrc Improvement (vitessio#11881)
Browse files Browse the repository at this point in the history
* refactor: move tests out of newfeaturestest so that they run on upgrade-downgrade tests too

Signed-off-by: Manan Gupta <[email protected]>

* feat: add failing ers test for handling multiple vttablet failures with default values of flags

Signed-off-by: Manan Gupta <[email protected]>

* feat: add a new lock-timeout flag and use that instead of remote-operation-timeout

Signed-off-by: Manan Gupta <[email protected]>

* feat: augment DownPrimary test to reproduce the issue of VTOrc not handling multiple failures

Signed-off-by: Manan Gupta <[email protected]>

* feat: remove LockShardTimeout configuration from VTOrc and add parallelism to refresh of tablets

Signed-off-by: Manan Gupta <[email protected]>

* log: add more logging lines around ers in vtorc

Signed-off-by: Manan Gupta <[email protected]>

* test: get the test to work

Signed-off-by: Manan Gupta <[email protected]>

* feat: fix usage of wait for replicas timeout

Signed-off-by: Manan Gupta <[email protected]>

* test: fix flags expected output

Signed-off-by: Manan Gupta <[email protected]>

* test: fix race in test now that the function is called in parallel multiple times

Signed-off-by: Manan Gupta <[email protected]>

* feat: fix default of onCloseTimeout to 1 second

Signed-off-by: Manan Gupta <[email protected]>

* test: add failing unit test to refreshTabletsInKeyspaceShard

Signed-off-by: Manan Gupta <[email protected]>

* feat: fix vtorc to not forget a tablet which has been deleted

Signed-off-by: Manan Gupta <[email protected]>

* feat: fix backward compatibility, add tests and release notes docs

Signed-off-by: Manan Gupta <[email protected]>

* test: fix flags output

Signed-off-by: Manan Gupta <[email protected]>

* test: use disable-replication-manager instead of disable-active-reparents to allow vttablets to setup replication when restarted

Signed-off-by: Manan Gupta <[email protected]>

* test: fix flaky test by not checking for an error

Signed-off-by: Manan Gupta <[email protected]>

* feat: handle the case of empty hostname in tablet initialization

Signed-off-by: Manan Gupta <[email protected]>

* feat: update onclose timeout to 10 seconds

Signed-off-by: Manan Gupta <[email protected]>

* test: fix unit test

Signed-off-by: Manan Gupta <[email protected]>

* feat: address review comments

Signed-off-by: Manan Gupta <[email protected]>

* docs: add comments explaining the test functions

Signed-off-by: Manan Gupta <[email protected]>

* feat: add summary docs for 'lock-shard-timeout' deprecation

Signed-off-by: Manan Gupta <[email protected]>

Signed-off-by: Manan Gupta <[email protected]>
  • Loading branch information
GuptaManan100 authored and timvaillancourt committed Mar 15, 2024
1 parent cc61316 commit 7f738a7
Show file tree
Hide file tree
Showing 20 changed files with 374 additions and 210 deletions.
3 changes: 2 additions & 1 deletion go/flags/endtoend/vtbackup.txt
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,7 @@ Usage of vtbackup:
--keep-alive-timeout duration Wait until timeout elapses after a successful backup before shutting down.
--keep_logs duration keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime duration keep logs for this long (using mtime) (zero to keep forever)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_err_stacks log stack traces for errors
Expand Down Expand Up @@ -122,7 +123,7 @@ Usage of vtbackup:
--port int port for the server
--pprof strings enable profiling
--purge_logs_interval duration how often try to remove old logs (default 1h0m0s)
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--restart_before_backup Perform a mysqld clean/full restart after applying binlogs, but before taking the backup. Only makes sense to work around xtrabackup bugs.
--s3_backup_aws_endpoint string endpoint of the S3 backend (region must be provided).
--s3_backup_aws_region string AWS region to use. (default "us-east-1")
Expand Down
5 changes: 3 additions & 2 deletions go/flags/endtoend/vtctld.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Usage of vtctld:
--action_timeout duration time to wait for an action before resorting to force (default 2m0s)
--action_timeout duration time to wait for an action before resorting to force (default 1m0s)
--alsologtostderr log to standard error as well as files
--azblob_backup_account_key_file string Path to a file containing the Azure Storage account key; if this flag is unset, the environment variable VT_AZBLOB_ACCOUNT_KEY will be used as the key itself (NOT a file path).
--azblob_backup_account_name string Azure Storage Account name for backups; if this flag is unset, the environment variable VT_AZBLOB_ACCOUNT_NAME will be used.
Expand Down Expand Up @@ -61,6 +61,7 @@ Usage of vtctld:
--keep_logs duration keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime duration keep logs for this long (using mtime) (zero to keep forever)
--lameduck-period duration keep running at least this long after SIGTERM before stopping (default 50ms)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_err_stacks log stack traces for errors
Expand All @@ -74,7 +75,7 @@ Usage of vtctld:
--pprof strings enable profiling
--proxy_tablets Setting this true will make vtctld proxy the tablet status instead of redirecting to them
--purge_logs_interval duration how often try to remove old logs (default 1h0m0s)
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--s3_backup_aws_endpoint string endpoint of the S3 backend (region must be provided).
--s3_backup_aws_region string AWS region to use. (default "us-east-1")
--s3_backup_aws_retries int AWS request retries. (default -1)
Expand Down
3 changes: 2 additions & 1 deletion go/flags/endtoend/vtgate.txt
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ Usage of vtgate:
--keyspaces_to_watch strings Specifies which keyspaces this vtgate should have access to while routing queries or accessing the vschema.
--lameduck-period duration keep running at least this long after SIGTERM before stopping (default 50ms)
--legacy_replication_lag_algorithm Use the legacy algorithm when selecting vttablets for serving. (default true)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--lock_heartbeat_time duration If there is lock function used. This will keep the lock connection active by using this heartbeat (default 5s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
Expand Down Expand Up @@ -133,7 +134,7 @@ Usage of vtgate:
--querylog-format string format for query logs ("text" or "json") (default "text")
--querylog-row-threshold uint Number of rows a query has to return or affect before being logged; not useful for streaming queries. 0 means all queries will be logged.
--redact-debug-ui-queries redact full queries and bind variables from debug UI
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--retry-count int retry count (default 2)
--schema_change_signal Enable the schema tracker; requires queryserver-config-schema-change-signal to be enabled on the underlying vttablets for this to work (default true)
--schema_change_signal_user string User to be used to send down query to vttablet to retrieve schema changes
Expand Down
3 changes: 2 additions & 1 deletion go/flags/endtoend/vtgr.txt
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ Usage of vtgr:
-h, --help display usage and exit
--keep_logs duration keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime duration keep logs for this long (using mtime) (zero to keep forever)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_err_stacks log stack traces for errors
Expand All @@ -31,7 +32,7 @@ Usage of vtgr:
--pprof strings enable profiling
--purge_logs_interval duration how often try to remove old logs (default 1h0m0s)
--refresh_interval duration Refresh interval to load tablets. (default 10s)
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--scan_interval duration Scan interval to diagnose and repair. (default 3s)
--scan_repair_timeout duration Time to wait for a Diagnose and repair operation. (default 3s)
--security_policy string the name of a registered security policy to use for controlling access to URLs - empty means allow all for anyone (built-in policies: deny-all, read-only)
Expand Down
4 changes: 2 additions & 2 deletions go/flags/endtoend/vtorc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ Usage of vtorc:
--keep_logs duration keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime duration keep logs for this long (using mtime) (zero to keep forever)
--lameduck-period duration keep running at least this long after SIGTERM before stopping (default 50ms)
--lock-shard-timeout duration Duration for which a shard lock is held when running a recovery (default 30s)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_err_stacks log stack traces for errors
Expand All @@ -39,7 +39,7 @@ Usage of vtorc:
--reasonable-replication-lag duration Maximum replication lag on replicas which is deemed to be acceptable (default 10s)
--recovery-period-block-duration duration Duration for which a new recovery is blocked on an instance after running a recovery (default 30s)
--recovery-poll-duration duration Timer duration on which VTOrc polls its database to run a recovery (default 1s)
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--security_policy string the name of a registered security policy to use for controlling access to URLs - empty means allow all for anyone (built-in policies: deny-all, read-only)
--shutdown_wait_time duration Maximum time to wait for VTOrc to release all the locks that it is holding before shutting down on SIGTERM (default 30s)
--snapshot-topology-interval duration Timer duration on which VTOrc takes a snapshot of the current MySQL information it has in the database. Should be in multiple of hours
Expand Down
3 changes: 2 additions & 1 deletion go/flags/endtoend/vttablet.txt
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,7 @@ Usage of vttablet:
--keep_logs duration keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime duration keep logs for this long (using mtime) (zero to keep forever)
--lameduck-period duration keep running at least this long after SIGTERM before stopping (default 50ms)
--lock-timeout duration Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--lock_tables_timeout duration How long to keep the table locked before timing out (default 1m0s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
Expand Down Expand Up @@ -240,7 +241,7 @@ Usage of vttablet:
--redact-debug-ui-queries redact full queries and bind variables from debug UI
--relay_log_max_items int Maximum number of rows for VReplication target buffering. (default 5000)
--relay_log_max_size int Maximum buffer size (in bytes) for VReplication target buffering. If single rows are larger than this, a single row is buffered at a time. (default 250000)
--remote_operation_timeout duration time to wait for a remote operation (default 30s)
--remote_operation_timeout duration time to wait for a remote operation (default 15s)
--replication_connect_retry duration how long to wait in between replica reconnect attempts. Only precise to the second. (default 10s)
--restore_concurrency int (init restore parameter) how many concurrent files to restore at once (default 4)
--restore_from_backup (init restore parameter) will check BackupStorage for a recent backup at startup and start there
Expand Down
11 changes: 11 additions & 0 deletions go/internal/flag/flag.go
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,17 @@ func Parse(fs *flag.FlagSet) {
flag.Parse()
}

// IsFlagProvided returns if the given flag has been provided by the user explicitly or not
func IsFlagProvided(name string) bool {
found := false
flag.Visit(func(f *flag.Flag) {
if f.Name == name {
found = true
}
})
return found
}

// TrickGlog tricks glog into understanding that flags have been parsed.
//
// N.B. Do not delete this function. `glog` is a persnickity package and wants
Expand Down
Loading

0 comments on commit 7f738a7

Please sign in to comment.