*: configure server keepalive #8535
Please rebase onto the watch keepalive test once #8545 is merged; this can't be merged without tests. Also, the keepalive test can drop the sleeps:
+++ b/clientv3/integration/watch_keepalive_test.go
@@ -66,14 +66,12 @@ func TestWatchKeepAlive(t *testing.T) {
 	// ep[0] keepalive time-out after DialKeepAliveTime + DialKeepAliveTimeout
 	// wait extra for processing network error for endpoint switching
 	timeout := ccfg.DialKeepAliveTime + ccfg.DialKeepAliveTimeout + ccfg.DialTimeout
-	time.Sleep(timeout)
-
 	if _, err = clus.Client(1).Put(context.TODO(), "foo", "bar"); err != nil {
 		t.Fatal(err)
 	}
 	select {
 	case <-wch:
-	case <-time.After(5 * time.Second):
+	case <-time.After(timeout):
 		t.Fatal("took too long to receive events")
 	}
@@ -82,14 +80,12 @@ func TestWatchKeepAlive(t *testing.T) {
 	defer clus.Members[1].Unblackhole()
 	// wait for ep[0] recover, ep[1] fail
-	time.Sleep(timeout)
-
 	if _, err = clus.Client(0).Put(context.TODO(), "foo", "bar"); err != nil {
 		t.Fatal(err)
 	}
 	select {
 	case <-wch:
-	case <-time.After(5 * time.Second):
+	case <-time.After(timeout):
 		t.Fatal("took too long to receive events")
 	}
 }
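To illustrate the suggestion above, here is a minimal standalone sketch of waiting on a channel with a bounded timeout instead of sleeping first. The `waitEvent` helper and the string channel are hypothetical stand-ins for the watch channel in the test; they are not part of etcd.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitEvent is a hypothetical helper: it blocks until a value arrives on ch
// or the timeout elapses. Unlike a fixed time.Sleep followed by a short wait,
// it returns as soon as the event shows up and only pays the full timeout
// when the event never arrives.
func waitEvent(ch <-chan string, timeout time.Duration) (string, error) {
	select {
	case ev := <-ch:
		return ev, nil
	case <-time.After(timeout):
		return "", errors.New("took too long to receive events")
	}
}

func main() {
	ch := make(chan string, 1)
	// Simulate an event that arrives shortly after the wait begins.
	go func() {
		time.Sleep(10 * time.Millisecond)
		ch <- "foo=bar"
	}()
	ev, err := waitEvent(ch, time.Second)
	fmt.Println(ev, err)
}
```

With this shape the passing case finishes quickly, and the keepalive time-out window only bounds the failing case.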
What is this blocked on?
embed/config.go
Outdated
SnapCount: etcdserver.DefaultSnapCount,
MaxTxnOps: DefaultMaxTxnOps,
MaxRequestBytes: DefaultMaxRequestBytes,
GRPCKeepAliveMinTime: 5 * time.Second,
define this as a const?
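A minimal sketch of what that could look like, assuming a named constant next to the other defaults. The name `DefaultGRPCKeepAliveMinTime` and the trimmed-down `Config` type are illustrative assumptions, not the names the PR ended up with.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultGRPCKeepAliveMinTime is a hypothetical name for the 5s default,
// so the value is documented in one place instead of inlined in NewConfig.
const DefaultGRPCKeepAliveMinTime = 5 * time.Second

// Config is a trimmed-down stand-in for the embed config struct.
type Config struct {
	GRPCKeepAliveMinTime time.Duration
}

// NewConfig returns a Config populated with the named default.
func NewConfig() *Config {
	return &Config{GRPCKeepAliveMinTime: DefaultGRPCKeepAliveMinTime}
}

func main() {
	fmt.Println(NewConfig().GRPCKeepAliveMinTime) // prints "5s"
}
```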
etcdmain/config.go
Outdated
@@ -143,6 +144,9 @@ func newConfig() *config {
fs.Int64Var(&cfg.QuotaBackendBytes, "quota-backend-bytes", cfg.QuotaBackendBytes, "Raise alarms when backend size exceeds the given quota. 0 means use the default quota.")
fs.UintVar(&cfg.MaxTxnOps, "max-txn-ops", cfg.MaxTxnOps, "Maximum number of operations permitted in a transaction.")
fs.UintVar(&cfg.MaxRequestBytes, "max-request-bytes", cfg.MaxRequestBytes, "Maximum client request size in bytes the server will accept.")
fs.DurationVar(&cfg.GRPCKeepAliveMinTime, "grpc-keepalive-min-time", cfg.Config.GRPCKeepAliveMinTime, "Minimum interval duration that a client should wait before pinging server.")
fs.DurationVar(&cfg.GRPCKeepAliveInterval, "grpc-keepalive-interval", time.Duration(0), "Frequency duration of server-to-client ping to check if a connection is alive.")
Explain what the default value does?
etcdmain/config.go
Outdated
@@ -143,6 +144,9 @@ func newConfig() *config {
fs.Int64Var(&cfg.QuotaBackendBytes, "quota-backend-bytes", cfg.QuotaBackendBytes, "Raise alarms when backend size exceeds the given quota. 0 means use the default quota.")
fs.UintVar(&cfg.MaxTxnOps, "max-txn-ops", cfg.MaxTxnOps, "Maximum number of operations permitted in a transaction.")
fs.UintVar(&cfg.MaxRequestBytes, "max-request-bytes", cfg.MaxRequestBytes, "Maximum client request size in bytes the server will accept.")
fs.DurationVar(&cfg.GRPCKeepAliveMinTime, "grpc-keepalive-min-time", cfg.Config.GRPCKeepAliveMinTime, "Minimum interval duration that a client should wait before pinging server.")
fs.DurationVar(&cfg.GRPCKeepAliveInterval, "grpc-keepalive-interval", time.Duration(0), "Frequency duration of server-to-client ping to check if a connection is alive.")
fs.DurationVar(&cfg.GRPCKeepAliveTimeout, "grpc-keepalive-timeout", time.Duration(0), "Additional duration of wait before closing a non-responsive connection.")
same here
I think we should find good default values so that the keepalive-related options are enabled by default.
Defined default interval 5s, timeout 3s.
clus.Members[0].Blackhole()

// expects endpoint switch to ep[1]
cli.SetEndpoints(clus.Members[0].GRPCAddr(), clus.Members[1].GRPCAddr())
Will SetEndpoints cause a shuffle directly, or will the client still pin to the original one?
It will trigger gRPC to close ep[0], and the client will pin ep[1].
clus := integration.NewClusterV3(t, &integration.ClusterConfig{
	Size: 3,
	GRPCKeepAliveMinTime: time.Millisecond, // avoid too_many_pings
If we are testing the client-side keepalive here, why do we need to enable server-side keepalive? We can probably also move this test into another PR.
Removed server parameters for this test, leaving server-to-client ping test as TODO.
@xiang90 Used default values in grpc-go for keepalive server interval and timeout. PTAL.
@@ -70,6 +70,12 @@ member flags:
	maximum number of operations permitted in a transaction.
--max-request-bytes '1572864'
	maximum client request size in bytes the server will accept.
--grpc-keepalive-min-time '5s'
	minimum duration interval that a client should wait before pinging server.
--grpc-keepalive-interval '2h'
2h seems too high. Probably 5 minutes or something? Is there a recommended value from gRPC-go?
gRPC Go defaults to 2h. Should etcd use 5min?
2h is fine if it is gRPC's default.
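For readers following the thread, a rough standalone sketch of how the three flags could be registered with these defaults using only the standard `flag` package. The `keepaliveConfig` type and `parseKeepalive` function are hypothetical; the 5s and 2h values come from the help text above, while the 20s timeout is an assumption based on grpc-go's documented server keepalive default and is not shown in this diff.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// keepaliveConfig is a hypothetical holder for the three server-side settings.
type keepaliveConfig struct {
	MinTime  time.Duration // minimum interval a client should wait between pings
	Interval time.Duration // server-to-client ping frequency
	Timeout  time.Duration // extra wait before closing a non-responsive connection
}

// parseKeepalive registers the flags on a fresh FlagSet and parses args.
func parseKeepalive(args []string) (keepaliveConfig, error) {
	var cfg keepaliveConfig
	fs := flag.NewFlagSet("keepalive-sketch", flag.ContinueOnError)
	fs.DurationVar(&cfg.MinTime, "grpc-keepalive-min-time", 5*time.Second,
		"Minimum interval duration that a client should wait before pinging server.")
	fs.DurationVar(&cfg.Interval, "grpc-keepalive-interval", 2*time.Hour,
		"Frequency duration of server-to-client ping to check if a connection is alive.")
	// Assumed default: 20s mirrors grpc-go's server keepalive timeout.
	fs.DurationVar(&cfg.Timeout, "grpc-keepalive-timeout", 20*time.Second,
		"Additional duration of wait before closing a non-responsive connection.")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	def, _ := parseKeepalive(nil)
	fmt.Println(def.MinTime, def.Interval, def.Timeout)

	// An operator who wants more frequent pings can still override the default.
	custom, _ := parseKeepalive([]string{"--grpc-keepalive-interval=5m"})
	fmt.Println(custom.Interval)
}
```

Keeping the gRPC defaults means the out-of-the-box behavior matches a plain grpc-go server, and the 5-minute value floated above remains available as an explicit override.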
lgtm
From #8477.
For #8022.