hot scheduler causes pd panic when scale out #3868

Closed
lhy1024 opened this issue Jul 13, 2021 · 2 comments · Fixed by #3870
Labels: component/scheduler (Scheduler logic.), type/bug (The issue is confirmed as a bug.)

Comments

@lhy1024
Contributor

lhy1024 commented Jul 13, 2021

Bug Report

What did you do?

Scaled out the cluster by adding a TiKV node.

What did you expect to see?

PD keeps running.

What did you see instead?

PD panics.

TiKV failed to connect to PD between put store and the first store heartbeat.

TiKV log

[2021/07/13 14:47:15.365 +08:00] [INFO] [node.rs:174] ["put store to PD"] [store="id: 4311770 address: \"xxx.xxx.17.106:20172\" labels { key: \"host\" value: \"smtikv106\" } labels { key: \"dc\" value: \"sm\" } version: \"4.0.8\" status_address: \"0.0.0.0:20182\" git_hash: \"83091173e960e5a0f5f417e921a0801d2f6635ae\" start_timestamp: 1626158835 deploy_path: \"/data/tidb_cluster/tikv2/deploy/bin\""]
[2021/07/13 14:47:15.369 +08:00] [INFO] [mod.rs:335] ["starting working thread"] [worker=cdc]
[2021/07/13 14:47:15.369 +08:00] [INFO] [future.rs:136] ["starting working thread"] [worker=waiter-manager]
[2021/07/13 14:47:15.369 +08:00] [INFO] [future.rs:136] ["starting working thread"] [worker=deadlock-detector]
[2021/07/13 14:47:15.370 +08:00] [INFO] [mod.rs:335] ["starting working thread"] [worker=backup-endpoint]
[2021/07/13 14:47:15.370 +08:00] [INFO] [mod.rs:335] ["starting working thread"] [worker=snap-handler]
[2021/07/13 14:47:15.370 +08:00] [INFO] [server.rs:223] ["listening on addr"] [addr=0.0.0.0:20172]
[2021/07/13 14:47:15.386 +08:00] [INFO] [server.rs:248] ["TiKV is ready to serve"]
[2021/07/13 14:47:15.387 +08:00] [WARN] [mod.rs:489] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/07/13 14:47:15.387 +08:00] [WARN] [mod.rs:489] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/07/13 14:47:15.388 +08:00] [WARN] [mod.rs:489] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/07/13 14:47:15.388 +08:00] [WARN] [mod.rs:489] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/07/13 14:47:15.388 +08:00] [WARN] [mod.rs:489] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/07/13 14:47:15.388 +08:00] [WARN] [mod.rs:499] ["failed to register addr to pd after 5 tries"]
[2021/07/13 14:47:25.816 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"Connection reset by peer\") }))"]
[2021/07/13 14:47:25.816 +08:00] [INFO] [<unknown>] ["Connect failed: {\"created\":\"@1626158845.816582340\",\"description\":\"Failed to connect to remote host: Connection refused\",\"errno\":111,\"file\":\"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc\",\"file_line\":200,\"os_error\":\"Connection refused\",\"syscall\":\"connect\",\"target_address\":\"ipv4:xxx.xxx.17.5:2379\"}"]
[2021/07/13 14:47:25.816 +08:00] [INFO] [<unknown>] ["Subchannel 0x7f4aafc83000: Retry in 1000 milliseconds"]
[2021/07/13 14:47:25.816 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:25.816 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:25.816 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:25.817 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a330 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:25.825 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:26.816 +08:00] [INFO] [<unknown>] ["Failed to connect to channel, retrying"]
[2021/07/13 14:47:26.817 +08:00] [INFO] [<unknown>] ["Connect failed: {\"created\":\"@1626158846.816852523\",\"description\":\"Failed to connect to remote host: Connection refused\",\"errno\":111,\"file\":\"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc\",\"file_line\":200,\"os_error\":\"Connection refused\",\"syscall\":\"connect\",\"target_address\":\"ipv4:xxx.xxx.17.5:2379\"}"]
[2021/07/13 14:47:26.817 +08:00] [INFO] [<unknown>] ["Subchannel 0x7f4aafc83000: Retry in 1396 milliseconds"]
[2021/07/13 14:47:27.817 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:27.817 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:27.818 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a450 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:27.820 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:28.213 +08:00] [INFO] [<unknown>] ["Failed to connect to channel, retrying"]
[2021/07/13 14:47:28.213 +08:00] [INFO] [<unknown>] ["Connect failed: {\"created\":\"@1626158848.213509371\",\"description\":\"Failed to connect to remote host: Connection refused\",\"errno\":111,\"file\":\"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc\",\"file_line\":200,\"os_error\":\"Connection refused\",\"syscall\":\"connect\",\"target_address\":\"ipv4:xxx.xxx.17.5:2379\"}"]
[2021/07/13 14:47:28.213 +08:00] [INFO] [<unknown>] ["Subchannel 0x7f4aafc83000: Retry in 2706 milliseconds"]
[2021/07/13 14:47:29.214 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:29.214 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:29.215 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a510 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:29.216 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:30.919 +08:00] [INFO] [<unknown>] ["Failed to connect to channel, retrying"]
[2021/07/13 14:47:30.919 +08:00] [INFO] [<unknown>] ["Connect failed: {\"created\":\"@1626158850.919828824\",\"description\":\"Failed to connect to remote host: Connection refused\",\"errno\":111,\"file\":\"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc\",\"file_line\":200,\"os_error\":\"Connection refused\",\"syscall\":\"connect\",\"target_address\":\"ipv4:xxx.xxx.17.5:2379\"}"]
[2021/07/13 14:47:30.919 +08:00] [INFO] [<unknown>] ["Subchannel 0x7f4aafc83000: Retry in 3328 milliseconds"]
[2021/07/13 14:47:31.920 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:31.920 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:31.921 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a5a0 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:31.923 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:34.247 +08:00] [INFO] [<unknown>] ["Failed to connect to channel, retrying"]
[2021/07/13 14:47:34.924 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:34.924 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:34.925 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a660 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:34.926 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:34.926 +08:00] [INFO] [<unknown>] ["Connect failed: {\"created\":\"@1626158854.926677211\",\"description\":\"Failed to connect to remote host: Connection refused\",\"errno\":111,\"file\":\"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc\",\"file_line\":200,\"os_error\":\"Connection refused\",\"syscall\":\"connect\",\"target_address\":\"ipv4:xxx.xxx.17.5:2379\"}"]
[2021/07/13 14:47:34.926 +08:00] [INFO] [<unknown>] ["Subchannel 0x7f4aafc83000: Retry in 6927 milliseconds"]
[2021/07/13 14:47:35.365 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:35.365 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:35.366 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:35.366 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:35.366 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a780 for subchannel 0x7f4ab42911c0"]
[2021/07/13 14:47:35.368 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:35.927 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:35.927 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:35.928 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a840 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:35.929 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.17.5:2379]
[2021/07/13 14:47:38.370 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:38.370 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:38.371 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a900 for subchannel 0x7f4ab42911c0"]
[2021/07/13 14:47:38.931 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:38.931 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:38.932 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3a9c0 for subchannel 0x7f4aafc83380"]
[2021/07/13 14:47:39.372 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(\"failed to connect to all addresses\") }))"]
[2021/07/13 14:47:39.372 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.18.5:2379]
[2021/07/13 14:47:39.373 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3aa80 for subchannel 0x7f4ab42911c0"]
[2021/07/13 14:47:39.374 +08:00] [INFO] [util.rs:419] ["connecting to PD endpoint"] [endpoints=http://xxx.xxx.19.5:2379]
[2021/07/13 14:47:39.375 +08:00] [INFO] [<unknown>] ["New connected subchannel at 0x7f4ab1c3ab40 for subchannel 0x7f4ab42911c0"]
[2021/07/13 14:47:39.376 +08:00] [INFO] [util.rs:484] ["connected to PD leader"] [endpoints=http://xxx.xxx.19.5:2379]
[2021/07/13 14:47:39.377 +08:00] [INFO] [util.rs:190] ["heartbeat sender and receiver are stale, refreshing ..."]
[2021/07/13 14:47:39.377 +08:00] [WARN] [util.rs:209] ["updating PD client done"] [spend=4.282148ms]
[2021/07/13 14:47:39.378 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.379 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.380 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.381 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.381 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.382 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.383 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.384 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.385 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.386 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.387 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.388 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.389 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.390 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.390 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.391 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.392 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.393 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.394 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.395 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.396 +08:00] [ERROR] [util.rs:301] ["request failed, retry"] [err_code=KV:Unknown] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]
[2021/07/13 14:47:39.397 +08:00] [ERROR] [pd.rs:677] ["store heartbeat failed"] [err="Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(\"rpc error: code = Unavailable desc = not leader\") }))"]

PD log

[2021/07/13 14:47:15.369 +08:00] [INFO] [grpc_service.go:236] ["put store ok"] [store="id:4311770 address:\"xxx.xxx.17.106:20172\" labels:<key:\"host\" value:\"smtikv106\" > labels:<key:\"dc\" value:\"sm\" > version:\"4.0.8\" status_address:\"0.0.0.0:20182\" git_hash:\"83091173e960e5a0f5f417e921a0801d2f6635ae\" start_timestamp:1626158835 deploy_path:\"/data/tidb_cluster/tikv2/deploy/bin\" "]
[2021/07/13 14:47:15.369 +08:00] [INFO] [util.go:78] ["load cluster version"] [cluster-version=4.0.8]
[2021/07/13 14:47:15.389 +08:00] [INFO] [manager.go:94] ["address has already been registered"] [component=tikv] [address=0.0.0.0:20182]
[2021/07/13 14:47:15.389 +08:00] [INFO] [manager.go:94] ["address has already been registered"] [component=tikv] [address=0.0.0.0:20182]
[2021/07/13 14:47:15.390 +08:00] [INFO] [manager.go:94] ["address has already been registered"] [component=tikv] [address=0.0.0.0:20182]
[2021/07/13 14:47:15.390 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=3466790] [step="promote learner peer 4311748 on store 4311459 to voter"] [source=heartbeat]
[2021/07/13 14:47:15.390 +08:00] [INFO] [manager.go:94] ["address has already been registered"] [component=tikv] [address=0.0.0.0:20182]
[2021/07/13 14:47:15.390 +08:00] [INFO] [manager.go:94] ["address has already been registered"] [component=tikv] [address=0.0.0.0:20182]
...
[2021/07/13 14:47:25.380 +08:00] [FATAL] [log.go:292] [panic] [recover="\"invalid memory address or nil pointer dereference\""] [stack="github.com/pingcap/log.Fatal\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/pkg/mod/github.com/pingcap/[email protected]/global.go:59\ngithub.com/tikv/pd/pkg/logutil.LogPanic\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/pkg/logutil/log.go:292\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:679\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:199\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:394\ngithub.com/tikv/pd/server/schedulers.(*balanceSolver).pickDstStores\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:809\ngithub.com/tikv/pd/server/schedulers.(*balanceSolver).filterDstStores\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:801\ngithub.com/tikv/pd/server/schedulers.(*balanceSolver).solve\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:592\ngithub.com/tikv/pd/server/schedulers.(*hotScheduler).balanceHotWriteRegions\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:469\ngithub.com/tikv/pd/server/schedulers.(*hotScheduler).dispatch\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:189\ngithub.com/tikv/pd/server/schedulers.(*hotScheduler).Schedule\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:176\ngithub.com/tikv/pd/server/cluster.(*scheduleController).Schedule\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/cluster/coordinator.go:697\ngithub.com/tikv/pd/server/cluster.(*coordinator).runScheduler\n\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/cluster/coordinator.go:648"]

Stack

github.com/pingcap/log.Fatal
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/pkg/mod/github.com/pingcap/[email protected]/global.go:59
github.com/tikv/pd/pkg/logutil.LogPanic
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/pkg/logutil/log.go:292
runtime.gopanic
	/usr/local/go/src/runtime/panic.go:679
runtime.panicmem
	/usr/local/go/src/runtime/panic.go:199
runtime.sigpanic
	/usr/local/go/src/runtime/signal_unix.go:394
github.com/tikv/pd/server/schedulers.(*balanceSolver).pickDstStores
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:809
github.com/tikv/pd/server/schedulers.(*balanceSolver).filterDstStores
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:801
github.com/tikv/pd/server/schedulers.(*balanceSolver).solve
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:592
github.com/tikv/pd/server/schedulers.(*hotScheduler).balanceHotWriteRegions
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:469
github.com/tikv/pd/server/schedulers.(*hotScheduler).dispatch
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:189
github.com/tikv/pd/server/schedulers.(*hotScheduler).Schedule
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/schedulers/hot_region.go:176
github.com/tikv/pd/server/cluster.(*scheduleController).Schedule
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/cluster/coordinator.go:697
github.com/tikv/pd/server/cluster.(*coordinator).runScheduler
	/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/server/cluster/coordinator.go:648
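
The stack shows a nil pointer dereference inside `pickDstStores`, and the logs place the panic between put store and the first store heartbeat. One plausible reading, which is an assumption and not confirmed in this report, is that the scheduler looked up load statistics for a store that had registered but not yet reported any. The Go sketch below illustrates that failure mode and a defensive guard. The names `storeLoad` and `pickDstCandidates` are hypothetical and are not PD's real types or functions.

```go
package main

import "fmt"

// storeLoad is a hypothetical stand-in for the per-store load statistics
// a hot-region scheduler might consult. It is not PD's real type.
type storeLoad struct {
	ByteRate float64
}

// pickDstCandidates returns the IDs of stores whose byte rate is at or below
// maxByteRate. Stores with no load entry yet (for example, registered through
// put store but without a first heartbeat) are skipped instead of dereferenced.
func pickDstCandidates(storeIDs []uint64, loads map[uint64]*storeLoad, maxByteRate float64) []uint64 {
	var picked []uint64
	for _, id := range storeIDs {
		detail, ok := loads[id]
		if !ok || detail == nil {
			// Without this guard, reading detail.ByteRate below would panic
			// with "invalid memory address or nil pointer dereference".
			continue
		}
		if detail.ByteRate <= maxByteRate {
			picked = append(picked, id)
		}
	}
	return picked
}

func main() {
	loads := map[uint64]*storeLoad{
		1: {ByteRate: 100},
		2: {ByteRate: 900},
		// Store 4311770 is registered but has no statistics yet.
	}
	fmt.Println(pickDstCandidates([]uint64{1, 2, 4311770}, loads, 500))
}
```

In Go, a lookup of a missing key in a map of pointers returns nil rather than failing, so the dereference is where the panic surfaces. Checking the lookup result before use keeps a newly added store from taking down the scheduler loop.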

What version of PD are you using (pd-server -V)?

4.0.8

@lhy1024 lhy1024 added type/bug The issue is confirmed as a bug. component/scheduler Scheduler logic. labels Jul 13, 2021
@nolouch
Contributor

nolouch commented Jul 13, 2021

/assign @lhy1024

@lhy1024
Contributor Author

lhy1024 commented Jul 14, 2021

This has already been fixed by #3483, and 4.0.12 includes the fix.
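
For anyone checking whether a 4.0.x deployment is affected, here is a minimal Go sketch that compares a PD version against 4.0.12. The helpers `parseVersion` and `atLeast` are hypothetical and handle only plain `major.minor.patch` strings. The check is only meaningful within the 4.0.x line, since this thread does not say which releases of other lines carry the fix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a "major.minor.patch" string into integers.
// Hypothetical helper: suffixes such as "-rc1" are rejected, not handled.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected major.minor.patch, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether version v is greater than or equal to floor,
// comparing each component numerically from major to patch.
func atLeast(v, floor string) (bool, error) {
	a, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(floor)
	if err != nil {
		return false, err
	}
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i], nil
		}
	}
	return true, nil
}

func main() {
	for _, v := range []string{"4.0.8", "4.0.12"} {
		ok, err := atLeast(v, "4.0.12")
		fmt.Println(v, ok, err)
	}
}
```

The components are compared as numbers because a plain string comparison would order "4.0.8" after "4.0.12".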
