TiDB panics after repeatedly injecting network loss chaos #33265
Comments
/type bug
It seems the panic is caused by a nil longest-sleep config being returned and then used. I think it's caused by that
Let me take a look.
@mayjiang0203
Agree. Who will be responsible for solving this issue? Are you working on it? @cfzjywxk
@youjiali1995
Yes.
If an expired backoffer is cloned or forked, the cloned or forked one will panic the next time it backs off.
	// TestBackoffErrorTypeWithForkAndClone, see https://github.com/pingcap/tidb/issues/33265
	{
		b := NewBackofferWithVars(context.TODO(), 200, nil)
		// 700 ms sleep in total and the backoffer will return an error next time.
		for i := 0; i < 3; i++ {
			err = b.Backoff(BoMaxDataNotReady, errors.New("data not ready"))
			assert.Nil(t, err)
		}
		bForked, cancel := b.Fork()
		defer cancel()
		bCloned := b.Clone()
		// Both copies of the expired backoffer should report the
		// "data not ready" error instead of panicking.
		for _, b := range []*Backoffer{bForked, bCloned} {
			err = b.Backoff(BoTiKVRPC, errors.New("tikv rpc"))
			assert.ErrorIs(t, err, BoMaxDataNotReady.err)
		}
	}
BTW, my vim is broken. I have to fix it ASAP. Could you work on it? @sticnarf
@youjiali1995
Looking at the use cases of
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
Run test case oltp_rel_008_004 in test plan endless-oltp-tpcc-large-raft-engine-rel@main.
Run the workload, then repeatedly inject network loss chaos on the GC leader, as follows:
[2022/03/18 00:43:17.831 +08:00] [INFO] [chaos.go:358] ["fault will last for"] [duration=2m36s]
[2022/03/18 00:43:17.836 +08:00] [INFO] [chaos.go:86] ["Run chaos"] [name="gc leader network loss"] [selectors="[endless-oltp-tps-661868-1-536/tc-tidb-1]"] [experiment="{"Duration":"","Scheduler":null,"Loss":"75","Correlation":"50"}"]
[2022/03/18 00:45:53.861 +08:00] [INFO] [chaos.go:151] ["Clean chaos"] [name="gc leader network loss"] [chaosId="ns=endless-oltp-tps-661868-1-536,kind=network-loss,name=network-loss-jnmydpgf,spec=&k8s.ChaosIdentifier{Namespace:"endless-oltp-tps-661868-1-536", Name:"network-loss-jnmydpgf", Spec:NetworkLossSpec{Duration: "", Scheduler: , Loss: "75", Correlation: "50"}}"]
[2022/03/18 00:46:08.872 +08:00] [INFO] [chaos.go:358] ["fault will last for"] [duration=3m31s]
[2022/03/18 00:46:08.877 +08:00] [INFO] [chaos.go:86] ["Run chaos"] [name="gc leader network loss"] [selectors="[endless-oltp-tps-661868-1-536/tc-tidb-1]"] [experiment="{"Duration":"","Scheduler":null,"Loss":"75","Correlation":"50"}"]
[2022/03/18 00:49:39.891 +08:00] [INFO] [chaos.go:151] ["Clean chaos"] [name="gc leader network loss"] [chaosId="ns=endless-oltp-tps-661868-1-536,kind=network-loss,name=network-loss-lfgkopjb,spec=&k8s.ChaosIdentifier{Namespace:"endless-oltp-tps-661868-1-536", Name:"network-loss-lfgkopjb", Spec:NetworkLossSpec{Duration: "", Scheduler: , Loss: "75", Correlation: "50"}}"]
[2022/03/18 00:50:53.903 +08:00] [INFO] [chaos.go:358] ["fault will last for"] [duration=2m29s]
[2022/03/18 00:50:53.908 +08:00] [INFO] [chaos.go:86] ["Run chaos"] [name="gc leader network loss"] [selectors="[endless-oltp-tps-661868-1-536/tc-tidb-1]"] [experiment="{"Duration":"","Scheduler":null,"Loss":"75","Correlation":"50"}"]
[2022/03/18 00:53:22.921 +08:00] [INFO] [chaos.go:151] ["Clean chaos"] [name="gc leader network loss"] [chaosId="ns=endless-oltp-tps-661868-1-536,kind=network-loss,name=network-loss-wqmyzxbr,spec=&k8s.ChaosIdentifier{Namespace:"endless-oltp-tps-661868-1-536", Name:"network-loss-wqmyzxbr", Spec:NetworkLossSpec{Duration: "", Scheduler: , Loss: "75", Correlation: "50"}}"]
2. What did you expect to see? (Required)
No panic or OOM occurs.
3. What did you see instead (Required)
TiDB panics.
4. What is your TiDB version? (Required)
[2022/03/17 23:28:30.616 +08:00] [INFO] [client.go:376] ["Cluster version information"] [type=tikv] [version=6.0.0-alpha] [git_hash=819ac9d64f22eb346764329a30cdeac3570e6cec]
[2022/03/17 23:28:30.616 +08:00] [INFO] [client.go:376] ["Cluster version information"] [type=pd] [version=6.0.0-nightly] [git_hash=e278c6c3d83087001843a596834fd2eb080ad281]
[2022/03/17 23:28:30.616 +08:00] [INFO] [client.go:376] ["Cluster version information"] [type=tidb] [version=6.0.0-nightly] [git_hash=d5867b1dba5bea3433ebe6b9eb17ba63bb6e3e74]