Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jepsen monotonic workload detected an anomaly #34875

Closed
zyguan opened this issue May 23, 2022 · 2 comments
Closed

Jepsen monotonic workload detected an anomaly #34875

zyguan opened this issue May 23, 2022 · 2 comments
Assignees
Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.1 severity/critical sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

@zyguan
Copy link
Contributor

zyguan commented May 23, 2022

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

run jepsen monotonic with the following options

{:auto-retry-limit :default,
 :ssh-private-key "jepsen.pem",
 :isolation :repeatable-read,
 :concurrency 15,
 :predicate-read false,
 :tarball-url
 "http://minio.pingcap.net:9000/tp-team/tests/jepsen/tidb-master-linux-amd64.tar.gz",
 :txn-mode "mixed",
 :force-reinstall true,
 :update-in-place false,
 :ssh
 {:username "root",
  :password "root",
  :strict-host-key-checking false,
  :private-key-path "jepsen.pem"},
 :nemesis-long-recovery false,
 :follower-read false,
 :nemesis
 {:interval 10,
  :schedules true,
  :partition-pd-leader true,
  :partition-half true,
  :partition-ring true},
 :nodes
 ["node-0.node-peer.jepsen-tps-810169-1-10"
  "node-1.node-peer.jepsen-tps-810169-1-10"
  "node-2.node-peer.jepsen-tps-810169-1-10"
  "node-3.node-peer.jepsen-tps-810169-1-10"
  "node-4.node-peer.jepsen-tps-810169-1-10"],
 :test-count 1,
 :read-lock "FOR UPDATE",
 :use-index true,
 :os #object[tidb.core.Image 0x19fec3d6 "tidb.core.Image@19fec3d6"],
 :time-limit 300,
 :recovery-time 10,
 :version "master",
 :auto-retry :default,
 :workload :monotonic,
 :init-txn-sql nil,
 :init-sql
 ("set @@tidb_enable_mutation_checker=1, @@tidb_txn_assertion_level=strict")}

2. What did you expect to see? (Required)

Test passed normally.

3. What did you see instead (Required)

Test failed.

 {:cycle
  {:valid? false,
   :scc-count 1,
   :cycles
   ["Let:\n  T1 = {:type :ok, :f :inc, :value {4 1023}, :process 33, :time 80444781019, :txn-info {:txn_scope \"global\", :start_ts 433284503116447815, :commit_ts 433284503129554961, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 33661}\n  T2 = {:type :ok, :f :read, :value {0 1062, 1 1071, 2 1029, 3 1031, 4 1024, 5 1053, 6 1055, 7 1072}, :process 29, :time 80725761979, :txn-info {:start_ts 433284503181983752}, :index 33751}\n  T3 = {:type :ok, :f :read, :value {0 1069, 1 1073, 2 1037, 3 1036, 4 1031, 5 1057, 6 1058, 7 1073}, :process 43, :time 81198587219, :txn-info {:start_ts 433284503313055755}, :index 33877}\n  T4 = {:type :ok, :f :read, :value {0 1071, 1 1074, 2 1039, 3 1039, 4 1033, 5 1059, 6 1061, 7 1078}, :process 39, :time 81530907442, :txn-info {:start_ts 433284503391698963}, :index 33965}\n  T5 = {:type :ok, :f :read, :value {0 1077, 1 1077, 2 1044, 3 1046, 4 1034, 5 1065, 6 1064, 7 1083}, :process 16, :time 82418461059, :txn-info {:start_ts 433284503601414154}, :index 34181}\n  T6 = {:type :ok, :f :read, :value {0 1081, 1 1082, 2 1046, 3 1051, 4 1040, 5 1073, 6 1065, 7 1085}, :process 19, :time 82904071752, :txn-info {:start_ts 433284503758962694}, :index 34337}\n  T7 = {:type :ok, :f :inc, :value {7 1089}, :process 33, :time 83044825018, :txn-info {:txn_scope \"global\", :start_ts 433284503784915029, :commit_ts 433284503784915037, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 34375}\n  T8 = {:type :ok, :f :read, :value {0 1087, 1 1087, 2 1052, 3 1057, 4 1044, 5 1077, 6 1073, 7 1090}, :process 21, :time 83470712700, :txn-info {:start_ts 433284503902879775}, :index 34499}\n  T9 = {:type :ok, :f :read, :value {0 1094, 1 1097, 2 1063, 3 1069, 4 1054, 5 1088, 6 1074, 7 1102}, :process 43, :time 84308310161, :txn-info {:start_ts 433284504125702226}, :index 34781}\n  T10 = {:type :ok, :f :read, :value {0 1094, 1 1098, 2 1064, 3 1071, 4 1054, 5 1089, 6 1075, 7 1103}, :process 26, :time 84351858520, :txn-info {:start_ts 433284504138809442}, :index 34811}\n  T11 = {:type :ok, :f :read, :value {0 1096, 1 1101, 2 1066, 3 1072, 4 1059, 5 1094, 6 1077, 7 1106}, :process 26, :time 84622443292, :txn-info {:start_ts 433284504204345433}, :index 34915}\n  T12 = {:type :ok, :f :read, :value {0 1100, 1 1105, 2 1072, 3 1078, 4 1063, 5 1097, 6 1079, 7 1107}, :process 43, :time 84955700460, :txn-info {:start_ts 433284504296357967}, :index 35043}\n  T13 = {:type :ok, :f :read, :value {0 1107, 1 1111, 2 1076, 3 1079, 4 1069, 5 1104, 6 1083, 7 1114}, :process 23, :time 85386720347, :txn-info {:start_ts 433284504414060597}, :index 35213}\n  T14 = {:type :ok, :f :read, :value {0 1113, 1 1116, 2 1077, 3 1083, 4 1072, 5 1109, 6 1088, 7 1118}, :process 29, :time 85645760999, :txn-info {:start_ts 433284504479858707}, :index 35323}\n  T15 = {:type :ok, :f :read, :value {0 1118, 1 1120, 2 1082, 3 1084, 4 1076, 5 1113, 6 1092, 7 1123}, :process 43, :time 85974074525, :txn-info {:start_ts 433284504571346980}, :index 35455}\n  T16 = {:type :ok, :f :read, :value {0 1124, 1 1123, 2 1084, 3 1088, 4 1079, 5 1114, 6 1094, 7 1127}, :process 33, :time 86146360380, :txn-info {:start_ts 433284504610930799}, :index 35547}\n  T17 = {:type :ok, :f :read, :value {0 1129, 1 1126, 2 1087, 3 1093, 4 1080, 5 1117, 6 1101, 7 1134}, :process 33, :time 86354841835, :txn-info {:start_ts 433284504664408208}, :index 35659}\n  T18 = {:type :ok, :f :read, :value {0 1136, 1 1133, 2 1093, 3 1094, 4 1089, 5 1122, 6 1109, 7 1139}, :process 43, :time 86733600156, :txn-info {:start_ts 433284504754847854}, :index 35851}\n  T19 = {:type :ok, :f :read, :value {0 1141, 1 1138, 2 1097, 3 1096, 4 1090, 5 1125, 6 1110, 7 1141}, :process 26, :time 86886212286, :txn-info {:start_ts 433284504807276607}, :index 35943}\n  T20 = {:type :ok, :f :read, :value {0 1143, 1 1142, 2 1102, 3 1099, 4 1093, 5 1126, 6 1112, 7 1143}, :process 26, :time 87046642663, :txn-info {:start_ts 433284504846598239}, :index 36033}\n  T21 = {:type :ok, :f :read, :value {0 1153, 1 1147, 2 1109, 3 1107, 4 1102, 5 1130, 6 1119, 7 1144}, :process 29, :time 87512493020, :txn-info {:start_ts 433284504951455857}, :index 36266}\n  T22 = {:type :ok, :f :inc, :value {6 1123}, :process 26, :time 87523233238, :txn-info {:txn_scope \"global\", :start_ts 433284504977932383, :commit_ts 433284504977932398, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 36279}\n  T23 = {:type :ok, :f :read, :value {0 1159, 1 1151, 2 1114, 3 1116, 4 1110, 5 1136, 6 1124, 7 1152}, :process 33, :time 87820294039, :txn-info {:start_ts 433284505043206178}, :index 36439}\n  T24 = {:type :ok, :f :read, :value {0 1161, 1 1155, 2 1116, 3 1119, 4 1111, 5 1139, 6 1126, 7 1156}, :process 26, :time 87959148938, :txn-info {:start_ts 433284505082527879}, :index 36523}\n  T25 = {:type :ok, :f :read, :value {0 1163, 1 1157, 2 1117, 3 1121, 4 1114, 5 1141, 6 1128, 7 1158}, :process 39, :time 88046719620, :txn-info {:start_ts 433284505108742239}, :index 36577}\n  T26 = {:type :ok, :f :inc, :value {4 1115}, :process 26, :time 88103410666, :txn-info {:txn_scope \"global\", :start_ts 433284505135218710, :commit_ts 433284505135218720, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 36603}\n  T27 = {:type :ok, :f :read, :value {0 1169, 1 1169, 2 1123, 3 1126, 4 1116, 5 1146, 6 1140, 7 1168}, :process 26, :time 92344811746, :txn-info {:start_ts 433284506078937104}, :index 36813}\n  T28 = {:type :ok, :f :read, :value {0 1170, 1 1171, 2 1125, 3 1127, 4 1117, 5 1147, 6 1141, 7 1172}, :process 39, :time 103805816628, :txn-info {:start_ts 433284508149612551}, :index 36888}\n  T29 = {:type :ok, :f :read, :value {0 1172, 1 1175, 2 1127, 3 1132, 4 1120, 5 955, 6 1145, 7 1173}, :process 140, :time 141968369160, :txn-info {:start_ts 433284517404344338}, :index 37075}\n  T30 = {:type :ok, :f :read, :value {0 967, 1 987, 2 951, 3 931, 4 939, 5 956, 6 948, 7 972}, :process 24, :time 59899994458, :txn-info {:start_ts 433284497716543541}, :index 30667}\n  T31 = {:type :ok, :f :inc, :value {3 951}, :process 26, :time 67735927301, :txn-info {:txn_scope \"global\", :start_ts 433284499787219008, :commit_ts 433284499787219036, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 31499}\n  T32 = {:type :ok, :f :read, :value {0 998, 1 1012, 2 978, 3 952, 4 962, 5 987, 6 986, 7 1017}, :process 43, :time 68199062884, :txn-info {:start_ts 433284499905183774}, :index 31651}\n  T33 = {:type :ok, :f :read, :value {0 1007, 1 1020, 2 980, 3 953, 4 963, 5 993, 6 994, 7 1021}, :process 21, :time 68929604215, :txn-info {:start_ts 433284500101791758}, :index 31819}\n  T34 = {:type :ok, :f :inc, :value {3 1014}, :process 33, :time 79313497594, :txn-info {:txn_scope \"global\", :start_ts 433284502814982197, :commit_ts 433284502828089372, :txn_commit_mode \"1pc\", :async_commit_fallback false, :one_pc_fallback false}, :index 33205}\n  T35 = {:type :ok, :f :read, :value {0 1046, 1 1058, 2 1016, 3 1015, 4 1011, 5 1043, 6 1038, 7 1051}, :process 43, :time 79507649071, :txn-info {:start_ts 433284502867411020}, :index 33285}\n  T36 = {:type :ok, :f :read, :value {0 1048, 1 1059, 2 1019, 3 1019, 4 1012, 5 1045, 6 1039, 7 1057}, :process 33, :time 79710282740, :txn-info {:start_ts 433284502919839786}, :index 33365}\n  T37 = {:type :ok, :f :read, :value {0 1053, 1 1062, 2 1023, 3 1020, 4 1017, 5 1049, 6 1045, 7 1063}, :process 26, :time 79985655731, :txn-info {:start_ts 433284502998482981}, :index 33485}\n  T38 = {:type :ok, :f :read, :value {0 1058, 1 1067, 2 1027, 3 1026, 4 1022, 5 1050, 6 1054, 7 1068}, :process 23, :time 80409522192, :txn-info {:start_ts 433284503103340582}, :index 33641}\n\nThen:\n  - T1 < T2, because T1 observed 4 = 1023, and T2 observed a higher value 1024.\n  - T2 < T3, because T2 observed 0 = 1062, and T3 observed a higher value 1069.\n  - T3 < T4, because T3 observed 0 = 1069, and T4 observed a higher value 1071.\n  - T4 < T5, because T4 observed 0 = 1071, and T5 observed a higher value 1077.\n  - T5 < T6, because T5 observed 0 = 1077, and T6 observed a higher value 1081.\n  - T6 < T7, because T6 observed 7 = 1085, and T7 observed a higher value 1089.\n  - T7 < T8, because T7 observed 7 = 1089, and T8 observed a higher value 1090.\n  - T8 < T9, because T8 observed 0 = 1087, and T9 observed a higher value 1094.\n  - T9 < T10, because T9 observed 1 = 1097, and T10 observed a higher value 1098.\n  - T10 < T11, because T10 observed 0 = 1094, and T11 observed a higher value 1096.\n  - T11 < T12, because T11 observed 0 = 1096, and T12 observed a higher value 1100.\n  - T12 < T13, because T12 observed 0 = 1100, and T13 observed a higher value 1107.\n  - T13 < T14, because T13 observed 0 = 1107, and T14 observed a higher value 1113.\n  - T14 < T15, because T14 observed 0 = 1113, and T15 observed a higher value 1118.\n  - T15 < T16, because T15 observed 0 = 1118, and T16 observed a higher value 1124.\n  - T16 < T17, because T16 observed 0 = 1124, and T17 observed a higher value 1129.\n  - T17 < T18, because T17 observed 0 = 1129, and T18 observed a higher value 1136.\n  - T18 < T19, because T18 observed 0 = 1136, and T19 observed a higher value 1141.\n  - T19 < T20, because T19 observed 0 = 1141, and T20 observed a higher value 1143.\n  - T20 < T21, because T20 observed 0 = 1143, and T21 observed a higher value 1153.\n  - T21 < T22, because T21 observed 6 = 1119, and T22 observed a higher value 1123.\n  - T22 < T23, because T22 observed 6 = 1123, and T23 observed a higher value 1124.\n  - T23 < T24, because T23 observed 0 = 1159, and T24 observed a higher value 1161.\n  - T24 < T25, because T24 observed 0 = 1161, and T25 observed a higher value 1163.\n  - T25 < T26, because T25 observed 4 = 1114, and T26 observed a higher value 1115.\n  - T26 < T27, because T26 observed 4 = 1115, and T27 observed a higher value 1116.\n  - T27 < T28, because T27 observed 0 = 1169, and T28 observed a higher value 1170.\n  - T28 < T29, because T28 observed 0 = 1170, and T29 observed a higher value 1172.\n  - T29 < T30, because T29 observed 5 = 955, and T30 observed a higher value 956.\n  - T30 < T31, because T30 observed 3 = 931, and T31 observed a higher value 951.\n  - T31 < T32, because T31 observed 3 = 951, and T32 observed a higher value 952.\n  - T32 < T33, because T32 observed 0 = 998, and T33 observed a higher value 1007.\n  - T33 < T34, because T33 observed 3 = 953, and T34 observed a higher value 1014.\n  - T34 < T35, because T34 observed 3 = 1014, and T35 observed a higher value 1015.\n  - T35 < T36, because T35 observed 0 = 1046, and T36 observed a higher value 1048.\n  - T36 < T37, because T36 observed 0 = 1048, and T37 observed a higher value 1053.\n  - T37 < T38, because T37 observed 0 = 1053, and T38 observed a higher value 1058.\n  - However, T38 < T1, because T38 observed 4 = 1022, and T1 observed a higher value 1023: a contradiction!"]},
  :timeline {:valid? true},
  :valid? false},
 :valid? false}


Analysis invalid! (ノಥ益ಥ)ノ ┻━┻

The full log can be accessed here.

4. What is your TiDB version? (Required)

master @ 683ba09

@zyguan zyguan added type/bug The issue is confirmed as a bug. sig/transaction SIG:Transaction severity/critical labels May 23, 2022
@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.0 may-affects-6.1 labels May 23, 2022
@cfzjywxk cfzjywxk removed the may-affects-4.0 This bug maybe affects 4.0.x versions. label May 23, 2022
@cfzjywxk cfzjywxk assigned cfzjywxk and ekexium and unassigned cfzjywxk May 24, 2022
@ekexium ekexium added affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 labels May 31, 2022
@ti-chi-bot ti-chi-bot removed may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. labels May 31, 2022
@cfzjywxk
Copy link
Contributor

#35076 could fix the issue.

@ekexium
Copy link
Contributor

ekexium commented Jun 16, 2022

#35076 updated client-go that fixes this.

@ekexium ekexium closed this as completed Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.1 severity/critical sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants