Fix ds_test hang - remove unsafe condition_variable #5119

Merged
merged 4 commits into from
Mar 17, 2023


eddyashton
Member

See #5117.

With the condition variable, it's possible for this sequence of events to happen:

Thread 20: Retrieves its thread ID, pushes this onto `assigned_ids`
Thread 0: Sees `assigned_ids` is of expected size, calls `all_done.notify_all()`
Thread 20: Calls `all_done.wait(lock);`

This results in a hang: Thread 20 waits forever for a notification that has already been sent, and Thread 0 blocks trying to `.join()` Thread 20. I think we could fix this with another atomic bool acting as a latch, to prevent waiting on an already-notified cv. But the simpler fix is to remove the cv entirely, since it's not actually needed. The only assertions we make are that all threads arrived and added a unique thread ID, so they can terminate immediately after they do so.
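
For illustration only, here is a minimal sketch of the two options described above. It is not the code from `ds_test`. All function names (`collect_ids_with_latch`, `collect_ids_no_cv`, `all_unique`, `check_all_arrived`) are hypothetical, the `next_id` counter is an assumed stand-in for the real thread-ID lookup, and the latch is written as a plain `bool` guarded by the mutex rather than an atomic. In the latch variant the last worker to arrive sends the notification, which also differs from the sequence above where Thread 0 does.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Option 1 (hypothetical): keep the cv, but guard the wait with a latch flag.
std::vector<std::size_t> collect_ids_with_latch(std::size_t n_threads)
{
  std::mutex m;
  std::condition_variable all_done;
  bool done = false; // the latch; only read or written while holding m
  std::atomic<std::size_t> next_id{0}; // stand-in for the real thread-ID lookup
  std::vector<std::size_t> assigned_ids;
  std::vector<std::thread> workers;

  for (std::size_t i = 0; i < n_threads; ++i)
  {
    workers.emplace_back([&]() {
      std::unique_lock<std::mutex> lock(m);
      assigned_ids.push_back(next_id.fetch_add(1));
      if (assigned_ids.size() == n_threads)
      {
        done = true;
        all_done.notify_all();
      }
      // The predicate is checked before blocking, so a thread that gets here
      // after notify_all() sees done == true and returns immediately.
      all_done.wait(lock, [&done]() { return done; });
    });
  }
  for (auto& t : workers)
  {
    t.join();
  }
  return assigned_ids;
}

// Option 2 (hypothetical): no cv at all. Each worker records its ID and exits,
// so there is no notification that could be missed.
std::vector<std::size_t> collect_ids_no_cv(std::size_t n_threads)
{
  std::mutex m;
  std::atomic<std::size_t> next_id{0}; // stand-in for the real thread-ID lookup
  std::vector<std::size_t> assigned_ids;
  std::vector<std::thread> workers;

  for (std::size_t i = 0; i < n_threads; ++i)
  {
    workers.emplace_back([&]() {
      std::lock_guard<std::mutex> guard(m);
      assigned_ids.push_back(next_id.fetch_add(1));
    });
  }
  for (auto& t : workers)
  {
    t.join();
  }
  return assigned_ids;
}

// True if no ID appears twice.
bool all_unique(const std::vector<std::size_t>& ids)
{
  const std::set<std::size_t> distinct(ids.begin(), ids.end());
  return distinct.size() == ids.size();
}

// The two properties the description says are asserted: every thread arrived,
// and every thread added a unique ID.
void check_all_arrived(std::size_t n_threads)
{
  const auto ids = collect_ids_no_cv(n_threads);
  assert(ids.size() == n_threads);
  assert(all_unique(ids));
}
```

Calling `check_all_arrived(20)` exercises the cv-free variant. Because both checks run only after every `join()` has returned, the workers have no reason to stay alive once they have pushed their ID.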

@eddyashton eddyashton requested a review from a team as a code owner March 17, 2023 09:41
@ccf-bot
Collaborator

ccf-bot commented Mar 17, 2023

all_done_ds_test_hang@67149 aka 20230317.19 vs main ewma over 20 builds from 66707 to 67143


main

| build_id | build_number | Commit latency factor | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | tpcc_virtual_cft^ | ls_sgx_cft^ | ls_sgx_cft_mem | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | ls_virtual_cft^ | pi_ls_virtual_cft^ | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | ls_jwt_virtual_cft^ | pi_ls_jwt_virtual_cft^ | ls_js_virtual_cft^ | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | ls_full_js_virtual_cft^ | ls_js_jwt_virtual_cft^ | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | hist_sgx_cft^ | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 66707 | 20230310.35 | 0.806017 | 6272.22 | 8.21863e+07 | 17175.5 | 15798.5 | 1.50774e+07 | 16137.4 | 7.99955e+06 | 45843.1 | 47660.3 | 5525.84 | 1.45532e+07 | 5617.3 | 4.06739e+06 | 12749.9 | 13717.4 | 4493.62 | 1525.6 | 7.73741e+06 | 3778.62 | 3350.03 | 1335.38 | 7.99955e+06 | 43834.6 | 1250.26 | 6.95098e+06 | 836886 | 1.1786e+06 | 8.15303e+06 | 3.07734e+07 |
| 66731 | 20230310.41 | 0.798276 | 6274.17 | 8.29727e+07 | 17134.9 | 15449 | 1.50774e+07 | 16020.1 | 7.99955e+06 | 43944 | 44697.1 | 5504.32 | 1.45532e+07 | 5647 | 4.06739e+06 | 12378.4 | 12823.3 | 4487.92 | 1524.3 | 7.99955e+06 | 3934.72 | 3295.8 | 1337.43 | 7.21312e+06 | 46779.3 | 1253.02 | 6.95098e+06 | 834776 | 1.18105e+06 | 8.17219e+06 | 3.07e+07 |
| 66739 | 20230310.43 | 0.799713 | 5615.76 | 8.32349e+07 | 17055.3 | 15594 | 1.50774e+07 | 15965.7 | 7.99955e+06 | 45753.6 | 49098.3 | 5536.25 | 1.45532e+07 | 5601 | 4.06739e+06 | 12643.9 | 13775.4 | 4463.28 | 1505.78 | 7.73741e+06 | 3686.92 | 3386.87 | 1328.65 | 7.21312e+06 | 42718 | 1236.46 | 6.95098e+06 | 842833 | 1.1729e+06 | 8.15339e+06 | 3.04621e+07 |
| 66749 | 20230313.1 | 0.778726 | 6341.44 | 8.19242e+07 | 17222.1 | 15872.2 | 1.50774e+07 | 16169.4 | 7.99955e+06 | 45674.4 | 48590.2 | 5520.35 | 1.4291e+07 | 5704.6 | 4.06739e+06 | 12803.4 | 13077.2 | 4511.11 | 1530.6 | 7.73741e+06 | 3683.25 | 3455.13 | 1339.26 | 7.73741e+06 | 45670.9 | 1263.6 | 6.95098e+06 | 832964 | 1.17767e+06 | 8.15576e+06 | 3.08983e+07 |
| 66769 | 20230313.8 | 0.795253 | 6339.92 | 8.24484e+07 | 17203.4 | 15593.8 | 1.50774e+07 | 16121.2 | 7.99955e+06 | 43721.3 | 48115.6 | 5498.2 | 1.45532e+07 | 5661.5 | 4.06739e+06 | 12800.2 | 12734 | 4409.94 | 1532.9 | 7.73741e+06 | 3804.34 | 3327.25 | 1342.91 | 7.21312e+06 | 45304.5 | 1248.01 | 6.95098e+06 | 841179 | 1.17984e+06 | 8.15361e+06 | 3.06789e+07 |
| 66849 | 20230314.3 | 0.835903 | 6243.17 | 8.21863e+07 | 17107.4 | 15731.2 | 1.50774e+07 | 16111.5 | 7.99955e+06 | 43771.8 | 48885.9 | 5536.76 | 1.48153e+07 | 5670.9 | 4.06739e+06 | 12337.6 | 13922.9 | 4419.04 | 1537.93 | 7.73741e+06 | 3810 | 3338.36 | 1344.7 | 7.21312e+06 | 44349 | 1267.53 | 6.95098e+06 | 837472 | 1.17386e+06 | 8.15602e+06 | 3.12257e+07 |
| 66852 | 20230314.4 | 0.824445 | 6338.73 | 8.24484e+07 | 17088 | 15840.3 | 1.50774e+07 | 16107.1 | 7.99955e+06 | 47842.2 | 49659.1 | 5514.32 | 1.45532e+07 | 5674.8 | 4.06739e+06 | 12462.5 | 13924.3 | 4427.59 | 1542.24 | 7.73741e+06 | 3867.9 | 3267.74 | 1346.44 | 7.21312e+06 | 44459.6 | 1267.1 | 6.95098e+06 | 825281 | 1.18038e+06 | 8.14696e+06 | 3.08992e+07 |
| 66866 | 20230314.9 | 0.804424 | 6311.26 | 8.19242e+07 | 17105.8 | 15791.5 | 1.58639e+07 | 16092 | 7.99955e+06 | 43573.8 | 48319.9 | 5519.3 | 1.50774e+07 | 5610.4 | 4.06739e+06 | 12784.2 | 12754.1 | 4417.23 | 1543.44 | 7.73741e+06 | 3733.08 | 3261.38 | 1339.21 | 7.47526e+06 | 46027.2 | 1246.36 | 6.95098e+06 | 829924 | 1.18074e+06 | 8.17245e+06 | 3.15087e+07 |
| 66873 | 20230314.10 | 0.799788 | 6303.2 | 8.19242e+07 | 17114.8 | 15795 | 1.50774e+07 | 15995.1 | 7.99955e+06 | 46001 | 47571.9 | 5519.93 | 1.45532e+07 | 5700.7 | 4.06739e+06 | 12916.7 | 12643.7 | 4430.12 | 1537.74 | 7.73741e+06 | 3707.84 | 3280.86 | 1341.85 | 7.21312e+06 | 45759.8 | 1250.67 | 6.95098e+06 | 838655 | 1.1793e+06 | 7.9888e+06 | 2.95258e+07 |
| 66883 | 20230315.2 | 0.796321 | 6237.89 | 8.24484e+07 | 17012.4 | 15703.2 | 1.50774e+07 | 16126.6 | 7.99955e+06 | 43785.5 | 49833.3 | 5506.81 | 1.45532e+07 | 5671 | 4.06739e+06 | 12702.4 | 13029.7 | 4503.32 | 1532.87 | 7.73741e+06 | 3843.47 | 3290.84 | 1347.74 | 7.21312e+06 | 43848 | 1254.37 | 6.95098e+06 | 834390 | 1.17721e+06 | 8.15595e+06 | 3.07512e+07 |
| 66914 | 20230315.12 | 0.806417 | 6255.6 | 8.19242e+07 | 17209.6 | 15795.8 | 1.50774e+07 | 16055.6 | 7.99955e+06 | 47850.7 | 50189 | 5489.67 | 1.48153e+07 | 5639.6 | 4.06739e+06 | 13083.7 | 13041.6 | 4495.87 | 1531.4 | 7.73741e+06 | 3883.99 | 3230.45 | 1344.47 | 7.73741e+06 | 46180.9 | 1248.47 | 6.95098e+06 | 828215 | 1.17616e+06 | 8.17069e+06 | 3.08843e+07 |
| 66983 | 20230316.1 | 0.822344 | 6323.79 | 8.19242e+07 | 17119.2 | 15857.5 | 1.50774e+07 | 16158.1 | 7.99955e+06 | 45788.8 | 48878.2 | 5548.06 | 1.48153e+07 | 5707.7 | 4.06739e+06 | 12459.4 | 12848.7 | 4439.85 | 1544.14 | 7.73741e+06 | 3563.55 | 3291.97 | 1340.81 | 7.21312e+06 | 44085.3 | 1264.68 | 6.95098e+06 | 835443 | 1.18179e+06 | 8.16606e+06 | 3.05325e+07 |
| 66998 | 20230316.6 | 0.787371 | 6264.01 | 8.21863e+07 | 17105.9 | 15737.9 | 1.50774e+07 | 16111.4 | 7.99955e+06 | 41846.2 | 48084.8 | 5487.32 | 1.45532e+07 | 5677 | 4.06739e+06 | 12656.2 | 12920.2 | 4221.01 | 1530.54 | 7.73741e+06 | 3732 | 3269.47 | 1330.66 | 7.47526e+06 | 46144.6 | 1254.67 | 6.95098e+06 | 830724 | 1.17711e+06 | 8.15056e+06 | 3.08206e+07 |
| 67036 | 20230316.16 | 0.767618 | 6315.05 | 8.24484e+07 | 17151.8 | 15893.2 | 1.50774e+07 | 16159.5 | 8.2617e+06 | 45749.3 | 49934.1 | 5580 | 1.45532e+07 | 5675.6 | 4.06739e+06 | 12762.4 | 12744.5 | 4486.66 | 1543.16 | 7.73741e+06 | 3752.13 | 3286.38 | 1346.35 | 7.47526e+06 | 43720.4 | 1268.42 | 6.95098e+06 | 838895 | 1.18067e+06 | 8.04251e+06 | 3.09295e+07 |
| 67069 | 20230316.24 | 0.770005 | 6272.92 | 8.21863e+07 | 17251.9 | 15509 | 1.50774e+07 | 16043.6 | 7.99955e+06 | 43995.9 | 48813.1 | 5537.28 | 1.45532e+07 | 5631.9 | 4.06739e+06 | 12866.1 | 12943.9 | 4457.33 | 1529.92 | 7.73741e+06 | 3786.37 | 3241.07 | 1336.83 | 7.73741e+06 | 45418.5 | 1250.45 | 6.95098e+06 | 833989 | 1.18156e+06 | 8.15439e+06 | 3.111e+07 |
| 67084 | 20230316.28 | 0.785603 | 5605.25 | 8.29727e+07 | 17225.7 | 15448.4 | 1.50774e+07 | 16009.4 | 7.99955e+06 | 43789.5 | 45836.5 | 5477.3 | 1.48153e+07 | 5601.7 | 4.06739e+06 | 12494 | 11572.2 | 4417.75 | 1530.99 | 7.73741e+06 | 3644.74 | 3197.74 | 1325.38 | 7.47526e+06 | 43038.2 | 1243.02 | 6.95098e+06 | 831393 | 1.17716e+06 | 8.14538e+06 | 3.21426e+07 |
| 67092 | 20230317.2 | 0.813734 | 6293.02 | 8.21863e+07 | 17239.5 | 15821.1 | 1.50774e+07 | 16135.7 | 7.99955e+06 | 43694.4 | 48440.5 | 5560.02 | 1.45532e+07 | 5675.9 | 4.06739e+06 | 12710.9 | 13038.8 | 4591.84 | 1542.08 | 7.73741e+06 | 3846.19 | 3320.55 | 1345.09 | 7.73741e+06 | 44931.9 | 1253.23 | 6.95098e+06 | 830282 | 1.18286e+06 | 8.14969e+06 | 3.10826e+07 |
| 67111 | 20230317.8 | 0.801943 | 6319.49 | 8.24484e+07 | 17252.3 | 15770.6 | 1.50774e+07 | 16201.8 | 7.99955e+06 | 45903.9 | 47883.7 | 5554.29 | 1.45532e+07 | 5675.1 | 4.06739e+06 | 12806.9 | 13745.1 | 4443.63 | 1539.07 | 7.73741e+06 | 3799.07 | 3331.48 | 1343.45 | 7.47526e+06 | 44609.3 | 1260.88 | 6.95098e+06 | 833712 | 1.18179e+06 | 8.17333e+06 | 3.13346e+07 |
| 67131 | 20230317.15 | 0.78754 | 6317.69 | 8.27106e+07 | 17150.2 | 15711.6 | 1.50774e+07 | 16123.1 | 7.99955e+06 | 43683.2 | 46699.5 | 5554.1 | 1.48153e+07 | 5639 | 4.06739e+06 | 12790.1 | 12994.8 | 4442.35 | 1530.82 | 7.73741e+06 | 3708.22 | 3381.1 | 1332.6 | 7.21312e+06 | 42825.4 | 1249.31 | 6.95098e+06 | 842833 | 1.181e+06 | 8.15657e+06 | 3.0951e+07 |
| 67143 | 20230317.18 | 0.779458 | 6261.9 | 8.27106e+07 | 17255.6 | 15498.2 | 1.50774e+07 | 16157.5 | 7.99955e+06 | 43593 | 46843 | 5492.93 | 1.48153e+07 | 5649.8 | 4.06739e+06 | 12304.3 | 13603.2 | 4470.92 | 1533.05 | 7.73741e+06 | 3790.37 | 3287.98 | 1325.83 | 7.21312e+06 | 45783.3 | 1233.61 | 6.95098e+06 | 804942 | 1.17932e+06 | 8.17421e+06 | 3.08871e+07 |

all_done_ds_test_hang

| build_id | build_number | Commit latency factor | tpcc_virtual_cft^ | ls_virtual_cft^ | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | pi_ls_virtual_cft^ | ls_jwt_virtual_cft^ | pi_ls_jwt_virtual_cft^ | ls_js_virtual_cft^ | ls_full_js_virtual_cft^ | ls_sgx_cft^ | ls_sgx_cft_mem | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | ls_js_jwt_virtual_cft^ | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | hist_sgx_cft^ | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 67117 | 20230317.9 | 0.799533 | 17213.4 | 43680.8 | 6291.72 | 8.21863e+07 | 48438.1 | 12771.3 | 13213.4 | 4450.62 | 3722.29 | 15771.3 | 1.50774e+07 | 16135 | 7.99955e+06 | 3238.59 | 5509.39 | 1.45532e+07 | 5639.7 | 4.06739e+06 | 47277.9 | 1535.4 | 7.99955e+06 | 1343.62 | 7.21312e+06 | 1255.33 | 6.95098e+06 | 827274 | 1.17751e+06 | 8.1554e+06 | 3.10958e+07 |
| 67138 | 20230317.16 | 0.801824 | 17167.2 | 45682.6 | 6297.33 | 8.21863e+07 | 46506.9 | 12424.9 | 13037.8 | 4494.66 | 3655.94 | 15795.3 | 1.50774e+07 | 16157.2 | 7.99955e+06 | 3279.2 | 5557.75 | 1.48153e+07 | 5639.4 | 4.06739e+06 | 40879.9 | 1509.91 | 7.73741e+06 | 1330.71 | 7.21312e+06 | 1246.89 | 6.95098e+06 | 835414 | 1.16935e+06 | 8.15293e+06 | 3.07139e+07 |
| 67149 | 20230317.19 | 0.780577 | 17211.7 | 43785.7 | 6307.16 | 8.24484e+07 | 47140.7 | 12845.8 | 13907.6 | 4440.15 | 3770.83 | 15746.9 | 1.50774e+07 | 16120.5 | 7.99955e+06 | 3290.07 | 5585.39 | 1.45532e+07 | 5606.9 | 4.06739e+06 | 44094.6 | 1537.16 | 7.73741e+06 | 1334.71 | 7.73741e+06 | 1250.8 | 6.95098e+06 | 846428 | 1.18094e+06 | 8.15979e+06 | 3.12543e+07 |


@achamayou achamayou enabled auto-merge (squash) March 17, 2023 15:10
@achamayou achamayou merged commit 82ea63a into microsoft:main Mar 17, 2023