Node sync gets stuck towards the end of the sync process #2122
Logs are filled with these (block number increases over time, same number for all peers, but always off by one):
Looks like an off-by-one somewhere. Full log:
@nazar-pc I didn't have much luck reproducing the issue (in close to 10 attempts). Also, the devnet chain being almost 90k blocks now makes the retries time consuming. I also see these logs in my node frequently, but the node is able to sync fine. From reading the code, these logs are expected in some situations. Since you are able to see the issue frequently on your setup, can you help me with these:
FWIW my node has not unstuck itself yet, so I can archive the whole thing if needed.
I think I understand the sequence now. While the node is syncing, a reorg needs to happen among the peers for the issue to occur. As a reorg is more likely near the tip, this would explain why nodes get stuck as they near the end of the sync process. The node is at
After this, we download blocks. I am in the process of adding a unit test to verify the scenario and looking at potential fixes (and will also start a conversation with the upstream folks once there is more concrete data).
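As an illustration only, the scenario can be modelled with a small self-contained test; the `Block` struct, `build_chain` helper and hash scheme below are made up for the sketch and are not the actual sc-network-sync test harness. The point is that after a reorg among the peers, the block at the supposedly common height is no longer the block the node actually has.

```rust
// Toy model of the stall scenario: not the real sc-network-sync test
// harness; all types and helpers here are made up for illustration.
#[derive(Clone, Debug)]
struct Block {
    number: u64,
    hash: u64, // stand-in for a real block hash
    parent_hash: u64,
}

// Build `len` blocks on top of `parent`, salting hashes with `salt` so two
// branches built from the same parent get distinct hashes.
fn build_chain(parent: &Block, len: u64, salt: u64) -> Vec<Block> {
    let mut chain = Vec::new();
    let mut prev = parent.clone();
    for i in 0..len {
        let block = Block {
            number: prev.number + 1,
            hash: (prev.number + 1) * 1_000 + salt * 100 + i,
            parent_hash: prev.hash,
        };
        chain.push(block.clone());
        prev = block;
    }
    chain
}

#[test]
fn reorg_breaks_height_only_common_number() {
    let genesis = Block { number: 0, hash: 0, parent_hash: 0 };
    // The syncing node and its peers agree up to block 10.
    let shared = build_chain(&genesis, 10, 0);
    let fork_point = shared.last().unwrap().clone();

    // The node keeps the old branch it downloaded while the peers reorg to
    // a new branch built on the same fork point.
    let old_branch = build_chain(&fork_point, 5, 1);
    let new_branch = build_chain(&fork_point, 5, 2);

    // A height-only "common number" claims block 13 is common to both sides...
    let ours = old_branch.iter().find(|b| b.number == 13).unwrap();
    let theirs = new_branch.iter().find(|b| b.number == 13).unwrap();

    // ...but the hashes differ, so block requests anchored at that "common"
    // block can never connect to the peers' chain and the sync stalls.
    assert_ne!(ours.hash, theirs.hash);
}
```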
Thanks for the details!
With the sync fix, we don't try to import. Instead we see this message:
Without the fix, we would have tried to import, which would have failed with parent not found, and restarted the sync process (the original symptom fixed by paritytech/polkadot-sdk#1812). Now we stall instead, though for a different reason. If the common number is somehow restored, the sync should complete because of 1812. I currently don't have a concrete proposal for a fix and need to think more. One option is to trigger an ancestry search, but the sync state machine is pretty obtuse and I am worried about breaking something else. The worst-case workaround would be to just kill/restart the node binary if it gets stuck :-
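For reference, the ancestry search mentioned above amounts to a binary search for the highest height at which our chain and a peer's chain share the same hash. Below is a minimal sketch of the idea with in-memory lookups; the `common_ancestor` function and its closure parameters are assumptions for illustration, while the real logic lives in the Substrate sync state machine and works over network requests.

```rust
// Toy ancestry search: binary-search for the highest block number at which
// our chain and a peer's chain share the same hash. This only illustrates
// the general idea; the real implementation queries the peer over the
// network rather than calling closures.
fn common_ancestor(
    our_hash_at: impl Fn(u64) -> Option<u64>,
    peer_hash_at: impl Fn(u64) -> Option<u64>,
    our_best: u64,
    peer_best: u64,
) -> Option<u64> {
    let (mut lo, mut hi) = (0u64, our_best.min(peer_best));
    let mut found = None;
    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        match (our_hash_at(mid), peer_hash_at(mid)) {
            // The chains agree at `mid`; the fork point is at or above it.
            (Some(a), Some(b)) if a == b => {
                found = Some(mid);
                lo = mid + 1;
            }
            // The chains already diverged at `mid` (or the block is
            // unknown); keep searching below.
            _ => {
                if mid == 0 {
                    break;
                }
                hi = mid - 1;
            }
        }
    }
    found
}
```

The binary search is valid because if two chains agree at some height, they agree at every lower height.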
The last fix didn't cover all the cases. I am reverting it and going to think about it from a high level, instead of applying band-aids. The main issue can be seen here:
The reorg happens and the peers are split into two groups (
Are you able to create a test case with this information? Also, can you create an upstream discussion about this now that you know more about it?
Yeah, I can create the test (there are some existing tests in blocks.rs; we would need a repro test from the sync layer instead). And let me start something on the Polkadot forum.
A simplified explanation of the issue:
So the node remains stuck. The downloaded blocks are not imported, and no new requests go out after this.
So it looks like common numbers should be advanced individually and only if the common block is actually common, not just at the same block height.
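A minimal sketch of that idea, using hypothetical stand-ins rather than the actual sc-network-sync types (`PeerSync`, `Sync` and the `peer_has_block` closure are assumptions for illustration): the per-peer common number only advances when the peer is known to have the exact block by hash.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the sync-side bookkeeping; only meant to show
// gating the common-number update on the hash actually matching.
type Hash = [u8; 32];

struct PeerSync {
    best_number: u64,
    common_number: u64,
}

struct Sync {
    peers: HashMap<u64, PeerSync>, // keyed by a peer id
}

impl Sync {
    /// Called when a block on our canonical chain is imported. Each peer's
    /// common number only advances if that peer is known to have this exact
    /// block (same hash), not merely some block at this height.
    fn on_block_imported(
        &mut self,
        number: u64,
        hash: Hash,
        peer_has_block: impl Fn(u64, &Hash) -> bool,
    ) {
        for (peer_id, peer) in self.peers.iter_mut() {
            if peer.common_number < number && peer_has_block(*peer_id, &hash) {
                peer.common_number = number;
            }
        }
    }
}
```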
Yes, it goes back to the analysis above: #2122 (comment)
This works fine if there is no re-org (the nodes are on the same chain), so block-number-based logic alone is good enough. In the presence of re-orgs, this breaks and we run into issues. I am also trying to add tests upstream to confirm this. TL;DR: the sync layer should manage a tree of forks, where blocks are linked to their parent blocks by hash. This will understandably be a bigger change.
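A minimal sketch of what such a tree of forks could look like, with toy types (`ForkTree`, `TreeBlock` and `branch_to_import` are made up for illustration, not a proposal for the actual upstream API): downloaded blocks are keyed by hash and linked to their parents by hash, so a branch only becomes importable once it connects to a block we already know.

```rust
use std::collections::HashMap;

// Toy fork tree: downloaded blocks are keyed by hash and linked to their
// parents by hash, so branches from a reorg coexist instead of being
// collapsed into a single per-height slot.
type Hash = [u8; 32];

struct TreeBlock {
    number: u64,
    parent_hash: Hash,
    body: Vec<u8>, // placeholder for the downloaded block body
}

#[derive(Default)]
struct ForkTree {
    blocks: HashMap<Hash, TreeBlock>,
}

impl ForkTree {
    fn insert(&mut self, hash: Hash, block: TreeBlock) {
        self.blocks.insert(hash, block);
    }

    /// Walk parent links from `tip` down to (but not including) a block we
    /// already have, returning the branch in import order (oldest first).
    /// Only blocks that actually connect by hash are ever returned.
    fn branch_to_import(
        &self,
        tip: Hash,
        is_known: impl Fn(&Hash) -> bool,
    ) -> Vec<Hash> {
        let mut branch = Vec::new();
        let mut cursor = tip;
        while let Some(block) = self.blocks.get(&cursor) {
            branch.push(cursor);
            if is_known(&block.parent_hash) {
                branch.reverse();
                return branch;
            }
            cursor = block.parent_hash;
        }
        // The branch does not (yet) connect to anything we know; nothing
        // can be imported until the gap is downloaded.
        Vec::new()
    }
}
```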
Yesterday I tried a simpler approach of moving the common number back on re-org, but that ran into other issues.
…#1950) This reverts #1812 until we know why it causes syncing issues reported in autonomys/subspace#2122.
I was looking at something similar over the last two days; I am going to continue with this today and see how it goes (and hopefully we can bring back the rolled-back 1812).
paritytech/polkadot-sdk#493 was re-resolved with paritytech/polkadot-sdk#2045, which landed in #2424, so I'm closing this for now.
I have hit this a few times:
In both cases the node exits the "Preparing" state back into "Syncing" (though I wouldn't expect that to happen) and then gets stuck at whatever height was the target at the moment of exiting the "Preparing" state. It syncs successfully on restart and doesn't always get stuck in the first place.
I'm not yet sure what the reason is; I will try to collect trace logs for Substrate sync.