
Minimize time between sortitions and block-acceptance #2989

Merged
100 commits merged into develop on Jan 27, 2022

Conversation

jcnelson
Member

@jcnelson jcnelson commented Jan 9, 2022

This PR fixes #2986, #2944, and #2969.

The overarching goal of this PR is to improve chain quality by minimizing the time between when a sortition happens and when the miner can start producing the next block. This is achieved in three principal ways:

  • Aggressive block replication. This PR makes it so that peers directly push blocks and confirmed microblock streams to their outbound neighbors if they detect that those neighbors are missing them from their inventories. The original behavior was to send a BlocksAvailable or MicroblocksAvailable message instead; now, those two messages are only sent to inbound neighbors (whose inventories are not tracked).

  • Minimize time spent doing things besides block downloads. This PR removes the recurring full inventory synchronization path from the inventory state machine; that path could cause a node to miss several sortitions' worth of blocks every 12 hours, which had a detrimental effect on miners' ability to produce blocks on the chain tip. Instead, a node now only does a full inventory synchronization with its neighbors on boot-up, and from then on it only queries the past two reward cycles' inventories.

  • Aggressively cancel stale RunTenure and RunMicroblockTenure events. This PR updates the mining code-paths to check the current chain tip against each issued RunTenure, and to drop the RunTenure if it refers to a now-stale chain tip. It also checks the canonical Stacks chain tip after block assembly but before sending the block-commit, and cancels the operation if the canonical chain tip has changed; a similar set of checks is now done for RunMicroblockTenure. The reason for these changes is that processing a RunTenure or RunMicroblockTenure directive in the relayer thread can take on the order of 30 seconds, during which time the chain tip it targeted can become stale. I believe this was the root cause of Multiple commits for btc block, missed target block #2969. With this change, the node's mock-miner reliably mines atop block-commits on the canonical chain. (A minimal sketch of the staleness check appears after this list.)

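To make the stale-directive cancellation in the last bullet concrete, here is a minimal sketch of the check, under assumed stand-in types (this is not the PR's actual relayer code; the real RunTenure directive and chain-tip types differ):

    // Hypothetical stand-ins for the node's real types; for illustration only.
    #[derive(Clone, PartialEq, Eq, Debug)]
    struct BlockId([u8; 32]);

    struct RunTenure {
        // The chain tip this directive was issued against.
        issued_tip: BlockId,
    }

    // Drop the directive if the canonical tip has moved since it was issued.
    fn filter_stale(directive: RunTenure, canonical_tip: &BlockId) -> Option<RunTenure> {
        if &directive.issued_tip == canonical_tip {
            Some(directive)
        } else {
            None // stale: the tip advanced while the directive sat in the relayer's queue
        }
    }

    fn main() {
        let old_tip = BlockId([1u8; 32]);
        let new_tip = BlockId([2u8; 32]);
        let directive = RunTenure { issued_tip: old_tip };
        // The same kind of check is repeated after block assembly, before the block-commit is sent.
        assert!(filter_stale(directive, &new_tip).is_none());
    }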

In addition to making these improvements, this PR fixes three bugs, one of which is a show-stopper for 2.05.1.0.

  • A denial-of-service bug in the mempool synchronization logic. This must be merged into 2.05.0.1.0. Specifically, the mempool synchronization logic did a full mempool scan on each query to /v2/mempool/query, which could take on the order of 500 ms and in turn stall the network. We didn't notice it before because the effect was small during testing; ALEX's launch revealed that it is a problem. The fix is to make mempool synchronization happen in pages, where each page is only permitted to do a fixed number of mempool database queries. (A rough sketch of the paging pattern appears after this list.)

  • A denial-of-service bug in the Bitcoin indexer whereby it could not recover from a deep reorg. I actually discovered this first with my appchain MVP, and have since noticed that it can happen in Bitcoin as well (indeed, the bitcoind_forking_test occasionally flaps because of it). Basically, there are cases where the reorg is deeper than the maximum number of blocks the indexer is allowed to synchronize in one go. When this happens, the node gets stuck, because it forgets to download the blocks between the height at reorg depth + maximum sync length and the chain tip height. This PR fixes that by having the indexer drop the headers above reorg depth + maximum sync length, so it is forced to re-fetch them and their blocks. (See the sketch after this list.)

  • A lot of I/O paths were needlessly slow due to accidental database scans. This PR adds a set of indexes to the various databases that significantly improve boot-up and steady-state performance; in my testing, they cut boot-up time in half. I've also addressed [burnchain-download] download burnchain blocks in parallel to processing sortitions #2944, so that the node does not need to wait for all of a reward cycle's sortitions to be processed before downloading Stacks blocks. I already merged the highest-impact index update, but these also help. In addition, I've instrumented the test code so that if the BLOCKSTACK_DB_TRACE environment variable is set, the node prints EXPLAIN QUERY PLAN statements in the log file that you can execute yourself against the test's databases. Using this new feature, I systematically went through each query that high-I/O tests were doing, found the ones doing (un-LIMIT-ed) table scans or creating temporary B-trees, and added indexes to prevent them. (A worked example of this workflow follows this list.)

    In order to achieve the above, I refactored neon_node.rs and runloop.rs in order to gather related state into the RunLoop struct, and to break up a lot of path-dependent spaghetti code into single-purpose methods. This should improve maintainability going forward.
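To illustrate the mempool-sync fix from the first bullet above, here is a rough sketch of the paging pattern, with hypothetical names standing in for the real mempool query code: each request does a bounded amount of database work and returns a cursor (the last randomized txid) that the peer sends back to resume.

    // Hypothetical page structure; the real protocol serializes this over /v2/mempool/query.
    struct MempoolPage {
        txids: Vec<[u8; 32]>,
        next_page_id: Option<[u8; 32]>, // cursor the peer sends back to continue
    }

    fn query_mempool_page(
        mempool: &[[u8; 32]],    // stand-in for the mempool database
        after: Option<[u8; 32]>, // cursor from the previous page, if any
        max_queries: usize,      // hard cap on database work per request
    ) -> MempoolPage {
        let start = match after {
            Some(cursor) => mempool
                .iter()
                .position(|txid| *txid == cursor)
                .map(|i| i + 1)
                .unwrap_or(0),
            None => 0,
        };
        let txids: Vec<[u8; 32]> = mempool[start..].iter().take(max_queries).cloned().collect();
        let next_page_id = if start + txids.len() < mempool.len() {
            txids.last().cloned()
        } else {
            None // no more pages; the stream can be corked
        };
        MempoolPage { txids, next_page_id }
    }

    fn main() {
        let mempool: Vec<[u8; 32]> = (0..10u8).map(|i| [i; 32]).collect();
        let page = query_mempool_page(&mempool, None, 4);
        assert_eq!(page.txids.len(), 4);
        assert!(page.next_page_id.is_some()); // the caller issues another request with this cursor
    }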

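For the deep-reorg fix in the second bullet, the essential change is to discard headers above reorg depth + the maximum per-pass sync length, so the indexer re-fetches them (and their blocks) on the next pass instead of skipping them. A minimal sketch under assumed names (not the actual indexer code):

    // Keep only headers at or below the height the indexer can safely re-sync from.
    // Dropping the rest guarantees the "gap" blocks get re-downloaded rather than skipped.
    fn drop_headers_past_resync_window(
        headers: &mut Vec<(u64, [u8; 32])>, // (height, header hash) stand-ins
        reorg_height: u64,
        max_blocks_per_sync: u64,
    ) {
        let cutoff = reorg_height.saturating_add(max_blocks_per_sync);
        headers.retain(|(height, _)| *height <= cutoff);
    }

    fn main() {
        let mut headers: Vec<(u64, [u8; 32])> = (0..100u64).map(|h| (h, [0u8; 32])).collect();
        // Reorg detected at height 10 with a 20-block sync window: heights 31..=99 are dropped.
        drop_headers_past_resync_window(&mut headers, 10, 20);
        assert_eq!(headers.last().unwrap().0, 30);
    }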

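The indexing workflow in the last bullet can be reproduced by hand: take a query that a test logs under BLOCKSTACK_DB_TRACE, run EXPLAIN QUERY PLAN against the test's database, look for SCAN TABLE or USE TEMP B-TREE in the output, and add a covering index. A sketch using rusqlite with an illustrative schema (not the node's actual tables):

    use rusqlite::Connection;

    fn main() -> rusqlite::Result<()> {
        let conn = Connection::open_in_memory()?;
        conn.execute_batch(
            "CREATE TABLE staging_blocks (block_hash TEXT, height INTEGER, processed INTEGER);",
        )?;

        // Before the index, the plan for this query reports a full table scan.
        let mut stmt = conn.prepare(
            "EXPLAIN QUERY PLAN \
             SELECT block_hash FROM staging_blocks WHERE processed = 0 ORDER BY height",
        )?;
        let mut rows = stmt.query([])?;
        while let Some(row) = rows.next()? {
            let detail: String = row.get(3)?; // the "detail" column of the query plan
            println!("{}", detail);
        }

        // Adding an index turns the scan into an index search and avoids a temp B-tree for the sort.
        conn.execute_batch(
            "CREATE INDEX IF NOT EXISTS idx_staging_unprocessed ON staging_blocks(processed, height);",
        )?;
        Ok(())
    }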
I realize that there's a lot here, but time really is of the essence with the mempool being as full as it is. I have three nodes running with this right now (one from boot-up, two from existing chain state), and they're all doing fine.

…ain blocks while processing one reward cycle, so we can process stacks blocks concurrently as well.
…the next reward cycle to sync block data from by looking at the number of sortitions in the remote peer's block inventory, instead of the number of reward cycles in its PoX bit vector
…nce we bind to 0.0.0.0 but sometimes localhost resolves to ::1, leading to HTTP connection failures
…tbound peer does not know about, push it directly to them (don't advertize it via an *Available message). Add tests to verify that push-exclusive behavior works.
@jcnelson jcnelson changed the base branch from master to develop January 9, 2022 03:54
@codecov

codecov bot commented Jan 10, 2022

Codecov Report

Merging #2989 (c395794) into develop (458cc9d) will decrease coverage by 0.14%.
The diff coverage is 61.95%.


@@             Coverage Diff             @@
##           develop    #2989      +/-   ##
===========================================
- Coverage    82.68%   82.54%   -0.15%     
===========================================
  Files          242      242              
  Lines       194487   195380     +893     
===========================================
+ Hits        160821   161278     +457     
- Misses       33666    34102     +436     
Impacted Files Coverage Δ
src/burnchains/db.rs 94.83% <ø> (ø)
src/chainstate/stacks/db/mod.rs 86.78% <ø> (ø)
src/net/download.rs 41.25% <0.00%> (-0.09%) ⬇️
...t/stacks-node/src/burnchains/mocknet_controller.rs 77.05% <0.00%> (-1.71%) ⬇️
testnet/stacks-node/src/burnchains/mod.rs 52.38% <ø> (ø)
testnet/stacks-node/src/main.rs 0.62% <ø> (ø)
src/net/p2p.rs 58.46% <21.57%> (-3.13%) ⬇️
src/net/relay.rs 32.74% <28.52%> (-1.34%) ⬇️
src/net/rpc.rs 30.78% <37.09%> (+0.17%) ⬆️
src/net/mod.rs 67.36% <50.00%> (-0.12%) ⬇️
... and 59 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@kantai
Member

kantai commented Jan 14, 2022

Any progress on this?

The network has been experiencing very high orphan rates, so I think the priority of resolving #2986 should be pretty high. There may be other contributing factors, so it would be useful to figure out how much this PR would improve the current situation.

@jcnelson
Member Author

jcnelson commented Jan 15, 2022

I discovered a couple interesting things today when investigating why bootup was taking longer than I thought.

First, as you know, the chains-coordinator thread will process either a Stacks block, a burnchain block, or neither, but never both at the same time. What this means is that when the node is booting, there's no way to download and process burnchain blocks in parallel to processing Stacks blocks -- the coordinator serializes them regardless of what we do in the sync_with_indexer() method or the Neon run loop. While this PR does change the Neon run loop so that burnchain blocks are downloaded in parallel to Stacks block processing, this does not address this bottleneck.

Second, I discovered that as the bootup progresses, it becomes more and more likely that the chains-coordinator thread spends a lot of time in the handle_new_burnchain_block() loop, which queries multiple unprocessed burnchain blocks: https://github.com/blockstack/stacks-blockchain/blob/master/src/chainstate/coordinator/mod.rs#L570. In the worst cases, this stall can take on the order of 100 seconds, and the returned list of unprocessed burnchain blocks is on the order of 1,000 entries long. This loop returns multiple entries about 5% of the time during bootup, and the lists get longer as the node runs. This could be due to the fact that a slew of Stacks blocks getting processed at once can stall sortition processing, but not stall burnchain block downloads.

My node finished sync'ing yesterday; it spent most of its time processing blocks after height 30,000. I'll keep it running over the weekend. While I don't think we'll see any really interesting outlier behavior until we have a few thousand sortitions tracked, I think we should be able to determine whether or not some of the other latency-lowering tactics in this PR work by comparing it to how develop currently works. However, it will be hard to gauge the effectiveness of one of the main changes in this PR -- having the node push blocks to peers that want them -- without running several public nodes with it applied. Currently, I'm only running this PR on a NAT'ed node. I'll try spinning up a public one as well tomorrow; even if only a few public nodes run this new pushing-behavior PR, we should start to see a reduction in sortition-to-download times.

I haven't had a chance yet to modify the miner run loop behavior to make it coalesce ProcessTenure and RunTenure requests. I'll get to that hopefully tomorrow.

@jcnelson
Member Author

Okay, codecov is misbehaving, which is causing everything to break.

@kantai
Member

kantai commented Jan 21, 2022

> Okay, codecov is misbehaving, which is causing everything to break.

Hmm -- I'd guess possibly due to the GitHub org name change

Comment on lines 1908 to 1926
            } else if let Some(next_txid) = next_last_randomized_txid_opt {
                test_debug!("No rows returned for {}", &query.last_randomized_txid);

                // no rows found
                query.last_randomized_txid = next_txid;

                // send the next page ID
                query.tx_buf_ptr = 0;
                query.tx_buf.clear();
                query.corked = true;

                test_debug!(
-                   "No more txs in query after {:?}",
+                   "Cork tx stream with next page {}",
                    &query.last_randomized_txid
                );
-               break;
+               query
+                   .last_randomized_txid
+                   .consensus_serialize(&mut query.tx_buf)
+                   .map_err(ChainstateError::CodecError)?;
Member

According to code coverage, this branch isn't covered -- is it possible to add coverage for this to the unit tests?

Member Author

Yes; will do

Comment on lines +2183 to +2190
        if let Some(txs) = txs_opt {
            debug!(
                "{:?}: Mempool sync obtained {} transactions from mempool sync, but have more",
                &self.local_peer,
                txs.len()
            );

            return Ok(Some(txs));
Member

The if let Some(...) branch here appears to not be covered

Member Author

@jcnelson jcnelson Jan 24, 2022

I don't think that code is even reachable. I will just remove it instead. Nevermind; this is supposed to be exercised through pagination. Will update the mempool sync test to do so.

Member Author

@jcnelson jcnelson Jan 24, 2022

Hmm, it's definitely getting exercised in test_mempool_sync_2_peers_paginated. I wonder why codecov didn't report it. Is it because it's an #[ignore]'ed test?

Member

Yes -- that'd be why -- I think the #[ignore] tests aren't getting executed by the code coverage unit tests.
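For readers unfamiliar with the mechanics: tests marked #[ignore] are skipped by a plain cargo test run, so a coverage job that doesn't pass -- --ignored (or --include-ignored) never executes them. A tiny illustration with hypothetical test names:

    #[cfg(test)]
    mod tests {
        #[test]
        fn fast_unit_test() {
            // Runs under a plain `cargo test`, so it shows up in coverage.
            assert_eq!(2 + 2, 4);
        }

        #[test]
        #[ignore] // Only runs with `cargo test -- --ignored` or `cargo test -- --include-ignored`.
        fn slow_network_test() {
            // A long-running test like test_mempool_sync_2_peers_paginated would live here.
        }
    }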

@MaksimalistT

Hi @jcnelson, @kantai,
My node is struggling with a high frequency of invalid blocks, and I would be happy to increase the number of txs mined per block to help the network and earn more in fees.
I'm willing to upgrade my node before the official release.
Which branches address these issues and are safe enough to use?

Member

@kantai kantai left a comment

This LGTM, thanks @jcnelson!

Contributor

@pavitthrap pavitthrap left a comment

lgtm - just some minor comments

sql_query.to_string()
};

while let Some(part) = parts.next() {
Contributor

Could you do a replace here?

Member Author

I'm not sure what you're asking. The code is constructing a new string because trying to replace ?X with "mock_arg" would in effect be doing the same thing.

@@ -2833,7 +2862,8 @@ impl PeerNetwork {
            reward_cycle_finish
        );

-       for reward_cycle in reward_cycle_start..reward_cycle_finish {
+       // go from latest to earliest reward cycle
+       for reward_cycle in (reward_cycle_finish..reward_cycle_start).rev() {
Contributor

to go from latest to earliest reward cycle, I think you would need to iterate from reward_cycle_start to reward_cycle_finish

Member Author

Ah, but reward_cycle_finish <= reward_cycle_start. The anti-entropy protocol works backwards, from latest reward cycle to earliest reward cycle.

Contributor

if reward_cycle_finish <= reward_cycle_start, wouldn't that make reward_cycle_start the latest reward cycle?

Member Author

Yes, it would. Here's an execution trace of some of the debug statements from a unit test:

$ grep -E 'Local blocks inventory|over reward cycles' /tmp/test.out
DEBG [1643228629.268898] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 6-5
DEBG [1643228629.273409] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 5 is BlocksInvData { bitlen: 5, block_bitvec: [31], microblocks_bitvec: [30] }
DEBG [1643228630.151754] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 6-5
DEBG [1643228630.155805] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 5 is BlocksInvData { bitlen: 5, block_bitvec: [3], microblocks_bitvec: [2] }
DEBG [1643228631.350438] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 5-4
DEBG [1643228631.353977] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 4 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228632.425082] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 5-4
DEBG [1643228632.428814] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 4 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228633.348980] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 4-3
DEBG [1643228633.352538] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 3 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228634.376591] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 4-3
DEBG [1643228634.380029] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 3 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228635.396177] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 3-2
DEBG [1643228635.399907] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 2 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228636.350704] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 3-2
DEBG [1643228636.354618] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 2 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228637.306919] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 2-1
DEBG [1643228637.310311] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 1 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228638.418983] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 2-1
DEBG [1643228638.423283] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 1 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228639.353688] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 1-0
DEBG [1643228639.357670] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 0 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228640.394823] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 1-0
DEBG [1643228640.399527] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 0 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228641.443450] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 6-5
DEBG [1643228641.452464] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 5 is BlocksInvData { bitlen: 5, block_bitvec: [31], microblocks_bitvec: [30] }
DEBG [1643228642.554711] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 6-5
DEBG [1643228642.560304] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 5 is BlocksInvData { bitlen: 5, block_bitvec: [15], microblocks_bitvec: [14] }
DEBG [1643228643.357928] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 5-4
DEBG [1643228643.363271] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 4 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228644.372366] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 5-4
DEBG [1643228644.376326] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 4 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228645.404950] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 4-3
DEBG [1643228645.408651] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 3 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228646.427323] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 4-3
DEBG [1643228646.432209] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 3 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228647.400470] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 3-2
DEBG [1643228647.405313] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 2 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228648.366515] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 3-2
DEBG [1643228648.371189] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 2 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228649.301985] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 2-1
DEBG [1643228649.306562] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 1 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228650.444922] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 2-1
DEBG [1643228650.449579] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 1 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228651.346471] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 1-0
DEBG [1643228651.352485] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 0 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228652.412420] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 1-0
DEBG [1643228652.417380] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4242)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 0 is BlocksInvData { bitlen: 5, block_bitvec: [0], microblocks_bitvec: [0] }
DEBG [1643228653.391679] [src/net/p2p.rs:2857] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: run protocol for 1 neighbors, over reward cycles 6-5
DEBG [1643228653.400161] [src/net/p2p.rs:2879] [ThreadId(2)] local.80000000://(bind=127.0.0.1:4240)(pub=None): AntiEntropy: Local blocks inventory for reward cycle 5 is BlocksInvData { bitlen: 5, block_bitvec: [31], microblocks_bitvec: [30] }

As you can see, the system starts from the highest reward cycle, looks for data to push, and on the next pass starts from the next-highest reward cycle, and so on until it wraps around. The idea is that other nodes are most likely to be missing recent blocks, so the anti-entropy system should prioritize them first.
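As a tiny standalone illustration of the range direction discussed above (plain Rust, independent of the node's code): with reward_cycle_finish <= reward_cycle_start, the reversed half-open range walks from reward_cycle_start - 1 down to reward_cycle_finish.

    fn main() {
        // Mirrors the "over reward cycles 6-5" window in the trace: start = 6, finish = 5.
        let (reward_cycle_start, reward_cycle_finish) = (6u64, 5u64);
        let visited: Vec<u64> = (reward_cycle_finish..reward_cycle_start).rev().collect();
        assert_eq!(visited, vec![5]); // only reward cycle 5 is scanned in this pass

        // A wider window makes the latest-to-earliest order obvious.
        let wider: Vec<u64> = (0..6u64).rev().collect();
        assert_eq!(wider, vec![5, 4, 3, 2, 1, 0]);
    }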


Successfully merging this pull request may close these issues.

[net] high latency when fetching blocks in certain cases