Feat/stackerdb discovery #3552
…orts as serving, so we can go and fetch state from them later
…their stacker DBs to each other
…work into a NeighborSet trait, and make the NeighborWalk implement this trait. Use the trait methods for sending and receiving messages, so the NeighborWalk state machine can focus more on reacting to neighbors and less on the low-level socket implementation details. Also, expand the test framework so that even-port nodes are stackerdb-aware nodes, and odd-port nodes are not, and make it so each topology test verifies that all stackerdb-aware nodes learn of each others' dbs
Codecov Report
@@ Coverage Diff @@
## develop #3552 +/- ##
===========================================
- Coverage 0.16% 0.16% -0.01%
===========================================
Files 305 310 +5
Lines 280694 281881 +1187
===========================================
Hits 469 469
- Misses 280225 281412 +1187
Thanks for these changes and the clearer refactoring in the network code. I had a few comments throughout, but I think the stackerdb changes need more testing coverage -- does the included testing handle updating stackerdb entries, deletion of a peer, inserting a peer on an existing slot, etc? I think those behaviors need unit test coverage.
src/net/db.rs
Outdated
Ok(ret)
}

/// Get stacker DBs for a slot
I think that referring to what's returned here as "stacker dbs" is kind of confusing. My understanding is that this is "identifying the set of StackerDBs which are tracked by the peers stored in the frontier slots `used_slots`".
Bump on this -- I think the usage of "stacker db" in the code currently is very confusing. It's being used ambiguously to refer both to the data being tracked and to the contract identifier of the contract governing that data. There are many places where a comment says something like "get stacker dbs for a slot" but it means something more like "get the ContractIDs whose associated databases are being tracked by the peers stored in the slots `used_slots`" (because a "slot" is actually itself a reference). The methods and fields in this file need better rustdocs.
There were a number of unaddressed comments from the spring, and I had a few more comments as well.
Sorry, I hit the re-request review button by accident. The
only thing that changed here was that I merged the stacker db messages
branch into it.
…On Mon, Jul 24, 2023, 5:37 PM Aaron Blankstein ***@***.***> wrote:
***@***.**** commented on this pull request.
There were a number of unaddressed comments from the spring, and I had a
few more comments as well.
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> + let mut ret = None;
+ loop {
+ match PeerDB::get_schema_version(tx) {
+ Ok(version) => {
+ if ret.is_none() {
+ ret = Some(version.clone());
+ }
+ if version == "1" {
+ PeerDB::apply_schema_2(tx)?;
+ } else if version == expected_version {
+ return Ok(ret.expect("unreachable"));
+ } else {
+ panic!("The schema version of the peer DB is invalid.")
+ }
+ }
+ Err(e) => panic!("Error obtaining the version of the peer DB: {:?}", e),
+ }
+ }
I'd recommend eliminating nesting, the necessity of a mutable option for a
non-optional return, and the use of loop:
⬇️ Suggested change
- let mut ret = None;
- loop {
- match PeerDB::get_schema_version(tx) {
- Ok(version) => {
- if ret.is_none() {
- ret = Some(version.clone());
- }
- if version == "1" {
- PeerDB::apply_schema_2(tx)?;
- } else if version == expected_version {
- return Ok(ret.expect("unreachable"));
- } else {
- panic!("The schema version of the peer DB is invalid.")
- }
- }
- Err(e) => panic!("Error obtaining the version of the peer DB: {:?}", e),
- }
- }
+ let begin_ver = PeerDB::get_schema_version(tx).expect("Error obtaining the version of the peer DB");
+ let mut version = begin_ver.clone();
+ while version != expected_version {
+ if version == "1" {
+ PeerDB::apply_schema_2(tx)?;
+ } else {
+ panic!("The schema version of the Peer DB is invalid. Found = {}", version);
+ }
+ // Re-read the version only after a migration step has been applied
+ version = PeerDB::get_schema_version(tx).expect("Error obtaining the version of the peer DB");
+ }
+ Ok(begin_ver)
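To exercise the loop-based migration idea in isolation, here is a minimal sketch. `MockDb`, `apply_migrations`, and the `String` error type are hypothetical stand-ins and not the `PeerDB` API, and the sketch returns an error on an unknown version where the code under review panics. It only shows that checking the current version before re-reading it lets each step run once and stop at the expected version.

```rust
/// Hypothetical stand-in for a database that records its schema version.
struct MockDb {
    version: String,
}

impl MockDb {
    fn get_schema_version(&self) -> Result<String, String> {
        Ok(self.version.clone())
    }

    /// Pretend migration from schema "1" to schema "2".
    fn apply_schema_2(&mut self) -> Result<(), String> {
        self.version = "2".to_string();
        Ok(())
    }
}

/// Apply migrations until `expected` is reached.
/// Returns the version the database had before any migration ran.
fn apply_migrations(db: &mut MockDb, expected: &str) -> Result<String, String> {
    let begin_ver = db.get_schema_version()?;
    let mut version = begin_ver.clone();
    while version != expected {
        match version.as_str() {
            "1" => db.apply_schema_2()?,
            other => return Err(format!("invalid schema version: {}", other)),
        }
        // Re-read only after a migration step has been applied.
        version = db.get_schema_version()?;
    }
    Ok(begin_ver)
}
```

Calling `apply_migrations` on a database that is already at the expected version performs no work and returns that version unchanged.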
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> + )
+ .optional()?
+ .unwrap_or("1".to_string());
+ Ok(version)
+ }
+
+ fn apply_schema_2<'a>(tx: &Transaction<'a>) -> Result<(), db_error> {
+ debug!("Apply schema 2 to peer DB");
+ for row_text in PEERDB_SCHEMA_2 {
+ tx.execute_batch(row_text).map_err(db_error::SqliteError)?;
+ }
+ Ok(())
+ }
+
+ fn apply_schema_migrations<'a>(tx: &Transaction<'a>) -> Result<String, db_error> {
+ debug!("Apply any schema migrations");
This debug line seems spurious. This will echo on every open of the
PeerDB, correct? There's already other debug lines that echo whenever the
PeerDB opens.
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> @@ -872,12 +1016,36 @@ impl PeerDB {
Ok(allow_rows)
}
+ /// Insert or replace stacker DBs for a peer, given its slot
+ pub fn insert_or_replace_stacker_dbs<'a>(
+ tx: &mut Transaction<'a>,
+ slot: u32,
+ smart_contracts: &[ContractId],
+ ) -> Result<(), db_error> {
+ for cid in smart_contracts {
+ debug!("Add Stacker DB to slot {}: {}", slot, cid);
Can you consolidate this into a single debug line for the function call,
rather than one per stacker db?
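A possible shape for the consolidated message, as a sketch only: `describe_stacker_db_update` is a hypothetical helper, and plain `String` values stand in for the contract ID type used in the PR.

```rust
/// Build one log message for the whole call instead of one per contract.
/// `contract_ids` uses plain strings here; the real code has its own ID type.
fn describe_stacker_db_update(slot: u32, contract_ids: &[String]) -> String {
    format!(
        "Add Stacker DBs to slot {}: [{}]",
        slot,
        contract_ids.join(", ")
    )
}
```

The resulting string could then be passed to a single `debug!` call before the loop.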
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> @@ -1532,6 +1933,103 @@ mod test {
}
}
+ #[test]
+ fn test_insert_or_replace_stacker_dbs() {
For each of these new tests, can you add a comment describing what the
test is intending to cover? What is the scenario being tested? Just a
sentence or two about what the test is supposed to be orchestrating would
be helpful when figuring out what situations are being covered.
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> @@ -363,6 +389,28 @@ const PEERDB_INITIAL_SCHEMA: &'static [&'static str] = &[
const PEERDB_INDEXES: &'static [&'static str] =
&["CREATE INDEX IF NOT EXISTS peer_address_index ON frontier(network_id,addrbytes,port);"];
+const PEERDB_SCHEMA_2: &'static [&'static str] = &[
+ r#"
+ CREATE TABLE stackerdb_peers(
+ smart_contract_id TEXT NOT NULL,
+ peer_slot INTEGER NOT NULL,
Okay, but slot *is* the primary key of the frontier table unless I'm
really misreading something.
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> + let slots = PeerDB::peer_slots(conn, network_id, addrbytes, port)?;
+ let mut ret = vec![];
+ for slot in &slots {
+ let used_slot = PeerDB::has_peer_at(conn, network_id, *slot)?;
I am now kind of more confused by this method.
A peer has multiple slots where it *could* land -- so PeerDB::peer_slots
returns 8 (or whatever) results. Then has_peer_at in line 1297 just
checks if there's *any* peer at the returned slot.
But in the case of a peer B getting stored in one of peer A's alternates,
wouldn't this method return a slot corresponding to a different peer? Why
is this method returning a vec of slots at all?
And looking at how this is being used, it seems like this would be
undesirable (i.e., Peer A being dropped doesn't mean that Peer B's
"stackerdb information" should be deleted, right?)
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> + let slots = PeerDB::peer_slots(conn, network_id, addrbytes, port)?;
+ let mut ret = vec![];
+ for slot in &slots {
+ let used_slot = PeerDB::has_peer_at(conn, network_id, *slot)?;
So, why couldn't this method be just the following?
SELECT slot FROM frontier WHERE network_id = ? AND port = ? AND addrbytes
= ?
The current method seems to be explicitly trying to include slots that the
input peer *isn't* occupying.
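The difference between the two lookups can be sketched with an in-memory model. Everything here is hypothetical: `PeerKey`, `Frontier`, and the port-based `candidate_slots` are simplified stand-ins (the real key also carries a network ID and the real slot list comes from a hash), chosen only to show how an "occupied candidate slot" check can return a slot held by a different peer.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified peer identity.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PeerKey {
    addr: String,
    port: u16,
}

/// Hypothetical frontier table: slot number -> peer stored there.
type Frontier = HashMap<u32, PeerKey>;

/// Candidate slots for a peer. Stand-in for a hash-derived list.
fn candidate_slots(peer: &PeerKey, num_slots: u32) -> Vec<u32> {
    let base = peer.port as u32;
    (0..3).map(|i| (base + i) % num_slots).collect()
}

/// Mirrors the questioned behavior: any occupied candidate slot is returned,
/// even when a different peer lives there.
fn occupied_candidate_slots(frontier: &Frontier, peer: &PeerKey, num_slots: u32) -> Vec<u32> {
    candidate_slots(peer, num_slots)
        .into_iter()
        .filter(|slot| frontier.contains_key(slot))
        .collect()
}

/// Mirrors the proposed query: only slots that hold this exact peer.
fn slots_held_by(frontier: &Frontier, peer: &PeerKey) -> Vec<u32> {
    let mut slots: Vec<u32> = frontier
        .iter()
        .filter(|(_, p)| *p == peer)
        .map(|(slot, _)| *slot)
        .collect();
    slots.sort();
    slots
}
```

With peer A stored in slot 10 and peer B stored in slot 11, where 11 is also one of A's candidates, the first function reports both slots for A while the second reports only slot 10.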
------------------------------
In src/net/db.rs
<#3552 (comment)>
:
> + network_id: u32,
+ addrbytes: &PeerAddress,
+ port: u16,
+ ) -> Result<Vec<u32>, db_error> {
+ let slots = PeerDB::peer_slots(conn, network_id, addrbytes, port)?;
+ let mut ret = vec![];
+ for slot in &slots {
+ let used_slot = PeerDB::has_peer_at(conn, network_id, *slot)?;
+ if used_slot {
+ ret.push(*slot);
+ }
+ }
+ Ok(ret)
+ }
+
+ /// Get stacker DBs for a slot
Bump on this -- I think the usage of "stacker db" in the code currently is
very confusing. It's being used ambiguously to refer both the data being
tracked and the contract identifier of the contract governing that data.
There are many places where a comment says something like "get stacker dbs
for a slot" but it means something more like "get the ContractIDs whose
associated databases are being tracked by the peers stored in the slots
used_slots" (because a "slot" is actually itself a reference). The
methods and fields in this file need better rustdocs.
… own file, and define traits which network state-machines can implement in order to gain this abstraction layer
… I/O into their own methods (first stab)
Thanks for all the feedback @kantai! I've addressed all the points.
…ck how we use peer slots)
… a slow walk that's still ongoing
…ata we save from non-outbound walks
Thanks, LGTM!
LGTM. I really like this refactoring! Just had a few very minor comments.
stackslib/src/net/mod.rs
Outdated
let indexer = BitcoinIndexer::new_unit_test(&self.config.burnchain.working_dir);
// let indexer = BitcoinIndexer::new_unit_test(&self.config.burnchain.working_dir);
Can this be deleted?
Yup!
stackslib/src/net/neighbors/comms.rs
Outdated
impl ToNeighborKey for NeighborAddress {
fn to_neighbor_key(&self, network: &PeerNetwork) -> NeighborKey {
// NOTE: PartialEq and Hash for NeighborKey ignore the low bits of peer version
// and ignore network ID, and the CovnersationP2P ensures that we never even connect
- // and ignore network ID, and the CovnersationP2P ensures that we never even connect
+ // and ignore network ID, and the ConversationP2P ensures that we never even connect
Thanks!
stackslib/src/net/neighbors/db.rs
Outdated
}

impl NeighborWalkDB for PeerDBNeighborWalk {
/// implements get_fresh_random_neighbors
Are these "implements ..." comments really helpful? Are they just meant to be placeholders?
Will remove -- they're an artifact from an earlier version of this code
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
This PR implements neighbor discovery for Stacker DB-aware Stacks nodes. Stacks nodes report to each other via `StackerDBHandshakeAccept` which DBs they replicate (stored to the `LocalPeer` table in `PeerDB`; ultimately will be set by the config file). Stacks nodes remember this information in the `PeerDB` as well, so that they can report which nodes replicate which DBs.

This PR also does some necessary refactoring work to the `NeighborWalk` state machine to separate out the low-level code for sending and receiving messages via the `PeerNetwork` from sending and reacting to messages. This will be used in a subsequent PR to simplify the implementation of the Stacker DB state replication logic.
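The separation described here can be illustrated with a small sketch. It is an assumption-laden toy, not the PR's code: `Message`, `NeighborComms`, `MockComms`, and `discover_dbs` are hypothetical names, and the message enum is far simpler than the real protocol. The idea shown is that the state-machine step is written only against a trait, so the transport can be swapped for an in-memory mock in tests.

```rust
use std::collections::VecDeque;

/// Hypothetical message type; the real protocol messages are far richer.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Handshake,
    HandshakeAccept { stacker_dbs: Vec<String> },
}

/// Hypothetical trait that hides how messages are sent and received.
trait NeighborComms {
    fn send(&mut self, neighbor: &str, msg: Message);
    fn try_recv(&mut self) -> Option<(String, Message)>;
}

/// In-memory implementation for tests: every handshake is answered
/// with a fixed list of replicated DBs.
struct MockComms {
    inbox: VecDeque<(String, Message)>,
    dbs: Vec<String>,
}

impl NeighborComms for MockComms {
    fn send(&mut self, neighbor: &str, msg: Message) {
        if msg == Message::Handshake {
            let reply = Message::HandshakeAccept {
                stacker_dbs: self.dbs.clone(),
            };
            self.inbox.push_back((neighbor.to_string(), reply));
        }
    }

    fn try_recv(&mut self) -> Option<(String, Message)> {
        self.inbox.pop_front()
    }
}

/// State-machine step written only against the trait: handshake with each
/// neighbor and record which DBs each one reports.
fn discover_dbs<C: NeighborComms>(
    comms: &mut C,
    neighbors: &[&str],
) -> Vec<(String, Vec<String>)> {
    for n in neighbors {
        comms.send(n, Message::Handshake);
    }
    let mut found = Vec::new();
    while let Some((from, msg)) = comms.try_recv() {
        if let Message::HandshakeAccept { stacker_dbs } = msg {
            found.push((from, stacker_dbs));
        }
    }
    found
}
```

Because `discover_dbs` never touches sockets, the same logic could run against a network-backed implementation of the trait without changes.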