-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved Atlas shutdown and crash tolerance (#3082) #3086
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #3086 +/- ##
============================================
+ Coverage 31.26% 60.42% +29.16%
============================================
Files 298 240 -58
Lines 275135 131041 -144094
============================================
- Hits 86012 79178 -6834
+ Misses 189123 51863 -137260
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
src/net/atlas/db.rs
Outdated
const ATLASDB_SCHEMA_2: &'static [&'static str] = &[ | ||
r#" | ||
ALTER TABLE attachment_instances | ||
ADD status INTEGER NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised sqlite doesn't force you to provide a default value here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't. I removed "NOT NULL", because the default value defeats the point of not null here anyways. Also, added a test where the database actually undergoes migration.
.queue_attachment_instance(attachment_instance) | ||
.unwrap(); | ||
atlas_db | ||
.mark_attachment_instance_checked(attachment_instance, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there test coverage for simulating a crash and recovery? i.e. Is there a test somewhere that convinces us that if the node crashed and was restarted, the attachment system would discover that there were attachments queued but not checked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I can try to add something along those lines. It won't be a "crash", but instead a stop-and-restart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in d3bf291: chainstate::coordinator::tests::atlas_stop_start
src/chainstate/coordinator/mod.rs
Outdated
Ok(_) => {} | ||
Err(e) => { | ||
error!("Atlas: error dispatching attachments {}", e); | ||
if let Some(atlas_db) = self.atlas_db.as_mut() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but doesn't this call to queue_attachment_instance()
happen outside of block-processing? Like, if a node processed a block and then crashed before reaching this code path, would it be able to enqueue the block's attachment data on restart?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to commit to the attachment data from the event stream from within StacksChainState::append_block()
, right before we commit to the Stacks block? Enqueuing attachment instances is idempotent, so re-doing them in the event the node crashes in-between saving them and committing the block should be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I'd like to do that in a separate PR. It's true that this change doesn't make the atlas queue entirely crash proof, but it removes the 2-thread dependency for the atlas queue, which makes the shutdown behavior much better already.
This PR is already changing a lot of how Atlas works, so I'd prefer to not also move atlas queueing logic into append_block
as part of the same changeset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did an initial pass. My biggest question is whether or not this PR allows the node to recover if it crashes while in the middle of the block. I can't convince myself that it does -- it seems this saves attachment events after the block that produced them is committed, which I think means that there's a window of time in which the block is saved, but the attachments are not. This would make it possible for the node to crash in such a way that it loses attachment events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except +1 on Jude's comment on append_block
This needs to be looked at. |
Okay -- this PR should be ready for re-review! |
pub fn insert_uninstantiated_attachment_instance( | ||
/// Queue a new attachment instance, status will be set to "queued", | ||
/// and the is_available field set to false | ||
pub fn queue_attachment_instance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you document which thread calls this function? Same for insert_initial_attachment_instance
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, however, I'm somewhat hesitant to document the caller from the called method. If atlas event processing is moved from the coordinator's purview to db::blocks
or whatever, that move can occur without ever touching this function, which could easily lead to a stale rustdoc here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in cb9f8dc
/// Check any queued attachment instances to see if we already have data for them, | ||
/// returning a vector of (instance, attachment) pairs for any of the queued attachments | ||
/// which already had the associated data | ||
/// Marks any processed attachments as checked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which thread calls this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is invoked in the thread managing the AttachmentDownloader. This is currently the P2P thread. Will add to the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in cb9f8dc
src/net/atlas/db.rs
Outdated
/// is completed, any checked instances are updated to `Checked`. | ||
pub enum AttachmentInstanceStatus { | ||
/// This variant indicates that the attachments instance has been written, | ||
/// but the downloader has not yet checked that the attachment matched |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "downloader" you mean the "downloader in the p2p thread", right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the AttachmentsDownloader
, will update the comment to make more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in cb9f8dc
@@ -5393,7 +5392,7 @@ impl PeerNetwork { | |||
// enqueue them. | |||
PeerNetwork::with_attachments_downloader(self, |network, attachments_downloader| { | |||
let mut known_attachments = attachments_downloader | |||
.enqueue_new_attachments(attachment_requests, &mut network.atlasdb, false) | |||
.check_queued_attachment_instances(&mut network.atlasdb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be moved to the relayer thread, as part of the process_network_result()
call? In general, I'd keep as much unbound disk I/O out of the p2p thread as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine if you don't do this in this PR. If not, then please open an issue and assign it to me to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open an issue for that -- it would entail moving the atlas downloader from the PeerNetwork struct into the Relayer struct.
@@ -47,7 +52,16 @@ use crate::types::chainstate::StacksBlockId; | |||
|
|||
use super::{AtlasConfig, Attachment, AttachmentInstance}; | |||
|
|||
pub const ATLASDB_VERSION: &'static str = "1"; | |||
pub const ATLASDB_VERSION: &'static str = "2"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think you could add some file-level documentation about the states an attachment goes through before it's either committed or rejected from the DB? I'm having a hard time understanding from the current comments what states an attachment can be in. At maximum, it looks like there are four, but not all of them make semantic sense:
- (unavailable, queued): The node got the attachment event, but has not yet received any data for it
- (available, queued): The node got the data, but not the corresponding attachment event?
- (unavailable, checked): ???
- (available, checked): The node saw the attachment event, has the data, and has authenticated it
If I'm understanding this right, you've just moved the attachment channel to the DB. The chains coordinator logs all attachment events it encounters straight into the DB, with the Queued
status. Later on, when the p2p thread receives attachment data, it authenticates the data against "queued" attachment events and if the data matches one or more of them, it (1) marks the attachment as checked, (2) stores the attachment data, and (3) marks it as available. Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added module level rustdocs (and more information to the other rustdocs) that talks more about this (cb9f8dc)
"Availability" and "checked" have little to do with each other (except that the availability field has no meaning until an attachment has been checked). "is available" means that the atlas content is already stored in the stacks-node's atlas database. "checked" means that the atlas downloader has acknowledged that a new content hash has been added to the system (i.e., a new attachment instance). So, "(unavailable, checked)" is indeed possible and part of the normal flow of the system: it just means that the atlas downloader hasn't downloaded the corresponding content yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kantai! LGTM
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Description
This PR replaces the rx/tx synchronized channel used to broadcast
AttachmentInstance
s from thechain-coordinator
thread to thep2p
thread with direct writes to the AtlasDB. This PR preserves thep2p
thread's behavior -- new attachment instances are written as "queued" by the coordinator, but aren't checked against extant attachment data until thep2p
thread passes over the queued instances. After this, the behavior is exactly the same as before, with new attachment notices being passed through the network result object.Applicable issues
Additional info (benefits, drawbacks, caveats)
Checklist