Keep existing blocks when restoring a Snapshot #8643

ngotchac · 2018-05-16T14:00:00Z

To restore existing blocks, it iterates over the blocks of the current DB before swapping it with the snapshot one, starting at block 1 (or the first needed block) and continuing until all the blocks have been imported.

To test the feature, one can sync without warp-sync for a few thousand blocks, then restart with --warp-barrier.

Block restoration seemed pretty fast (~15 seconds for 150_000 blocks); if it proves too slow, it could be improved by skipping caches.

rphmeier · 2018-05-16T14:57:21Z

ethcore/src/snapshot/service.rs

+
+					// Writting changes to DB and logging every now and then
+					if block_number % 1_000 == 0 {
+						next_db.write_buffered(batch);


write_buffered will keep them in memory. we need to flush periodically
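To illustrate the concern, here is a minimal sketch of the difference between buffering and flushing. `BufferedDb`, `migrate`, `buffered_len` and `persisted_len` are hypothetical stand-ins written for this example, not the real database API; only the `write_buffered` name is taken from the diff above, and `flush_every` is assumed to be non-zero.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a key-value store with an in-memory write buffer.
#[derive(Default)]
pub struct BufferedDb {
    buffer: Vec<(u64, Vec<u8>)>,
    disk: HashMap<u64, Vec<u8>>,
}

impl BufferedDb {
    /// Queue writes in memory only; nothing reaches `disk` yet.
    pub fn write_buffered(&mut self, batch: Vec<(u64, Vec<u8>)>) {
        self.buffer.extend(batch);
    }

    /// Move everything queued so far to `disk` and empty the buffer.
    pub fn flush(&mut self) {
        for (key, value) in self.buffer.drain(..) {
            self.disk.insert(key, value);
        }
    }

    pub fn buffered_len(&self) -> usize {
        self.buffer.len()
    }

    pub fn persisted_len(&self) -> usize {
        self.disk.len()
    }
}

/// Migrate blocks `1..=count`, flushing every `flush_every` blocks so the
/// buffer never holds more than `flush_every` entries.
/// Returns the largest buffer size observed (assumes `flush_every > 0`).
pub fn migrate(db: &mut BufferedDb, count: u64, flush_every: u64) -> usize {
    let mut batch = Vec::new();
    let mut peak: usize = 0;
    for number in 1..=count {
        batch.push((number, number.to_be_bytes().to_vec()));
        if number % flush_every == 0 {
            db.write_buffered(std::mem::take(&mut batch));
            peak = peak.max(db.buffered_len());
            db.flush();
        }
    }
    // Write and flush whatever is left after the last full batch.
    db.write_buffered(batch);
    peak = peak.max(db.buffered_len());
    db.flush();
    peak
}
```

Without the `flush` calls inside the loop, the buffer in this sketch would grow to hold every migrated block before anything is persisted.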

rphmeier · 2018-05-16T14:58:10Z

ethcore/src/snapshot/service.rs

+								let block_receipts = block_receipts.receipts;
+
+								next_chain.insert_unordered_block(&mut batch, &raw_block, block_receipts, parent_diff, false, true);
+								parent_diff = Some(diff);


parent_diff is meant to be a total difficulty

Oops, right...
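For reference, a minimal sketch of the distinction: the value passed along should be the running sum of difficulties, not the difficulty of the previous block alone. `Header` and `total_difficulties` are hypothetical names for this example, and `u128` stands in for the 256-bit integer the real code would use.

```rust
/// Hypothetical minimal header: only the field needed for this example.
pub struct Header {
    pub difficulty: u128,
}

/// Given the total difficulty of the parent of `headers[0]`, return the
/// total difficulty of each header in order. Each entry is the parent's
/// total plus the block's own difficulty.
pub fn total_difficulties(parent_total: u128, headers: &[Header]) -> Vec<u128> {
    let mut total = parent_total;
    headers
        .iter()
        .map(|header| {
            total += header.difficulty;
            total
        })
        .collect()
}
```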

rphmeier · 2018-05-16T15:09:14Z

ethcore/src/snapshot/service.rs

+		// Try to include every block that will need to be downloaded from the current chain
+		// Break when no more blocks are available from it.
+		match (next_chain_info.ancient_block_number, next_chain_info.first_block_number) {
+			(Some(next_ancient_block), Some(next_first_block)) if next_ancient_block + 1 < next_first_block => {


ancient_block_number is always going to be 1 (or 0, can't remember) after a restoration, isn't it?

the blocks we want to import from the old chain: anything available from genesis to first_block_number, but maintaining the invariant that only a single gap exists.

Yep that's right, it's supposed to always be 1. It does ensure that there is still only one gap of blocks.

rphmeier · 2018-05-16T15:13:49Z

ethcore/src/snapshot/service.rs

+					next_ancient_block, next_first_block,
+				);
+
+				let mut block_number = next_ancient_block + 1;


really needs to check that the first ancient block we pull from the old client has the right parent_hash.

rphmeier · 2018-05-16T15:15:22Z

ethcore/src/snapshot/service.rs

+						}
+					// Break if we already imported some blocks in the current batch and there
+					// are no more left
+					} else if parent_diff.is_some() {


why only break if parent_diff is Some? if there are no more blocks we can import it seems like it should always break the loop.

Yep that's right, didn't push this change yet.

rphmeier · 2018-05-16T15:16:06Z

ethcore/src/snapshot/service.rs

+				while block_number < next_first_block {
+					let chain = cur_chain.read();
+
+					if let Some(block_hash) = chain.block_hash(block_number) {


we don't have an exclusive lock on the chain's DB. this is racing with client's normal block import. (in practice this isn't a problem right now but let's not lean on assumptions from outside the module)

Should it lock the DB for the time of the restoration? The blocks won't change during it.

It should rather reference blocks by hash as opposed to number. Hash -> Block is stable, Number -> Block varies.

This is easiest when we import blocks in reverse because the parent_hash is always available. But then we should always make sure that next_first_block.parent_hash corresponds to the hash of a block that we import as its parent.

So could it just be a check that, for each block, the parent's hash is the one expected, i.e. the one we got from the previous block? If there is a mismatch, then just stop right there.

rphmeier · 2018-05-16T16:46:22Z

Since GitHub marks that comment as hidden I'll continue here:

So could it just be a check that, for each block, the parent's hash is the one expected, i.e. the one we got from the previous block? If there is a mismatch, then just stop right there.

sure, but that's a little weird because we know that those blocks are still probably in the DB, just that there was a reorg in the meantime

ngotchac · 2018-05-16T19:41:15Z

But do we keep blocks in the DB that have been reorged out?

rphmeier · 2018-05-16T19:59:10Z

yes. because there could be a reorg back at any point, there is no reason to discard them.

ngotchac · 2018-05-17T16:09:45Z

@rphmeier I updated the PR so that it starts at the best available ancient block and iterates backwards from the parent's hash.
It actually also fixes an issue with resuming snapshot restoration with all the needed chunks (it would previously hang).
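A rough sketch of the backward walk described here, under simplifying assumptions: `Hash` is a `u64` stand-in for a real block hash, and `Block` and `collect_ancient` are hypothetical names, not the code in this PR. Because each step looks up the block by its parent's hash, blocks on other forks that share a block number are never picked up.

```rust
use std::collections::HashMap;

/// Stand-in for a 256-bit block hash.
pub type Hash = u64;

#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub hash: Hash,
    pub parent_hash: Hash,
    pub number: u64,
}

/// Walk backwards through `old_chain` (keyed by hash), starting at
/// `start_parent`, the parent hash of the first block in the snapshot.
/// Stops at the first hash that is missing from the old chain, or once
/// block number `lowest` has been collected.
pub fn collect_ancient(
    old_chain: &HashMap<Hash, Block>,
    start_parent: Hash,
    lowest: u64,
) -> Vec<Block> {
    let mut collected = Vec::new();
    let mut next = start_parent;
    while let Some(block) = old_chain.get(&next) {
        collected.push(block.clone());
        if block.number <= lowest {
            break;
        }
        next = block.parent_hash;
    }
    collected
}
```

If a block is missing partway down, the walk simply stops, so the migrated blocks stay contiguous with the snapshot's first block.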

debris

This is a very important pr, but imo still requires some polishing :)

debris · 2018-05-25T11:26:56Z

ethcore/src/client/client.rs

@@ -845,6 +845,11 @@ impl Client {
 		*self.exit_handler.lock() = Some(Box::new(f));
 	}

+	/// Returns the chain reference
+	pub fn chain(&self) -> &RwLock<Arc<BlockChain>> {


It's a bad practice to expose &RwLock. There is no guarantee this will not deadlock
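One possible alternative, sketched under assumptions: instead of handing out `&RwLock`, the accessor takes the read lock internally and returns a cloned `Arc`, so callers never hold a guard. This uses `std::sync::RwLock` for self-containment (the codebase may use a different lock type), and `BlockChain`, `best_number` and `replace_chain` here are simplified, hypothetical stand-ins.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the real chain type.
pub struct BlockChain {
    pub best_number: u64,
}

pub struct Client {
    chain: RwLock<Arc<BlockChain>>,
}

impl Client {
    pub fn new(chain: BlockChain) -> Self {
        Client { chain: RwLock::new(Arc::new(chain)) }
    }

    /// Return a cloned handle to the current chain. The read guard is
    /// dropped when this function returns, so callers cannot keep the
    /// lock held by accident.
    pub fn chain(&self) -> Arc<BlockChain> {
        let guard = self.chain.read().expect("chain lock poisoned");
        Arc::clone(&*guard)
    }

    /// Swap in a new chain; handles returned earlier keep the old one alive.
    pub fn replace_chain(&self, chain: BlockChain) {
        *self.chain.write().expect("chain lock poisoned") = Arc::new(chain);
    }
}
```

The trade-off is that a caller may observe a chain that has since been replaced, which is usually easier to reason about than lock ordering across modules.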

debris · 2018-05-25T11:30:50Z

ethcore/src/snapshot/service.rs

@@ -16,6 +16,7 @@

 //! Snapshot network service implementation.

+// use std::cmp;


should be removed

debris · 2018-05-25T11:33:54Z

ethcore/src/snapshot/service.rs

@@ -220,7 +221,7 @@ pub struct ServiceParams {
 	/// Usually "<chain hash>/snapshot"
 	pub snapshot_root: PathBuf,
 	/// A handle for database restoration.
-	pub db_restore: Arc<DatabaseRestore>,
+	pub client: Arc<Client>,


This contradicts @0x7CFE's refactor of ethcore. We don't want any module to require Client, only the interface that is actually being used.
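A sketch of what depending on an interface could look like. Only the `DatabaseRestore` and `ServiceParams` names come from the diff above; the `restore_db` signature, `BlockSource`, `SnapshotClient` and `MockClient` are assumptions made up for this example.

```rust
use std::sync::Arc;

/// Hypothetical read-only view of the blocks the snapshot service needs.
pub trait BlockSource: Send + Sync {
    fn block_hash(&self, number: u64) -> Option<u64>;
}

/// Database restoration handle; the method signature is assumed here.
pub trait DatabaseRestore: Send + Sync {
    fn restore_db(&self, new_db: &str) -> Result<(), String>;
}

/// Hypothetical combined trait: everything the service uses, nothing more.
pub trait SnapshotClient: BlockSource + DatabaseRestore {}
impl<T: BlockSource + DatabaseRestore> SnapshotClient for T {}

pub struct ServiceParams {
    /// Any type implementing the traits works, including test doubles.
    pub client: Arc<dyn SnapshotClient>,
}

/// Test double that never touches a real database.
pub struct MockClient;

impl BlockSource for MockClient {
    fn block_hash(&self, number: u64) -> Option<u64> {
        Some(100 + number)
    }
}

impl DatabaseRestore for MockClient {
    fn restore_db(&self, _new_db: &str) -> Result<(), String> {
        Ok(())
    }
}
```

With this shape, the service can be tested against `MockClient` without constructing a full client.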

debris · 2018-05-25T11:39:52Z

ethcore/src/snapshot/tests/service.rs

@@ -103,6 +102,9 @@ fn restored_is_equivalent() {

 #[test]
 fn guards_delete_folders() {
+	let gas_prices = vec![1.into(), 2.into(), 3.into(), 999.into()];
+	let client = generate_dummy_client_with_spec_and_data(Spec::new_null, 400, 5, &gas_prices);


iirc, this helper function generates a client with 400 blocks. you should check in this test if they were actually migrated correctly

I'm not sure I understand. This test only tests if the guarded folders are deleted.

from #6350

When restoring from a snapshot parity should try and re-import ancient blocks existing in the current database starting from genesis.

We need a test for that ;)

Ah yeah sure, I thought you were commenting on the guards_delete_folders test!

debris · 2018-05-25T11:48:48Z

ethcore/src/blockchain/blockchain.rs

@@ -838,6 +827,80 @@ impl BlockChain {
 		}
 	}

+	/// Update the best ancient block to the given hash, after checking that
+	/// it's directly linked to the currently known best ancient block
+	pub fn update_best_ancient_block(&self, hash: &H256) {


This function is called only from a single place. Is it possible for hash to not be linked to any known ancient block? If yes, what does it mean? Should it be handled somehow? Currently the result of this function execution is unknown which makes it very difficult to debug

So this function starts from the given hash and ensures there is a link between the block at that hash and the last known best ancient block. Thus, it only updates the best ancient block if there is a link.
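One way to make the outcome visible to the caller, as a sketch only: return a `Result` that says whether the link was found and, if not, where the walk stopped. `links_to` is a hypothetical helper, hashes are `u64` stand-ins, and the map from hash to parent hash stands in for header lookups; this is not the function in the PR.

```rust
use std::collections::HashMap;

/// Walk parent links from `from` until `best_ancient` is reached.
/// `parents` maps a block hash to its parent's hash.
/// Returns `Ok(steps)` with the number of links followed, or `Err(hash)`
/// with the hash at which the walk stopped without finding a link.
pub fn links_to(
    parents: &HashMap<u64, u64>,
    from: u64,
    best_ancient: u64,
) -> Result<usize, u64> {
    let mut current = from;
    // A valid walk follows at most `parents.len()` links; the bound also
    // guards against malformed input that contains a cycle.
    for steps in 0..=parents.len() {
        if current == best_ancient {
            return Ok(steps);
        }
        match parents.get(&current) {
            Some(&parent) => current = parent,
            None => return Err(current),
        }
    }
    Err(current)
}
```

A caller could then update the best ancient block only on `Ok` and log the returned hash on `Err`, which would address the debuggability concern raised above.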

5chdn · 2018-10-02T09:04:07Z

Please reopen when ready

ngotchac · 2018-10-02T09:30:36Z

Rebased on master

5chdn · 2018-10-26T11:12:47Z

@debris @tomusdrw could you give this a final review?

@ngotchac sorry, could you rebase this again?

tomusdrw

Let's get this merged and tested in the wild, it's too long overdue already.

tomusdrw · 2018-10-29T21:46:07Z

ethcore/src/blockchain/blockchain.rs

+			let mut block_hash = *hash;
+			let mut is_linked = false;
+
+			loop {


^^ @ngotchac is this addressed?

tomusdrw · 2018-10-29T21:52:46Z

ethcore/sync/src/chain/supplier.rs

-			if let Some(mut receipts_bytes) = io.chain().encoded_block_receipts(&rlp.val_at::<H256>(i)?) {
+			if let Some(receipts) = io.chain().block_receipts(&rlp.val_at::<H256>(i)?) {
+				let mut receipts_bytes = ::rlp::encode(&receipts).into_vec();
+			// if let Some(mut receipts_bytes) = io.chain().encoded_block_receipts(&rlp.val_at::<H256>(i)?) {


remove please?

ascjones

Finally took the time to go through this. Looks good!

ethcore/src/snapshot/service.rs

Co-Authored-By: ngotchac <[email protected]>

ngotchac · 2018-11-19T08:33:15Z

🎉 !!!

* Rename db_restore => client
* First step: make it compile!
* Second step: working implementation!
* Refactoring
* Fix tests
* PR Grumbles
* PR Grumbles WIP
* Migrate ancient blocks interating backward
* Early return in block migration if snapshot is aborted
* Remove RwLock getter (PR Grumble I)
* Remove dependency on `Client`: only used Traits
* Add test for recovering aborted snapshot recovery
* Add test for migrating old blocks
* Fix build
* PR Grumble I
* PR Grumble II
* PR Grumble III
* PR Grumble IV
* PR Grumble V
* PR Grumble VI
* Fix one test
* Fix test
* PR Grumble
* PR Grumbles
* PR Grumbles II
* Fix tests
* Release RwLock earlier
* Revert Cargo.lock
* Update _update ancient block_ logic: set local in `commit`
* Update typo in ethcore/src/snapshot/service.rs

Co-Authored-By: ngotchac <[email protected]>

* version: bump beta to 2.2.2 * Add experimental RPCs flag (#9928) * WiP * Enable experimental RPCs. * Keep existing blocks when restoring a Snapshot (#8643) * Rename db_restore => client * First step: make it compile! * Second step: working implementation! * Refactoring * Fix tests * PR Grumbles * PR Grumbles WIP * Migrate ancient blocks interating backward * Early return in block migration if snapshot is aborted * Remove RwLock getter (PR Grumble I) * Remove dependency on `Client`: only used Traits * Add test for recovering aborted snapshot recovery * Add test for migrating old blocks * Fix build * PR Grumble I * PR Grumble II * PR Grumble III * PR Grumble IV * PR Grumble V * PR Grumble VI * Fix one test * Fix test * PR Grumble * PR Grumbles * PR Grumbles II * Fix tests * Release RwLock earlier * Revert Cargo.lock * Update _update ancient block_ logic: set local in `commit` * Update typo in ethcore/src/snapshot/service.rs Co-Authored-By: ngotchac <[email protected]> * Adjust requests costs for light client (#9925) * PIP Table Cost relative to average peers instead of max peers * Add tracing in PIP new_cost_table * Update stat peer_count * Use number of leeching peers for Light serve costs * Fix test::light_params_load_share_depends_on_max_peers (wrong type) * Remove (now) useless test * Remove `load_share` from LightParams.Config Prevent div. by 0 * Add LEECHER_COUNT_FACTOR * PR Grumble: u64 to u32 for f64 casting * Prevent u32 overflow for avg_peer_count * Add tests for LightSync::Statistics * Fix empty steps (#9939) * Don't send empty step twice or empty step then block. * Perform basic validation of locally sealed blocks. * Don't include empty step twice. 
* prevent silent errors in daemon mode, closes #9367 (#9946) * Fix a deadlock (#9952) * Update informant: - decimal in Mgas/s - print every 5s (not randomly between 5s and 10s) * Fix dead-lock in `blockchain.rs` * Update locks ordering * Fix light client informant while syncing (#9932) * Add `is_idle` to LightSync to check importing status * Use SyncStateWrapper to make sure is_idle gets updates * Update is_major_import to use verified queue size as well * Add comment for `is_idle` * Add Debug to `SyncStateWrapper` * `fn get` -> `fn into_inner` * ci: rearrange pipeline by logic (#9970) * ci: rearrange pipeline by logic * ci: rename docs script * fix docker build (#9971) * Deny unknown fields for chainspec (#9972) * Add deny_unknown_fields to chainspec * Add tests and fix existing one * Remove serde_ignored dependency for chainspec * Fix rpc test eth chain spec * Fix starting_nonce_test spec * Improve block and transaction propagation (#9954) * Refactor sync to add priority tasks. * Send priority tasks notifications. * Propagate blocks, optimize transactions. * Implement transaction propagation. Use sync_channel. * Tone down info. * Prevent deadlock by not waiting forever for sync lock. * Fix lock order. * Don't use sync_channel to prevent deadlocks. * Fix tests. * Fix unstable peers and slowness in sync (#9967) * Don't sync all peers after each response * Update formating * Fix tests: add `continue_sync` to `Sync_step` * Update ethcore/sync/src/chain/mod.rs Co-Authored-By: ngotchac <[email protected]> * fix rpc middlewares * fix Cargo.lock * json: resolve merge in spec * rpc: fix starting_nonce_test * ci: allow nightl job to fail

ngotchac added 5 commits May 16, 2018 14:25

Rename db_restore => client

739212d

First step: make it compile!

4028d5f

Second step: working implementation!

5bdbe0e

Refactoring

c3719d2

Fix tests

c2cd1c6

ngotchac requested review from tomusdrw and rphmeier May 16, 2018 14:00

rphmeier reviewed May 16, 2018

PR Grumbles

2581509

5chdn added this to the 1.12 milestone May 16, 2018

ngotchac added 3 commits May 17, 2018 12:12

Merge branch 'master' into ng-keep-ancient-blocks

bda7b4d

PR Grumbles WIP

0713f7f

Migrate ancient blocks interating backward

cc0e6ba

ngotchac removed the A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. label May 17, 2018

ngotchac added 2 commits May 18, 2018 17:50

Merge branch 'master' into ng-keep-ancient-blocks

074cc47

Early return in block migration if snapshot is aborted

2613611

debris previously requested changes May 25, 2018

5chdn closed this Oct 2, 2018

Merge branch 'master' into ng-keep-ancient-blocks

b114c65

ngotchac reopened this Oct 2, 2018

ngotchac removed the A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label Oct 2, 2018

Merge branch 'master' into ng-keep-ancient-blocks

1101055

5chdn added the B9-blocker 🚧 This pull request blocks the next release from happening. Use only in extreme cases. label Oct 26, 2018

5chdn modified the milestones: 2.2, 2.3 Oct 29, 2018

tomusdrw approved these changes Oct 29, 2018

ngotchac added 5 commits October 30, 2018 13:59

Merge branch 'master' into ng-keep-ancient-blocks

9775ee7

Release RwLock earlier

e490a89

Merge branch 'master' into ng-keep-ancient-blocks

1242628

Revert Cargo.lock

cc99b16

Update _update ancient block_ logic: set local in commit

835d0d9

tomusdrw added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Nov 9, 2018

tomusdrw approved these changes Nov 9, 2018

ascjones approved these changes Nov 14, 2018

Update typo in ethcore/src/snapshot/service.rs

1f43ce8

Co-Authored-By: ngotchac <[email protected]>

5chdn merged commit 9475a2e into master Nov 17, 2018

5chdn deleted the ng-keep-ancient-blocks branch November 17, 2018 23:06

ngotchac mentioned this pull request Nov 22, 2018

Fix a deadlock #9952

Merged

5chdn added the B1-patch-beta 🕷🕷 label Nov 26, 2018

5chdn mentioned this pull request Nov 27, 2018

Backports for beta 2.2.2 #9976

Merged

12 tasks
