Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Treat blocks, that already downloaded, as synced #10864

Closed
wants to merge 30 commits into from

Conversation

grbIzl
Copy link
Collaborator

@grbIzl grbIzl commented Jul 8, 2019

This PR aims to fix the following scenario in the block sync:

  1. Node downloads the next block number N and starts importing it. Due to block's complexity or some configuration issues block's import takes more time (I've seen this issue, when import took 5-10 seconds).
  2. Meanwhile the next sync round has begun and Parity tries to sync block N once more time, obviously import of this block has no success with error AlreadyQueued. As a result the sync round has completed with 0 blocks synced
  3. As no new blocks were synced during sync round Parity starts descending in block number, treating this situation as error during blocks download: https://github.com/paritytech/parity-ethereum/blob/master/ethcore/sync/src/block_sync.rs#L429 and repeats sync for (already downloaded) blocks
  4. Attempt to sync block number N-1 returns the same error and causes attempt to download block N-2 and so on
  5. This downfall only finishes, when (if) block number N is imported and Parity catches up the current best block
  6. As a result huge DDOS of GetBlock requests outcomes from the node. This also can affect current block's verification and exacerbates the issue.

Suggested solution:
Treat downloaded blocks on step 2 and further, that exist already in the chain or in the queue, as actually synced on this round and (as result) do not start error handling procedure with blocks descending.

Comments:

soc1c and others added 9 commits April 2, 2019 09:37
* Reject crazy timestamps instead of truncating.

* fix(light cull): poll light cull instead of timer (#10559)

* fix(light cull): poll light cull instead of timer

* fix(grumbles): remove error + updated docs

* fix(on-demand request): `expect()` reason

* docs(remove misleading info)
* version: bump beta to 2.5.1

* fix(whisper expiry): current time + work + ttl (#10587)

* update bootnodes (#10595)

* config: update goerli bootnodes

* config: update kotti bootnodes

* adds rpc error message for --no-ancient-blocks (#10608)

* adds error message for --no-ancient-blocks, closes #10261

* Apply suggestions from code review

Co-Authored-By: seunlanlege <[email protected]>

* Constantinople HF on POA Core (#10606)

* Constantinople HF on POA Core

Plan Constantinople/St.Petersfork HF on POA Core network at block 8582254.
Original PR in POA repository: poanetwork/poa-chain-spec#110

* Remove extra empty line

* evm: add some mulmod benches (#10600)

* evm: add blockhash_mulmod bench

* evm: use num-bigint for mod ops

* Clique: zero-fill extradata when the supplied value is less than 32 bytes in length (#10605)

* Update kovan.json to switch validator set to POA Consensus Contracts (#10628)

* Fix publish docs (#10635)

* Fix publish docs

* this never should be forced, either way compiling previous versions will produce outdated docs

* fix array, var was moved to the group project global variables list

* Fix rinkeby petersburg fork (#10632)
* ci: publish docs debug (#10638)

* ci: backport missing diff from master
* version: bump beta to 2.5.2

* [CI] allow cargo audit to fail (#10676)

* [CI] allow cargo audit to fail

* [.gitlab-ci.yml] add a comment about cargo audit

* [Cargo.lock] cargo update -p protobuf

* Reset blockchain properly (#10669)

* delete BlockDetails from COL_EXTRA

* better proofs

* added tests

* PR suggestions

* new image (#10673)

* Update publishing (#10644)

* docker images are now built on k8s: test run

* copy check_sync.sh in build-linux job

* copy scripts/docker/hub/* in build-linux job

* removed cache var

* cleanup, no more nightly dockers

* cleanup in dockerfile

* some new tags

* removed sccsche debug log, cleanup

* no_gits, new artifacts dir, changed scripts. Test run.

* define version once

* one source for TRACK

* stop kovan onchain updates

* moved changes for two images to a new branch

* rename Dockerfile

* no need in libudev-dev

* enable lto for release builds (#10717)

* Use RUSTFLAGS to set the optimization level (#10719)

* Use RUSTFLAGS to set the optimization level

Cargo has a [quirk]() in how configuration settings are propagated when `cargo test` runs: local code respect the settings in `[profile.test]` but all dependencies use the `[profile.dev]` settings. Here we force `opt-level=3` for all dependencies.

* Remove unused profile settings

* Maybe like this?

* Turn off incremental compilation

* Remove colors; try again with overflow-checks on

* Use quiet CI machine

* Turn overflow checking back on

* Be explicit about what options we use

* Remove "quiet machine" override

* ethcore: enable ECIP-1054 for classic (#10731)

* config: enable atlantis on ethereum classic

* config: enable atlantis on morden classic

* config: enable atlantis on morden classic

* config: enable atlantis on kotti classic

* ethcore: move kotti fork block to 0xAEF49

* ethcore: move morden fork block to 0x4829BA

* ethcore: move classic fork block to 0x81B320

* remove trailing comma

* remove trailing comma

* fix chainspec

* ethcore: move classic fork block to 0x7fffffffffffffff
* ethcore/res: activate atlantis classic hf on block 8772000 (#10766)

* fix docker tags for publishing (#10741)

* fix: aura don't add `SystemTime::now()` (#10720)

This commit does the following:
- Prevent overflow in `verify_timestamp()` by not adding `now` to found faulty timestamp
- Use explicit `CheckedSystemTime::checked_add` to prevent potential consensus issues because SystemTime is platform
depedent
- remove `#[cfg(not(time_checked_add))]` conditional compilation

* Update version

* Treat empty account the same as non-exist accounts in EIP-1052 (#10775)

* DevP2p: Get node IP address and udp port from Socket, if not included in PING packet (#10705)

* get node IP address and udp port from Socket, if not included in PING packet

* prevent bootnodes from being added to host nodes

* code corrections

* code corrections

* code corrections

* code corrections

* docs

* code corrections

* code corrections

* Apply suggestions from code review

Co-Authored-By: David <[email protected]>

* Add a way to signal shutdown to snapshotting threads (#10744)

* Add a way to signal shutdown to snapshotting threads

* Pass Progress to fat_rlps() so we can abort from there too.

* Checking for abort in a single spot

* Remove nightly-only weak/strong counts

* fix warning

* Fix tests

* Add dummy impl to abort snapshots

* Add another dummy impl for TestSnapshotService

* Remove debugging code

* Return error instead of the odd Ok(())
Switch to AtomicU64

* revert .as_bytes() change

* fix build

* fix build maybe
* cargo update -p smallvec (#10822)

Fixes servo/rust-smallvec#148

* Update version to v2.5.3

Signed-off-by: Martin Pugh <[email protected]>
@grbIzl grbIzl added M4-core ⛓ Core client code / Rust. A0-pleasereview 🤓 Pull request needs code review. A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. labels Jul 8, 2019
@grbIzl
Copy link
Collaborator Author

grbIzl commented Jul 8, 2019

I put this PR to onice in order to perform more testing. Please do not merge it before it's done.

@dvdplm
Copy link
Collaborator

dvdplm commented Jul 8, 2019

@tomusdrw Can you take a look at this please? What are the consequences of considering blocks that are currently getting processed as imported?

@grbIzl
Copy link
Collaborator Author

grbIzl commented Jul 9, 2019

This change breaks fork's handling logic in block sync. More accurate approach is required

@grbIzl grbIzl added A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. and removed A0-pleasereview 🤓 Pull request needs code review. labels Jul 9, 2019
Copy link
Collaborator

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Node downloads the next block number N and starts importing it. Due to block's complexity or some configuration issues block's import takes more time (I've seen this issue, when import took 5-10 seconds).
Meanwhile the next sync round has begun and Parity tries to sync block N once more time, obviously import of this block has no success with error AlreadyQueued. As a result the sync round has completed with 0 blocks synced

Since we can only import one block a the time, can't we have an atomic operation that actually makes sure that we shouldn't re-try importing it, cause it's already being imported?

Alternatively we could return a different error code than AlraedyQueued to distinguish this situation.

@grbIzl
Copy link
Collaborator Author

grbIzl commented Jul 9, 2019

Since we can only import one block a the time, can't we have an atomic operation that actually makes sure that we shouldn't re-try importing it, cause it's already being imported?

It already works so. Probably I was not able to emphasize this subtle difference between importing and syncing of blocks. Saying syncing, I meant downloading of block's header, body etc. Import happens for synced block and it is indeed atomic and we don't allow multiple re-imports of the block. The problem, that this PR addresses, is multiple re-sync of already synced blocks.

Alternatively we could return a different error code than AlraedyQueued to distinguish this situation.

I came to the nearly the same solution.

@ghost
Copy link

ghost commented Jul 11, 2019

As a general rule, Error-Codes should reflect reality. Thus, changing "AlreadyQueued" to the real error-code like e.g. "ImportOngoing", is the first step to take. Then act on this error-code, in an non-breaking manner.

Changing the logic, place atomic operations etc. should be avoided, because this can introduce unexpected behavior.

@ghost
Copy link

ghost commented Jul 12, 2019

(I've seen this issue, when import took 5-10 seconds

possibly related to #10085

s3krit and others added 7 commits August 12, 2019 18:55
  -  Fix cargo audit (#10921)
  - Add support for Energy Web Foundation's new chains (#10957)
  - Kaspersky AV whitelisting (#10919)
  - Avast whitelist script (#10900)
  - Docker images renaming (#10863)
  - Remove excessive warning (#10831)
  - Allow --nat extip:your.host.here.org (#10830)
  - When updating the client or when called from RPC, sleep should mean sleep (#10814)
  - added new ropsten-bootnode and removed old one (#10794)
  - ethkey no longer uses byteorder (#10786)
  - Do not drop the peer with None difficulty (#10772)
  - docs: Update Readme with TOC, Contributor Guideline. Update Cargo package descriptions (#10652)
* [trace] check mem diff within range (#11002)

* Update version (v2.5.7-stable)
* add more tx tests (#11038)
* Fix parallel transactions race-condition (#10995)
* Add blake2_f precompile (#11017)
* [trace] introduce trace failed to Ext (#11019)
* Edit publish-onchain.sh to use https (#11016)
* Fix deadlock in network-devp2p (#11013)
* EIP 1108: Reduce alt_bn128 precompile gas costs (#11008)
* xDai chain support and nodes list update (#10989)
* EIP 2028: transaction gas lowered from 68 to 16 (#10987)
* EIP-1344 Add CHAINID op-code (#10983)
* manual publish jobs for releases, no changes for nightlies (#10977)
* [blooms-db] Fix benchmarks (#10974)
* Verify transaction against its block during import (#10954)
* Better error message for rpc gas price errors (#10931)
* tx-pool: accept local tx with higher gas price when pool full (#10901)
* Fix fork choice (#10837)
* Cleanup unused vm dependencies (#10787)
* Fix compilation on recent nightlies (#10991)
* EIP 1884 Re-pricing of trie-size dependent operations  (#10992)
* Implement EIP-1283 reenable transition, EIP-1706 and EIP-2200  (#10191)
* ethcore/res: activate Istanbul on Ropsten, Görli, Rinkeby, Kovan (#11068)

* ethcore/res: activate Istanbul on Ropsten block 6485846

* ethcore/res: activate Istanbul on Goerli block 1561651

* ethcore/res: use hex values for Istanbul specs

* ethcore/res: fix trailing comma

* ethcore/res: be pedantic about EIP-1283 in Petersburg and Istanbul test specs

* ethcore/res: activate Istanbul on Rinkeby block 5435345

* ethcore/res: activate Istanbul on Kovan block 14111141

* ethcore/res: fix kovan istanbul number to 0xd751a5

* [json-spec] make blake2 pricing spec more readable (#11034)

* [json-spec] make blake2 pricing spec more readable

* [ethcore] fix compilation

* Manual backport of #11033
* Update CHANGELOG.md and version
parity/main.rs Outdated Show resolved Hide resolved
@grbIzl
Copy link
Collaborator Author

grbIzl commented Nov 15, 2019

PR is verified. Closing it in favor of the newly created #11264 with clear history.

@grbIzl grbIzl closed this Nov 15, 2019
@ordian ordian deleted the PreventBlocksReimport branch November 15, 2019 11:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants