Sync fail due to possible DB corruption #4036
Comments
Another corruption bug: #4220
Hey @melekes I haven't really tried since then. Currently, I don't have access to a powerful enough server to try again, so I will only be able to try out your suggestions in a few weeks. From my own investigation, I also suspected that the issue is in LevelDB or is memory-related.
#4630 should help too.
I'm going to close this. Please open an issue in https://github.com/syndtr/goleveldb/issues. Thank you!
Reference to a Tendermint related issue
Current Behavior
I came across an issue while running a validator node on a Tendermint-based chain.
Every so often, the node finds a mismatch in a block and crashes with:
"Corruption on data-block checksum mismatch error".
All the obvious things have been tried: deleting the DB, re-syncing, starting a new validator, creating new accounts, reinstalling dependencies, etc.
The error keeps recurring.
The affected block is different each time, and the head block that the chain has synced up to is much higher than the mismatched block.
In fact, the validator works perfectly for a while before failing.
NOTE: OFTEN the chain keeps on syncing (6 - 12 hours after) if I leave it; of course, it crashes again thereafter.
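For readers unfamiliar with this class of error, here is a minimal sketch of how a block checksum check can fail. It assumes a CRC-32C (Castagnoli) checksum stored alongside each data block, which is the general approach LevelDB-style stores take. The names blockChecksum and verifyBlock are hypothetical, and this does not reproduce goleveldb's actual on-disk format or code.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C table (assumption: the store checksums blocks with CRC-32C).
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// blockChecksum (hypothetical helper) returns the CRC-32C of a data block.
func blockChecksum(block []byte) uint32 {
	return crc32.Checksum(block, castagnoli)
}

// verifyBlock (hypothetical helper) returns an error when the checksum
// recorded at write time no longer matches the bytes read back.
func verifyBlock(block []byte, stored uint32) error {
	if got := blockChecksum(block); got != stored {
		return fmt.Errorf("checksum mismatch: stored %08x, computed %08x", stored, got)
	}
	return nil
}

func main() {
	block := []byte("example block payload")
	sum := blockChecksum(block) // recorded when the block is written

	// Reading the block back unchanged passes the check.
	fmt.Println("intact:", verifyBlock(block, sum))

	// Flip a single bit in a copy, as faulty RAM or a bad disk sector might.
	corrupted := append([]byte(nil), block...)
	corrupted[3] ^= 0x01
	fmt.Println("corrupted:", verifyBlock(corrupted, sum))
}
```

The point of the sketch is that the check only says the bytes read back differ from the bytes written. It cannot tell whether the damage came from the database code, memory, or the disk, which may be why the failing block differs from run to run.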
Expected Behavior
The chain should sync stably and continuously.
Reproduction
Not sure if it's possible to reproduce on purpose.
But it has been mentioned in one way or another in several places for other DBs, e.g. those of BTC and ETH:
Log
This is how the error itself looks when the chain crashes, although the block number can differ from time to time:
This is how the log looks after it tries to sync with the mismatch already in place:
(A different crash from the one above, but it looks exactly the same)
Additional Information
System (local machine):
Some information from Tendermint users (no one actually has a solution; I will open a similar issue in the Tendermint repository):