2.2.2-beta segfaulting in libpthread and corrupting DB when started as daemon #9991
Comments
I just found the following messages in dmesg/syslog:
Just for clarification:
systemd service file /etc/systemd/system/parity-ethereum.service for reference:
Starting 2.2.1 (or 2.2.2 -- same result) again it looks like I'm now stuck at block #6097606
Ended up copying the complete blockchain from a 2.1.6-stable client to get out of this mess.
More Segfaults
Dumped a clean copy of ~/.local/share/io.parity.ethereum from a 2.1.6-stable node to my test machine.
No Problems: Running parity in foreground
Running parity via
Segfaults: Running parity as daemon
Running parity via
WTF?
Why is parity printing debug messages to the console when it's instructed to fork to the background? It didn't do this in 2.2.1. At
After that a second parity process (PID 2047) is still running (in background) but does not consume CPU or IO nor is it writing to the logfile. Sending SIGTERM to that stray process results in another SEGFAULT being logged:
Something is f*cked up with threading/forking in 2.2.2
Probably from #9946? cc @seunlanlege
Thanks for the fix. I hope this doesn't sound picky, but this could have been prevented in the stable release if the change hadn't been backported to and released in beta and stable at the same time. Maybe it's better to have some fixes in at least one beta release before they are backported to the stable tree. That way my bug report against the beta release might have prevented the bug in the stable release. Just saying. I don't know whether this is feasible or not.
Thanks for your feedback. Honestly, it's a zero-sum game; we backported it to stable because it was fixing a bug. That this introduced another bug is unfortunate, but it can happen anytime. I doubt that holding back fixes for the stable branch improves stability.
UPDATE
Please see my comment #9991 (comment) for the cause of this mess. Looks like parity is segfaulting when the command line argument daemon is present.
I had 2.2.1-beta running on my test machine (where I always test new builds).
2.2.1 was being shut down via systemd, but it seems it ran into #9807:
2.2.2 was started via systemd and it seems 2.2.2 segfaulted (see #9991 (comment)) immediately (Finishing work, please wait...):
So systemd gave up because it had restarted parity too often (because parity segfaulted).
I then started parity by hand and after syncing some blocks I stopped it via ctrl+c:
According to #9991 (comment) (see the timestamps) this was not a clean shutdown either, but a segfault again.
Upon next start parity complains about block #6794995:
That's when shit hit the fan.
Upon next start parity is churning on block #6097606 (which is ~697432 lower than just a couple of seconds before):
Even eth_blockNumber is now reporting 6097606 instead of 6794995.
I think my database is f*cked right now.
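For anyone who wants to check the reported head from a script, here is a minimal sketch of an eth_blockNumber query over JSON-RPC. It assumes the HTTP endpoint listens on 127.0.0.1:8545 (adjust to whatever your config.toml says); the helper names are made up for illustration and are not part of parity.

```python
import json
import urllib.request

# Assumption: the node's JSON-RPC HTTP endpoint is reachable here.
RPC_URL = "http://127.0.0.1:8545"


def block_number_request(request_id=1):
    """Build the JSON-RPC 2.0 payload for eth_blockNumber (takes no params)."""
    return {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": request_id}


def parse_block_number(response):
    """eth_blockNumber returns a hex-encoded quantity, e.g. "0x5d0ac6"."""
    return int(response["result"], 16)


def fetch_block_number(url=RPC_URL):
    """POST the request to the node and return the block number as an int."""
    data = json.dumps(block_number_request()).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_block_number(json.load(resp))


def rollback_distance(before, after):
    """How many blocks the reported head moved backwards (0 if it did not)."""
    return max(before - after, 0)
```

Calling fetch_block_number() before and after a restart and passing both values to rollback_distance() shows how far the head jumped back. With the two numbers above, rollback_distance(6794995, 6097606) gives 697389.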
So is this a regression in 2.2.2 or a result of 2.2.1 not being shut down cleanly because of #9807?
config.toml