This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Parity RPC hangs after intensive tx generation #10321

Closed
Pzixel opened this issue Feb 11, 2019 · 10 comments
Labels
Z3-stale 🍃 Issue is in principle valid, but it is not relevant anymore or cannot be reproduced.

Comments

@Pzixel
Contributor

Pzixel commented Feb 11, 2019

  • Parity Ethereum version: 2.0.6
  • Operating system: Linux
  • Installation: docker
  • Fully synchronized: yes
  • Network: private (POA, chain.json specification)
  • Restarted: yes

After several hours of intensive block generation, Parity hangs for no apparent reason.

On Friday night I started my batcher, which just calls a contract method in batches (50 items per batch).
Here are my logs:

2019-02-08 23:35:49.024 +03:00 [Information] "51,38%" done. Elapsed = 09:49:30.2509107.
2019-02-08 23:35:58.938 +03:00 [Information] "51,40%" done. Elapsed = 09:49:40.1648875.
2019-02-08 23:36:08.794 +03:00 [Information] "51,41%" done. Elapsed = 09:49:50.0207078.
2019-02-08 23:36:18.833 +03:00 [Information] "51,43%" done. Elapsed = 09:50:00.0597905.
2019-02-08 23:36:28.819 +03:00 [Information] "51,44%" done. Elapsed = 09:50:10.0459107.
2019-02-08 23:36:38.562 +03:00 [Information] "51,46%" done. Elapsed = 09:50:19.7880563.
2019-02-08 23:36:48.819 +03:00 [Information] "51,47%" done. Elapsed = 09:50:30.0456974.
2019-02-08 23:36:58.739 +03:00 [Information] "51,49%" done. Elapsed = 09:50:39.9653453.
2019-02-08 23:37:08.770 +03:00 [Information] "51,50%" done. Elapsed = 09:50:49.9966959.
2019-02-08 23:37:18.834 +03:00 [Information] "51,52%" done. Elapsed = 09:51:00.0605550.
2019-02-08 23:37:29.025 +03:00 [Information] "51,53%" done. Elapsed = 09:51:10.2512043.
2019-02-08 23:37:38.840 +03:00 [Information] "51,55%" done. Elapsed = 09:51:20.0662434.
2019-02-08 23:37:48.780 +03:00 [Information] "51,56%" done. Elapsed = 09:51:30.0061202.
2019-02-08 23:37:58.858 +03:00 [Information] "51,57%" done. Elapsed = 09:51:40.0840885.
2019-02-09 00:10:17.189 +03:00 [Information] "51,59%" done. Elapsed = 10:23:58.4150452.
2019-02-09 01:00:36.062 +03:00 [Information] "51,60%" done. Elapsed = 11:14:17.2886432.
2019-02-09 01:52:39.829 +03:00 [Information] "51,62%" done. Elapsed = 12:06:21.0557069.
2019-02-09 02:19:17.703 +03:00 [Information] "51,63%" done. Elapsed = 12:32:58.9298274.
2019-02-09 03:05:46.144 +03:00 [Information] "51,65%" done. Elapsed = 13:19:27.3706997.
2019-02-09 03:50:14.702 +03:00 [Information] "51,66%" done. Elapsed = 14:03:55.9286172.
2019-02-09 04:42:49.453 +03:00 [Information] "51,68%" done. Elapsed = 14:56:30.6793864.
2019-02-09 06:10:58.173 +03:00 [Information] "51,69%" done. Elapsed = 16:24:39.3998215.
2019-02-09 06:17:52.391 +03:00 [Information] "51,71%" done. Elapsed = 16:31:33.6170246.
2019-02-09 06:58:17.406 +03:00 [Information] "51,72%" done. Elapsed = 17:11:58.6323074.
2019-02-09 07:30:30.453 +03:00 [Information] "51,74%" done. Elapsed = 17:44:11.6792151.
2019-02-09 09:00:30.000 +03:00 [Information] "51,75%" done. Elapsed = 19:14:11.2265639.
2019-02-09 11:05:01.001 +03:00 [Information] "51,77%" done. Elapsed = 21:18:42.2271033.

As you can see, it worked fine for almost 10 hours. My block time is 5 seconds, so it took about 10 seconds (two blocks) to insert an entire batch.

Then things get weirder, because now it takes almost an hour to insert the same amount of data into the blockchain (I use a "retry forever" policy with a 30-second wait between calls).
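
For context, the 5-second block time comes from the AuRa step duration in the chain spec. A minimal sketch of the relevant part of such a chain.json (the actual spec in the playground repo may differ, and the validator address below is only a placeholder):

{
  "engine": {
    "authorityRound": {
      "params": {
        "stepDuration": "5",
        "validators": { "list": ["0x0000000000000000000000000000000000000000"] }
      }
    }
  }
}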

When I look at the Parity logs I see:

2019-02-11 10:01:39 UTC  DEBUG txqueue  [0xf4a91e7363d5565ee7066105061d925f305df2564de6de9aedbe2106bf9c9aa6] Sender: 0x006d…6ecd, nonce: 473022, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 1092))
2019-02-11 10:01:40 UTC IO Worker #3 DEBUG txqueue  Re-computing pending set for block: 24997
2019-02-11 10:01:40 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:01:41 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:01:41 UTC  DEBUG txqueue  [0x099a391efb83c6bce2e146ca26ad3f04364e03b5d2b87762a1612ab0ccb592d2] Added to the pool.
2019-02-11 10:01:41 UTC  DEBUG txqueue  [0x099a391efb83c6bce2e146ca26ad3f04364e03b5d2b87762a1612ab0ccb592d2] Sender: 0x006d…6ecd, nonce: 473023, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 1220))
2019-02-11 10:01:41 UTC  DEBUG txqueue  [0x1cbb7ec03eb3ced7902e316ee021fc851516119914653b4c69b85dd0ce94e294] Added to the pool.
2019-02-11 10:01:41 UTC  DEBUG txqueue  [0x1cbb7ec03eb3ced7902e316ee021fc851516119914653b4c69b85dd0ce94e294] Sender: 0x006d…6ecd, nonce: 473024, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 1220))
2019-02-11 10:01:45 UTC IO Worker #1 DEBUG txqueue  Re-computing pending set for block: 24997
2019-02-11 10:02:00 UTC IO Worker #3 INFO import     0/25 peers    195 KiB chain   65 MiB db  0 bytes queue 448 bytes sync  RPC:  0 conn,  106 req/s,  164 µs
2019-02-11 10:02:02 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:02:03 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:02:03 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x1091eda8d977494754b639254d9983ecec0271915dbd52f5d5143f55e1990e66] Added to the pool.
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x1091eda8d977494754b639254d9983ecec0271915dbd52f5d5143f55e1990e66] Sender: 0x006d…6ecd, nonce: 473025, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 996))
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x6249bbd30eeb3cdcb59e650f2cb50bf0aeab6d390de4ec5e6d29715a483f5441] Added to the pool.
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x6249bbd30eeb3cdcb59e650f2cb50bf0aeab6d390de4ec5e6d29715a483f5441] Sender: 0x006d…6ecd, nonce: 473026, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 644))
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x29774ecf14b97aab20c8850793bb6a950c83ed73296bd8fbe7edb0a8878f65d3] Added to the pool.
2019-02-11 10:02:03 UTC  DEBUG txqueue  [0x29774ecf14b97aab20c8850793bb6a950c83ed73296bd8fbe7edb0a8878f65d3] Sender: 0x006d…6ecd, nonce: 473027, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 644))
2019-02-11 10:02:03 UTC  WARN parity_rpc::v1::impls::eth  Fallback to BlockId::Latest
2019-02-11 10:02:04 UTC  DEBUG txqueue  [0x4e2f47d80c232dfc447e2db5da65ea7dbdffaa3eedb187a7c9114662b3e45344] Added to the pool.
2019-02-11 10:02:04 UTC  DEBUG txqueue  [0x4e2f47d80c232dfc447e2db5da65ea7dbdffaa3eedb187a7c9114662b3e45344] Sender: 0x006d…6ecd, nonce: 473028, gasPrice: 0, gas: 1000000000, value: 0, dataLen: 420))
2019-02-11 10:02:05 UTC IO Worker #0 DEBUG txqueue  Re-computing pending set for block: 24997

Nothing more appears: Parity writes these lines and, after 20 seconds, my client fails with an RPC timeout. Then the client retries, I get another bunch of log lines, Parity again writes nothing more, and so on.

I have restarted all nodes, tried leaving a single node running to rule out desync, and double-checked the system time. I don't know what else I can do.
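
For reference, one way to inspect the node while it is stuck is to query the transaction queue and the chain head directly over JSON-RPC. A hedged sketch, assuming the RPC server listens on localhost:8545 and the `parity` API set is enabled:

# List transactions currently sitting in the pending queue.
curl -s -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"parity_pendingTransactions","params":[],"id":1}' \
  http://localhost:8545

# Compare the chain head with the block number the txqueue keeps re-computing for.
curl -s -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":2}' \
  http://localhost:8545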

Here is the code that gets called:

// Looks up the request by service number and appends a status update to it.
function updateStatus(bytes30 serviceNumber, uint64 responseDate, uint64 statusCode, string memory note) public onlyOwner {
	int index = getRequestIndex(serviceNumber);
	Request storage request = nodes[uint64(index)].request;
	updateStatusInternal(request, responseDate, statusCode, note);
}

function updateStatusInternal(Request storage request, uint64 responseDate, uint64 statusCode, string memory note) private {
	// Push the new status update and append its note to the concatenated notes string.
	request.statusUpdates.push(Types.StatusUpdate(responseDate, statusCode, note));
	request.statusUpdatesNotes = strConcat(request.statusUpdatesNotes, "\x1f", note);
}

// Returns the index of the request in `nodes`, or -1 if it is not present.
function getRequestIndex(bytes30 serviceNumber) public view returns (int) {
	uint64 index = requestServiceNumberToIndex[serviceNumber];

	if (index == 0 && (nodes.length == 0 || nodes[0].request.serviceNumber != serviceNumber)) {
		return -1;
	}

	return int(index);
}

It just does a lookup in a mapping and then pushes one item onto an array. Nothing complicated that should take hours to complete.


I have provided the logs and all the information I possess. If anything else is needed, I'll gladly provide any additional details.

@jam10o-new
Contributor

Hey, I didn't see your version number in chat - can you please try with the latest stable? If not, what's stopping you from doing so?

jam10o-new added the Z3-stale 🍃 label on Feb 11, 2019
@Pzixel
Contributor Author

Pzixel commented Feb 11, 2019

@joshua-mir No, I can't: I tried, but I ran into that problem with private keys, as you probably remember.

@jam10o-new
Contributor

#10160
Was that your issue with keys?

@Pzixel
Contributor Author

Pzixel commented Feb 11, 2019

Looks like it is.

But as we discussed, the keys are in the right place:

diff --git a/docker-compose.yml b/docker-compose.yml
index 510821d..e256111 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -13,11 +13,11 @@ services:
       --no-persistent-txqueue
     volumes:
       - ./parity/config:/parity/config:ro
-      - authority0:/root/.local/share/io.parity.ethereum/
-      - ./parity/authorities/authority0.json:/root/.local/share/io.parity.ethereum/keys/${NETWORK_NAME}/authority.json:ro
-      - ./parity/keys:/root/.local/share/io.parity.ethereum/keys/${NETWORK_NAME}
+      - authority0:/home/parity/.local/share/io.parity.ethereum/
+      - ./parity/authorities/authority0.json:/home/parity/.local/share/io.parity.ethereum/keys/${NETWORK_NAME}/authority.json:ro
+      - ./parity/keys:/home/parity/.local/share/io.parity.ethereum/keys/${NETWORK_NAME}
       - ./parity/authorities/authority0.pwd:/parity/authority.pwd:ro
-      - ./parity/node0.network.key:/root/.local/share/io.parity.ethereum/network/key:ro
+      - ./parity/node0.network.key:/home/parity/.local/share/io.parity.ethereum/network/key:ro
     ports:
       - 8545:8545
     networks:

This was your suggested change, but it didn't work.


P.S. Running https://github.com/orbita-center/parity-poa-playground/tree/feature/distr (branch feature/distr) in Docker may be the fastest way to determine what's wrong.

$ docker-compose up authority0
Recreating parity-poa-playground_authority0_1 ... done
Attaching to parity-poa-playground_authority0_1
authority0_1  | Loading config file from /parity/config/authority.toml
authority0_1  | Option '--dapps-interface' has been removed and is no longer supported.
authority0_1  | Option '--ui-interface' has been removed and is no longer supported.
authority0_1  | 2019-02-11 12:23:40 UTC Starting Parity-Ethereum/v2.2.8-stable-ec81d67-20190130/x86_64-linux-gnu/rustc1.31.1
authority0_1  | 2019-02-11 12:23:40 UTC Keys path /home/parity/.local/share/io.parity.ethereum/keys/parity-poa-playground
authority0_1  | 2019-02-11 12:23:40 UTC DB path /home/parity/.local/share/io.parity.ethereum/chains/parity-poa-playground/db/d24f9ddc2dd71e78
authority0_1  | 2019-02-11 12:23:40 UTC State DB configuration: fast
authority0_1  | 2019-02-11 12:23:40 UTC Operating mode: active
authority0_1  | Consensus signer account not found for the current chain. You can create an account via RPC, UI or `parity account new --chain /parity/config/chain.json --keys-path /home/parity/.local/share/io.parity.ethereum/keys`.
parity-poa-playground_authority0_1 exited with code 1

@jam10o-new
Contributor

I don't know how we can address the issue (if it isn't already addressed) if you can't update to a version where the fix has been applied, though 😥

@Pzixel
Contributor Author

Pzixel commented Feb 11, 2019

I'm not sure why you think it was fixed in the last few months; I've been working with Parity for almost a year and it has never worked that way - it always comes down to some odd parameters and startup arguments.

I'm open to a dialogue: I can fix the issue and try to migrate to a newer version if you tell me what to do. I tried my best, but it didn't work.

When I fix the paths (/root/.local/share -> /home/parity/.local/share) I get:

Alex@DESKTOP-CA0EFN5 MINGW64 ~/Documents/Repo/parity-poa-playground (feature/distr)
$ docker-compose up authority0
Recreating parity-poa-playground_authority0_1 ... done
Attaching to parity-poa-playground_authority0_1
authority0_1  | Loading config file from /parity/config/authority.toml
authority0_1  | Error upgrading parity data: CannotWriteVersionFile
parity-poa-playground_authority0_1 exited with code 1

Looks like #9791 (comment), but I don't see how to solve it.

I tried

volumes:
  authority0:
    driver_opts: 
      o: uid=1000,gid=1000

But I get

Creating parity-poa-playground_authority0_1 ... error

ERROR: for parity-poa-playground_authority0_1  Cannot start service authority0: error while mounting volume '/var/lib/docker/volumes/parity-poa-playground_authority0/_data': missing device in volume options

ERROR: for authority0  Cannot start service authority0: error while mounting volume '/var/lib/docker/volumes/parity-poa-playground_authority0/_data': missing device in volume options
Encountered errors while bringing up the project.
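
For what it's worth, the "missing device" error seems to come from the local volume driver requiring type and device entries alongside o. A hedged sketch that satisfies the syntax (a tmpfs volume, which would not persist the chain DB, so it only illustrates the shape):

volumes:
  authority0:
    driver: local
    driver_opts:
      type: tmpfs       # the local driver needs an explicit type and device
      device: tmpfs
      o: uid=1000,gid=1000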

@Pzixel
Copy link
Contributor Author

Pzixel commented Feb 11, 2019

I talked a bit with my colleagues; they say the permissions I have outside the container, on the host machine, do matter. Sounds weird, but they are quite convincing.

@jam10o-new
Contributor

At least for the image we provide, the directories and files outside need to have permissions set so that the user within the container can access them.

I don't know how that's related to your issue though, especially since that isn't the case as far as I'm aware with your container.

@Pzixel
Contributor Author

Pzixel commented Feb 11, 2019

> I don't know how that's related to your issue though, especially since that isn't the case as far as I'm aware with your container.

I'm trying to migrate to the latest Parity to confirm that it behaves the same way (i.e. that it still doesn't work) on 2.2.8, because, as you said, "We won't learn about this issue until you upgrade".

> At least for the image we provide, the directories and files outside need to have permissions set so that the user within the container can access them.

In my environment I'm running Docker on Windows, so I don't have any manageable permissions on my host machine.


I'll probably try to work around this by forcing the container to run as root, but I hope there's another option.

Maybe I've spammed this thread a lot and should create another issue once I'm done with the upgrade? I just want this problem to be solved, but I don't see how I can do that.

@Pzixel
Contributor Author

Pzixel commented Feb 12, 2019

Fixed the migration with the following PR: https://github.com/orbita-center/parity-poa-playground/pull/19

In a nutshell: I created a new image on top of yours and chowned all the folders instead of mounting them.
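
A minimal sketch of that approach (hypothetical tag and paths; the actual PR may differ), assuming the official image's non-root user is parity:

FROM parity/parity:v2.2.8
USER root
# Bake the config and keys into the image and hand ownership to the in-image
# user, so permissions on the (Windows) host no longer matter.
COPY parity/config /parity/config
COPY parity/keys /home/parity/.local/share/io.parity.ethereum/keys/parity-poa-playground
RUN chown -R parity /home/parity/.local/share/io.parity.ethereum /parity/config
USER parity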

I'm going to create a new issue instead of continuing in this one, because it has gotten a bit spammy.
