Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JsonRPCHttpService can silently fail to start #662

Closed
magooster opened this issue Apr 6, 2020 · 4 comments · Fixed by #714
Closed

JsonRPCHttpService can silently fail to start #662

magooster opened this issue Apr 6, 2020 · 4 comments · Fixed by #714
Labels
bug Something isn't working

Comments

@magooster
Copy link
Contributor

Steps to Reproduce (Bug)

  1. Configure node for TLS --rpc-http-tls-enabled, --rpc-http-tls-keystore-file=, --rpc-http-tls-keystore-password-file= make the password file path invalid
  2. Start the node
  3. Try to access JSON_RPC Http endpoint

Expected behavior:
Exception should be thrown and node fails to start

Actual behavior:
Node starts with no JSON-RPC Http service

Frequency:
All the time

Versions (Add all that apply)

  • Software version: 1.4.2

Additional Information

As you can see from the logs, never get a log confirming that JsonRPC service has started...

2020-04-06 23:19:03.213+01:00 | main | INFO | JsonRpcHttpService | Starting JsonRPC service on 0.0.0.0:8545
2020-04-06 23:19:03.216+01:00 | main | INFO | WebSocketService | Starting Websocket service on 0.0.0.0:8546
2020-04-06 23:19:03.244+01:00 | vert.x-eventloop-thread-1 | INFO | WebSocketService | Websocket service started and listening on 0.0.0.0:8546
2020-04-06 23:19:03.244+01:00 | main | INFO | MetricsHttpService | Starting metrics http service on 0.0.0.0:9545
2020-04-06 23:19:03.255+01:00 | main | INFO | Runner | Ethereum main loop is up.
2020-04-06 23:19:03.256+01:00 | vert.x-eventloop-thread-2 | INFO | MetricsHttpService | Metrics service started and listening on 9545:9545

netstat -nltp shows nothing listening on 8545

Looking at waitForServiceToStart in org.hyperledger.besu.Runner.java, it would seem only checking the status of the future before it is completed, any future that completes immediately is not checked.

private void waitForServiceToStart(
final String serviceName, final CompletableFuture<?> startFuture) {
while (!startFuture.isDone()) {
try {
startFuture.get(60, TimeUnit.SECONDS);
} catch (final InterruptedException e) {
Thread.currentThread().interrupt();
throw new IllegalStateException("Interrupted while waiting for service to start", e);
} catch (final ExecutionException e) {
LOG.error("Service " + serviceName + " failed to start", e);
throw new IllegalStateException(e);
} catch (final TimeoutException e) {
LOG.warn("Service {} is taking an unusually long time to start", serviceName);
}
}
}

Why not just just call Future.get(60, TimeUnit.SECONDS) without the while loop?

@shemnon
Copy link
Contributor

shemnon commented Apr 6, 2020

We in essence poll forever. Every 60 seconds we complain that it is taking too long. The log messages you have show that it isn't complete. What does the log look like after 60 seconds? Do we get an error at error level or a complaint about taking an unusually long time to start?

The logs will be at ERROR or WARN, which should show up in standard output unless you are doing something special to the log4j config.

@magooster
Copy link
Contributor Author

IMHO: You'll never get an error (warning), as the future has already completed... Future.isDone Returns true if completed in any fashion: normally, exceptionally, or via cancellation.

I can wait forever... I never see the warning in the logs.

With no issues JSON-RPC service starts almost immediately...

2020-04-07 00:59:14.349+01:00 | main | INFO | JsonRpcHttpService | Starting JsonRPC service on 0.0.0.0:8545
2020-04-07 00:59:14.429+01:00 | vert.x-eventloop-thread-1 | INFO | JsonRpcHttpService | JsonRPC service started and listening on 0.0.0.0:8545 with TLS enabled.

@MadelineMurray MadelineMurray added the bug Something isn't working label Apr 7, 2020
shemnon added a commit to shemnon/besu that referenced this issue Apr 15, 2020
It is possible for the service furture to fail exceptionally before we
first poll for waiting, so we should switch to a do/while loop so we
will get the exception re-thrown instead of ignoring it.

This may fix hyperledger#662

Signed-off-by: Danno Ferrin <[email protected]>
@shemnon
Copy link
Contributor

shemnon commented Apr 15, 2020

I think we need to go from a while to a do/while then, to give the exception a chance to throw.

AbdelStark pushed a commit that referenced this issue Apr 15, 2020
It is possible for the service furture to fail exceptionally before we
first poll for waiting, so we should switch to a do/while loop so we
will get the exception re-thrown instead of ignoring it.

This may fix #662

Signed-off-by: Danno Ferrin <[email protected]>
@MadelineMurray
Copy link
Contributor

@magooster - #714 will be included in 1.4.4 and should solve this issue. If not, let us know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants