This repository has been archived by the owner on May 18, 2020. It is now read-only.

If failsafe JVM crashes, ES remains running #44

Open
cstamas opened this issue Jan 11, 2018 · 4 comments

Comments

@cstamas

cstamas commented Jan 11, 2018

When failsafe crashes (the JVM exits forcefully), the ES child process keeps running and nothing cleans it up. This is problematic: on a CI build it occupies resources, even if ports are randomised.

@gaczm
Contributor

gaczm commented Feb 19, 2018

Hi, I think that in the case of kill -9 we won't be able to do anything about that process anyway... Do you have any suggestions for what to do in such a situation?

@cstamas
Author

cstamas commented Feb 20, 2018

What we do in our ITs: the spawned wrapper process and the spawning process "keep in touch" via pings sent over a TCP port. If the "pong" never comes back, or the port is dead, the spawner can be sure that the spawned process is dead. STOP is implemented in the same manner: the spawner sends it to the spawned process, which then performs a clean shutdown of whatever server/app it is wrapping.
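
For illustration, a minimal Java sketch of such a ping/STOP exchange. This is not the code from our ITs: the class names `ControlServer` and `ControlClient`, the one-line `PING`/`PONG`/`STOP`/`STOPPED` text protocol, and the one-request-per-connection design are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical: runs inside the spawned wrapper process. */
final class ControlServer implements Runnable {
    private final ServerSocket server;
    private final Runnable cleanShutdown; // stops the wrapped server/app

    ControlServer(Runnable cleanShutdown) {
        this.cleanShutdown = cleanShutdown;
        try {
            // Port 0 lets the OS pick a free port; listen on loopback only.
            this.server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return server.getLocalPort();
    }

    @Override
    public void run() {
        try (ServerSocket s = server) {
            while (true) {
                try (Socket client = s.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                    PrintWriter out = new PrintWriter(
                            new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                    String command = in.readLine();
                    if ("PING".equals(command)) {
                        out.println("PONG");
                    } else if ("STOP".equals(command)) {
                        cleanShutdown.run();      // shut down first, then confirm
                        out.println("STOPPED");
                        return;                   // closes the listening socket
                    }
                }
            }
        } catch (IOException e) {
            // Listening socket closed or a connection broke: stop serving.
        }
    }
}

/** Hypothetical: runs inside the spawner. */
final class ControlClient {
    /** Returns the reply line, or null if the port is dead or the reply timed out. */
    static String send(int port, String command, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);
            PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            out.println(command);
            return in.readLine();
        } catch (IOException e) {
            return null;
        }
    }

    static boolean isAlive(int port, int timeoutMillis) {
        return "PONG".equals(send(port, "PING", timeoutMillis));
    }
}
```

The spawner would call `ControlClient.isAlive(port, timeout)` periodically and `ControlClient.send(port, "STOP", timeout)` at teardown. A real implementation would also need a read timeout on the server side so that a silent client cannot block it.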

Similar logic may be used in the spawned process: if nobody sends pings, there is no one to send pong responses to, so the spawner has died and the process itself should go away.
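
A rough sketch of that self-termination check in Java. The `PingWatchdog` class, its timeout handling, and the `onSpawnerLost` callback are hypothetical names chosen for this example, not part of any existing library.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical watchdog for the spawned process: fires once when pings stop arriving. */
final class PingWatchdog {
    private final AtomicLong lastPingNanos = new AtomicLong(System.nanoTime());
    private final long timeoutNanos;
    private final Runnable onSpawnerLost;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "ping-watchdog");
                t.setDaemon(true); // never keeps the JVM alive by itself
                return t;
            });

    PingWatchdog(long timeoutMillis, Runnable onSpawnerLost) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.onSpawnerLost = onSpawnerLost;
        // Check a few times per timeout window.
        long checkEveryMillis = Math.max(1, timeoutMillis / 4);
        scheduler.scheduleAtFixedRate(this::check, checkEveryMillis, checkEveryMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Call this whenever a ping from the spawner is received. */
    void pingReceived() {
        lastPingNanos.set(System.nanoTime());
    }

    private void check() {
        if (System.nanoTime() - lastPingNanos.get() > timeoutNanos) {
            scheduler.shutdown();  // cancels further periodic checks, so this fires once
            onSpawnerLost.run();
        }
    }
}
```

In the spawned process, `onSpawnerLost` would typically stop the wrapped server and then exit, and the ping handler would call `pingReceived()` on every ping. The timeout should be comfortably larger than the spawner's ping interval, so that a GC pause or a busy CI machine does not trigger a false alarm.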

This matters most on CI, where accumulated dangling processes may suffocate the machine while doing nothing (they just stay alive, as nothing shuts them down).

@gaczm
Contributor

gaczm commented Feb 21, 2018

Ok, but two questions:

  1. Why is your JVM killed in this way (I assume kill -9; otherwise ES would be killed by the shutdown hook)?
  2. If your JVM can be killed with kill -9, then what happens when the wrapper process is killed in the same way?
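
For context on question 1: JVM shutdown hooks run on a normal exit and on SIGTERM or SIGINT, but not when the JVM is killed with SIGKILL (kill -9) or stopped via `Runtime.halt`. A generic sketch of a hook that stops a child process follows. It is not the library's actual hook; the `ChildCleanup` class and the 5-second grace period are assumptions for the example.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: stops a child process when the JVM shuts down normally. */
final class ChildCleanup {
    /** Builds the hook thread without registering it. */
    static Thread hookFor(Process child) {
        return new Thread(() -> {
            child.destroy(); // polite request (SIGTERM on Unix-like systems)
            try {
                if (!child.waitFor(5, TimeUnit.SECONDS)) {
                    child.destroyForcibly(); // escalate if the child ignores the request
                }
            } catch (InterruptedException e) {
                child.destroyForcibly();
                Thread.currentThread().interrupt();
            }
        }, "child-cleanup");
    }

    /** Registers the hook; it will NOT run if this JVM is killed with kill -9. */
    static void register(Process child) {
        Runtime.getRuntime().addShutdownHook(hookFor(child));
    }
}
```

Because the hook lives inside the JVM being killed, it cannot cover the kill -9 case. Any cleanup for that case has to be driven from outside the dying JVM.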

@msavy

msavy commented May 25, 2018

This is a problem in IDEs in debug mode, as well -- usually when you hit terminate/stop (e.g. Eclipse). Many of them don't execute the JVM shutdown hooks, leaving spawned processes hanging.

One possible solution I've thought of is that Embedded ES could provide an accessor for the PID in pl.allegro.tech.embeddedelasticsearch.ElasticServer. Users could then write the PID to a file and clear it when the program exits cleanly. After an unclean exit the PIDs would still be hanging around, and we could manually kill those processes at the next startup.

I've used this technique for some test-related stuff I'm working on (using reflection to get the PID). It's not beautiful, but it is very simple and works effectively.
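
A minimal sketch of the PID-file idea, assuming Java 9+ so that `Process.pid()` and `ProcessHandle` are available (on Java 8 the PID has to be obtained via reflection, as mentioned above). The `PidRegistry` class and the one-file-per-PID layout are hypothetical. The code takes a plain `long` PID because the proposed accessor does not exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical registry: one empty file per spawned process, named "<pid>.pid". */
final class PidRegistry {
    private final Path dir;

    PidRegistry(Path dir) {
        this.dir = dir;
        try {
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private Path fileFor(long pid) {
        return dir.resolve(pid + ".pid");
    }

    /** Call right after spawning the process. */
    void record(long pid) {
        if (pid <= 0) throw new IllegalArgumentException("invalid pid: " + pid);
        try {
            Files.write(fileFor(pid), new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Call after the process has been stopped cleanly. */
    void clear(long pid) {
        try {
            Files.deleteIfExists(fileFor(pid));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** PIDs recorded by earlier runs that never cleared them. */
    List<Long> leftovers() {
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                    .filter(name -> name.matches("\\d+\\.pid"))
                    .map(name -> Long.parseLong(name.substring(0, name.length() - 4)))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Call at startup. Returns how many still-running processes were signalled. */
    int killLeftovers() {
        int killed = 0;
        for (long pid : leftovers()) {
            // Never target our own JVM. Note: a stale PID may have been reused by an
            // unrelated process; this sketch does not guard against that.
            if (pid != ProcessHandle.current().pid()
                    && ProcessHandle.of(pid).map(ProcessHandle::destroyForcibly).orElse(false)) {
                killed++;
            }
            clear(pid);
        }
        return killed;
    }
}
```

The PID-reuse caveat is the main weakness of this approach. Comparing `ProcessHandle.info()` (command line, start time) against what was recorded would make it safer.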

3 participants