If failsafe JVM crashes, ES remains running #44

cstamas · 2018-01-11T16:50:59Z

When failsafe crashes (the JVM exits forcefully), the ES child process keeps running and nothing cleans it up. This is a problem on CI builds, where the orphaned process keeps occupying resources even if ports are randomised.

gaczm · 2018-02-19T21:44:01Z

Hi, I think that in the case of kill -9 we won't be able to do anything with that process anyway... Do you have any suggestions for what to do in such a situation?

cstamas · 2018-02-20T12:57:05Z

What we do in our ITs: the spawned wrapper process and the spawning process "keep in touch" via pings sent over a TCP port. Basically, if the "pong" never comes back, or the port is dead, the spawner can be sure that the spawned process is dead. STOP is implemented in the same manner: the spawner sends it to the spawned process, which then performs a clean shutdown of whatever server/app it is wrapping.

Similar logic may be used in the spawned process: if nobody sends pings, so there is no one to send pong responses to, then the spawner has died and the process itself should go away.

This matters most on CI, where accumulated dangling processes may suffocate the machine while doing nothing (they just remain active, as nothing shuts them down).
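
A minimal sketch of the spawned-process side of such a scheme, to make the idea concrete. Everything here is an assumption for illustration: the class name `ParentWatchdog`, the line-based `PING`/`PONG`/`STOP` messages, and the timeout handling are not taken from embedded-elasticsearch or from our actual ITs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical watchdog that runs inside the spawned (wrapper) process. */
final class ParentWatchdog implements Runnable {
    private final ServerSocket server;
    private final int timeoutMillis;
    private final Runnable shutdown; // stops whatever server/app is being wrapped

    ParentWatchdog(ServerSocket server, int timeoutMillis, Runnable shutdown) {
        this.server = server;
        this.timeoutMillis = timeoutMillis;
        this.shutdown = shutdown;
    }

    @Override
    public void run() {
        try {
            // Bound the wait for the spawner to connect at all.
            server.setSoTimeout(timeoutMillis);
            try (Socket socket = server.accept()) {
                // Bound the wait for each subsequent message.
                socket.setSoTimeout(timeoutMillis);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                PrintWriter out = new PrintWriter(
                        new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
                String line;
                // readLine() returns null when the spawner closes the connection.
                while ((line = in.readLine()) != null) {
                    if ("PING".equals(line)) {
                        out.println("PONG");
                    } else if ("STOP".equals(line)) {
                        break; // spawner asked for a clean shutdown
                    }
                }
            }
        } catch (IOException e) {
            // No ping within the timeout (SocketTimeoutException) or a broken
            // connection: treat the spawner as dead and fall through.
        } finally {
            // Runs on STOP, on a closed connection, and on a missed ping alike.
            shutdown.run();
        }
    }
}
```

The spawner would connect to the port, send `PING` at an interval shorter than the timeout, and treat a missing `PONG` as "spawned process is dead". Because this runs in the spawned process, it still fires when the spawner is killed with kill -9 and its shutdown hooks never run.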

gaczm · 2018-02-21T19:44:55Z

Ok, but two questions:

1. Why is your JVM killed in this way (I assume kill -9, otherwise ES would be killed by the shutdown hook)?
2. If your JVM can be killed with kill -9, then what happens when the wrapper process is killed in the same way?

msavy · 2018-05-25T11:11:08Z

This is a problem in IDEs in debug mode as well, usually when you hit terminate/stop (e.g. Eclipse). Many of them don't run the JVM shutdown hooks, leaving spawned processes hanging.

One possible solution I've thought of is that Embedded ES could provide an accessor for the PID in pl.allegro.tech.embeddedelasticsearch.ElasticServer. Users could then write the PIDs to a file and delete it when the program exits cleanly. After an unclean exit the PIDs would still be in the file, and the leftover processes could be killed at the next startup.

I've used this technique for some test-related stuff I'm working on (using reflection to get the PID). It's not beautiful, but it is very simple and works effectively.
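
A rough sketch of the PID-file idea, for illustration only. The `PidFile` class and its method names are hypothetical, and it assumes Java 9+ for `ProcessHandle`; on older JVMs the PID lookup and the kill would need reflection or an OS command instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Optional;

/** Hypothetical helper that remembers spawned PIDs across JVM runs. */
final class PidFile {
    private final Path file;

    PidFile(Path file) {
        this.file = file;
    }

    /** Append a PID so the record survives an unclean exit of this JVM. */
    void record(long pid) throws IOException {
        Files.write(file,
                (pid + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Call after a clean shutdown: nothing is left to clean up. */
    void clear() throws IOException {
        Files.deleteIfExists(file);
    }

    /** Call at startup: kill processes left by a previous run; returns how many kills were requested. */
    int killLeftovers() throws IOException {
        if (!Files.exists(file)) {
            return 0;
        }
        int killed = 0;
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            long pid = Long.parseLong(trimmed);
            // Caveat: the OS may have reused this PID for an unrelated process.
            Optional<ProcessHandle> handle = ProcessHandle.of(pid);
            if (handle.isPresent() && handle.get().isAlive()
                    && handle.get().destroyForcibly()) {
                killed++;
            }
        }
        Files.deleteIfExists(file);
        return killed;
    }
}
```

The intended flow would be `killLeftovers()` before starting ES, `record(pid)` right after it starts, and `clear()` once it has been stopped cleanly. Where the PID comes from is the open part: it depends on the accessor proposed above, which does not exist yet.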
