Implement ES daemon-mode in process launcher #701

drawlerr · 2019-06-03T14:09:32Z

Launch Elasticsearch as a daemon.

Relates to #697
Closes #718

danielmitterdorfer

Thanks for the PR! I left some high-level comments that I think will help simplify the overall approach.

esrally/mechanic/launcher.py

tests/mechanic/launcher_test.py

danielmitterdorfer · 2019-06-04T09:46:31Z

tests/mechanic/launcher_test.py

+    return p
+
+
+class ProcessLauncherTests(TestCase):


I think testing this gets much simpler once we only rely on the PID. Then we can just mock process startup and mock the check for a PID file. Once we expose a provision and a launch subcommand, it should also become much simpler to add this as an integration test in integration-test.sh. I'd imagine an integration test along these lines:

PROVISIONER_ID=$(esrally provision --distribution-version=6.8.0 --car=4gheap)
esrally start --provisioner-id=${PROVISIONER_ID}
# maybe do a check here that the node is actually reachable via HTTP
esrally stop --provisioner-id=${PROVISIONER_ID}
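The unit-test side of this suggestion (rely only on the PID file, mock process startup and the PID file check) could look roughly like the sketch below. It is not the code from this PR: `wait_for_pidfile`, `start_node` and the test class are hypothetical names, and the launcher takes its collaborators as parameters purely so that they are easy to replace with mocks. The `-d` (daemonize) and `-p` (PID file) flags are the ones Elasticsearch itself accepts.

```python
import os
import subprocess
import time
import unittest
from unittest import mock


def wait_for_pidfile(pidfile, timeout=60, poll_interval=0.5):
    """Hypothetical helper: poll until `pidfile` exists and holds a PID, then return it."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(pidfile):
            with open(pidfile) as f:
                content = f.read().strip()
            if content:
                return int(content)
        time.sleep(poll_interval)
    raise TimeoutError("pid file [%s] did not appear within %s seconds" % (pidfile, timeout))


def start_node(cmd, pidfile, popen=subprocess.Popen, wait=wait_for_pidfile):
    """Hypothetical launcher: start the daemon, then rely only on the PID file."""
    popen(cmd, close_fds=True)
    return wait(pidfile)


class StartNodeTests(unittest.TestCase):
    def test_returns_pid_reported_by_pidfile(self):
        # Neither a real process nor a real PID file is needed here.
        popen = mock.Mock()
        wait = mock.Mock(return_value=1234)
        cmd = ["bin/elasticsearch", "-d", "-p", "./pid"]

        self.assertEqual(1234, start_node(cmd, "./pid", popen=popen, wait=wait))
        popen.assert_called_once_with(cmd, close_fds=True)
        wait.assert_called_once_with("./pid")
```

With both collaborators mocked, the test only checks the wiring: the command is started once and the PID returned is whatever the PID file check reports.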

danielmitterdorfer

Thanks for iterating! I left a few comments but the overall direction looks good to me.

danielmitterdorfer · 2019-06-07T10:20:30Z

esrally/mechanic/launcher.py

-                                       stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.DEVNULL), node_name)
+    def _start_process(self, cmd):
+        subprocess.Popen(shlex.split(cmd), close_fds=True)
+        #TODO:  wait_for_docker_pidfile?


@dliappis can we assume that if docker-compose returns (with exit code 0), Elasticsearch is about to start up (i.e. no need to wait for the PID file)?

dliappis · Jun 7, 2019

I am afraid it's an "it depends" answer :)

With docker-compose it's possible to have healthchecks and, depending on how they are defined, we can safely assume ES is actually listening; this is done, for example, in

rally/docker/docker-compose-tests.yml

Lines 26 to 30 in 69cb36d

healthcheck:
  test: curl -f http://localhost:9200
  interval: 5s
  timeout: 2s
  retries: 10

and the other service, which can be a sidecar, depends on it.

danielmitterdorfer · Jun 11, 2019

@dliappis I think this is entirely in our control, as the only way to start the Docker image is via https://github.com/elastic/rally/blob/master/esrally/resources/docker-compose.yml. So we cannot assume it is already listening (which is ok), but I wonder whether we can assume that the Elasticsearch process has been started already?
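To make the healthcheck idea above concrete: if the compose file defines a healthcheck, a launcher could poll the container's health through the Docker CLI rather than waiting for a PID file. The following is only a sketch under that assumption and not the implementation in this PR. `container_health` and `wait_until_healthy` are hypothetical names; the only real interface used is `docker inspect --format '{{.State.Health.Status}}'`, which reports `starting`, `healthy` or `unhealthy` for containers that have a healthcheck.

```python
import subprocess
import time


def container_health(container_id, run=subprocess.run):
    """Return Docker's health status for a container, or None if it cannot be determined."""
    result = run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container_id],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, universal_newlines=True)
    # A non-zero exit code means the container is unknown or has no healthcheck.
    return result.stdout.strip() if result.returncode == 0 else None


def wait_until_healthy(container_id, timeout=60, poll_interval=1.0,
                       health=container_health, sleep=time.sleep):
    """Poll until the container reports "healthy"; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if health(container_id) == "healthy":
            return True
        sleep(poll_interval)
    return False
```

Whether "healthy" implies that Elasticsearch is listening depends entirely on how the healthcheck is defined, which is the "it depends" part of the answer above.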

esrally/mechanic/launcher.py

setup.py

danielmitterdorfer

I left a few comments around _start_process. Also, we usually try to avoid force-pushing during the review process.

esrally/mechanic/launcher.py

dliappis

I left some comments on making the Docker Launcher more robust.

esrally/mechanic/launcher.py

esrally/utils/sysstats.py

dliappis

Thanks for iterating! I left some comments, mainly to discuss whether we can rid ourselves of the Docker library dependency.

esrally/mechanic/launcher.py

dliappis

Thanks for iterating! I spent some time also testing various scenarios and this is looking good.

I left a comment about what I think is a bug that prevents DockerLauncher.stop() from working correctly.

esrally/mechanic/launcher.py

All comments addressed, but Daniel is unavailable to approve.

dliappis · 2019-06-21T06:35:52Z

This needs to be re-opened (or a new one opened instead) as nightlies fail with:

esrally.exceptions.RallyError: (Daemon startup failed with exit code[1], 'Traceback (most recent call last):\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/mechanic.py", line 576, in receiveMsg_StartNodes\n    nodes = self.mechanic.start_engine()\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/mechanic.py", line 712, in start_engine\n    self.nodes = self.launcher.start(node_configs)\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 282, in start\n    return [self._start_node(node_configuration, node_count_on_host) for node_configuration in node_configurations]\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 282, in <listcomp>\n    return [self._start_node(node_configuration, node_count_on_host) for node_configuration in node_configurations]\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 311, in _start_node\n    node_pid = self._start_process(binary_path, env)\n  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 361, in _start_process\n    raise exceptions.LaunchError(msg)\nesrally.exceptions.LaunchError: (\'Daemon startup failed with exit code[1]\', None)\n')
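The error above only carries the exit code, which makes the nightly failure hard to diagnose. As an illustration (not the code in `launcher.py`), a daemon start could capture the launcher's output and include it in the error. `LaunchError` below is a local stand-in for `esrally.exceptions.LaunchError`, and `start_daemon` is a hypothetical name. Output goes to a temporary file rather than a pipe so that a daemonized child that keeps the descriptor open cannot block the wait.

```python
import subprocess
import tempfile


class LaunchError(Exception):
    """Local stand-in for esrally.exceptions.LaunchError."""


def start_daemon(cmd, env=None, timeout=60):
    """Run a daemonizing start command and fail loudly, including its output, on a non-zero exit."""
    with tempfile.TemporaryFile(mode="w+") as out:
        proc = subprocess.Popen(cmd, env=env, close_fds=True, stdin=subprocess.DEVNULL,
                                stdout=out, stderr=subprocess.STDOUT)
        try:
            returncode = proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
            raise LaunchError("Daemon startup did not return within %s seconds" % timeout)
        # Read back whatever the start command printed before it exited.
        out.seek(0)
        output = out.read().strip()
    if returncode != 0:
        raise LaunchError("Daemon startup failed with exit code[%d]: %s" % (returncode, output))
    return output
```

This only reports what the start command itself printed; anything Elasticsearch logs after daemonizing would still have to be read from its log files.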

drawlerr added the WiP label Jun 3, 2019

drawlerr requested review from danielmitterdorfer, dliappis and ebadyano June 3, 2019 14:09

danielmitterdorfer reviewed Jun 4, 2019

drawlerr force-pushed the es-as-daemon branch from 72443d9 to c9f6344 June 6, 2019 18:44

danielmitterdorfer reviewed Jun 7, 2019

danielmitterdorfer mentioned this pull request Jun 7, 2019

Allow to manage Elasticsearch nodes separately from benchmarking #697

Closed

7 tasks

Dennis Lawler added 5 commits June 9, 2019 11:23

WIP - Implement ES daemon-mode in process launcher

a1d139a

Minor cleanup and code review fixes.

d924f65

Replace logfile watcher with pidfile watcher

5b6ba4c

Remove old StartupWatcher / subproc path

545fd42

Drop unneeded pyyaml dependency.

16f8555

drawlerr force-pushed the es-as-daemon branch from c9f6344 to 16f8555 June 10, 2019 13:59

Dennis Lawler added 2 commits June 10, 2019 09:02

Bump up pidfile wait to 60s

3bc1400

Remove unused _get_pid_from_file, wait_for_pidfile call

05b761e

danielmitterdorfer requested changes Jun 11, 2019

Dennis Lawler added 3 commits June 11, 2019 07:19

Fix _start_process call.

53ab793

Add env back in for now.

3febdd8

Add ,

e2e936c

danielmitterdorfer previously requested changes Jun 11, 2019

Use keyword-arg for env.

e6a90a1

dliappis requested changes Jun 12, 2019

Dennis Lawler added 4 commits June 12, 2019 08:38

Mock out process launcher unit test.

1d71917

Fix sysstat telemetry in mocked tests.

95f7127

Merge branch 'master' into es-as-daemon

874dc16

Remove unused 'daemon' option.

d8d5b7d

danielmitterdorfer reviewed Jun 13, 2019

Dennis Lawler added 2 commits June 13, 2019 09:04

Remove duplicate log line.

72f30a0

Re-arrange conditions for cpu_model.

14dc6bd

Dennis Lawler added 4 commits June 13, 2019 17:16

Add .j2 to docker-compose.yml, add healthcheck.

69cd57f

Remove unused _parse_log_ts.

f08fc00

Get container ID, wait for it to show up in docker container list.

0d98e4a

Implement docker wait.

8a25645

drawlerr requested review from dliappis and danielmitterdorfer June 18, 2019 02:46

dliappis reviewed Jun 18, 2019

drawlerr added the :Benchmark Candidate Management and enhancement labels Jun 18, 2019

Dennis Lawler added 2 commits June 18, 2019 13:19

Use docker cli instead of SDK, use LaunchError instead of TimeoutError.

1e4561a

Change stop() to use _get_docker_compose_cmd.

eb91a37

drawlerr requested a review from dliappis June 18, 2019 20:23

Change TimeoutError to LaunchError.

58c9fcd

dliappis requested changes Jun 20, 2019

Dennis Lawler added 2 commits June 20, 2019 08:36

Remove self.node_name, replace with node.node_name.

d500662

Re-order time import.

6b84bf8

dliappis added this to the 1.3.0 milestone Jun 20, 2019

dliappis approved these changes Jun 20, 2019

drawlerr changed the title from "WIP - Implement ES daemon-mode in process launcher" to "Implement ES daemon-mode in process launcher" Jun 20, 2019

Dennis Lawler added 3 commits June 20, 2019 09:13

Comment to explain discrepancy between returncode and wait() return in mock.

6c803ca

Remove unused unit testing helper functions.

d0457b8

Remove some more unused leftovers.

f03e8f4

drawlerr merged commit 7f6db41 into elastic:master Jun 20, 2019

drawlerr deleted the es-as-daemon branch June 20, 2019 16:26

dliappis added a commit that referenced this pull request Jun 21, 2019

Revert "Implement ES daemon-mode in process launcher (#701)"

a6ed49d

This reverts commit 7f6db41.

drawlerr pushed a commit to drawlerr/rally that referenced this pull request Jun 27, 2019

Revert "Revert "Implement ES daemon-mode in process launcher (elastic#701)""

79d7653

This reverts commit a6ed49d.

drawlerr removed the WiP label Jun 27, 2019
