Implement ES daemon-mode in process launcher #701

Merged
drawlerr merged 29 commits into elastic:master from es-as-daemon on Jun 20, 2019

Contributor

@drawlerr drawlerr commented Jun 3, 2019

Launch Elasticsearch as a daemon.

Relates to #697
Closes #718

Member

@danielmitterdorfer danielmitterdorfer left a comment

Thanks for the PR! I left some high-level comments that I think will help simplify the overall approach.

esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (outdated, resolved)
tests/mechanic/launcher_test.py (outdated, resolved)
tests/mechanic/launcher_test.py (outdated, resolved)
return p


class ProcessLauncherTests(TestCase):
Member

I think testing this gets much simpler once we rely only on the PID. Then we can just mock process startup and mock the check for a PID file. Once we expose provision and launch subcommands, it should also become much simpler to add this as an integration test in integration-test.sh. I'd imagine an integration test along these lines:

PROVISIONER_ID=$(esrally provision --distribution-version=6.8.0 --car=4gheap)
esrally start --provisioner-id=${PROVISIONER_ID}
# maybe do a check here that the node is actually reachable via HTTP
esrally stop --provisioner-id=${PROVISIONER_ID}
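
The unit-test idea above (mock process startup, mock the PID file check) could look roughly like the following sketch. `read_pid` and `start_daemon` are hypothetical stand-ins written for this illustration, not Rally's launcher code, and the command line passed in is only an example.

```python
import subprocess
from unittest import mock


def read_pid(pidfile):
    # Hypothetical helper: read the PID that the daemon wrote to its PID file.
    with open(pidfile) as f:
        return int(f.read().strip())


def start_daemon(cmd, pidfile):
    # Hypothetical launcher: start the process, then rely only on the PID file.
    ret = subprocess.call(cmd)
    if ret != 0:
        raise RuntimeError("Daemon startup failed with exit code [%d]" % ret)
    return read_pid(pidfile)


def test_start_daemon_returns_pid_from_pidfile():
    cmd = ["bin/elasticsearch", "-d", "-p", "./pid"]
    # Neither a real process nor a real PID file is needed for this test.
    with mock.patch("subprocess.call", return_value=0) as call, \
         mock.patch("builtins.open", mock.mock_open(read_data="1234\n")):
        pid = start_daemon(cmd, "./pid")
    call.assert_called_once_with(cmd)
    assert pid == 1234
```

Because the launcher only depends on an exit code and a file read, both can be patched without touching the file system or starting Elasticsearch.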

Member

@danielmitterdorfer danielmitterdorfer left a comment

Thanks for iterating! I left a few comments but the overall direction looks good to me.

stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.DEVNULL), node_name)
def _start_process(self, cmd):
    subprocess.Popen(shlex.split(cmd), close_fds=True)
    # TODO: wait_for_docker_pidfile?
Member

@dliappis can we assume that if docker-compose returns with exit code 0, Elasticsearch is about to start up (i.e. there is no need to wait for the PID file)?
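
For reference, waiting for a PID file could be sketched as a simple polling loop. `wait_for_pidfile` and its parameters are hypothetical names chosen for this illustration; the PR's actual helper may differ.

```python
import time


def wait_for_pidfile(pidfile, timeout=60.0, poll_interval=0.5):
    """Poll until `pidfile` contains a PID; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(pidfile) as f:
                content = f.read().strip()
            if content:
                return int(content)
        except FileNotFoundError:
            pass  # the daemon has not written the file yet
        if time.monotonic() >= deadline:
            raise TimeoutError("no PID in [%s] after [%s] seconds" % (pidfile, timeout))
        time.sleep(poll_interval)
```

An empty file is treated like a missing one, since the daemon may create the file before writing the PID into it.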

Contributor

I am afraid it's an "it depends" answer :)

With docker-compose it's possible to define healthchecks and, depending on how they are defined, we can safely assume ES is actually listening; this is done for example in

healthcheck:
  test: curl -f http://localhost:9200
  interval: 5s
  timeout: 2s
  retries: 10
and the other service, which can be a sidecar, depends on it.
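
If the same kind of check were done from Python rather than in the compose file, it might look like the sketch below. It only loosely mirrors the `interval`, `timeout` and `retries` settings shown above (Docker's own retry semantics differ in detail), and `http_ok` and `wait_until_healthy` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    # Rough analogue of `curl -f`: only a successful HTTP response counts.
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url, interval=5.0, timeout=2.0, retries=10,
                       probe=http_ok, sleep=time.sleep):
    # Try up to `retries` times, pausing `interval` seconds between attempts.
    for attempt in range(retries):
        if probe(url, timeout):
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

The `probe` and `sleep` parameters exist only so the loop can be exercised in a test without a running node.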

Member

@dliappis I think this is entirely in our control, as the only way to start the Docker image is via https://github.com/elastic/rally/blob/master/esrally/resources/docker-compose.yml. So we cannot assume it is already listening (which is ok), but I wonder whether we can assume that the Elasticsearch process has already been started?

esrally/mechanic/launcher.py (outdated, resolved)
setup.py (outdated, resolved)
Member

@danielmitterdorfer danielmitterdorfer left a comment

I left a few comments around _start_process. Also, we usually try to avoid force-pushing during the review process.

esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (resolved)
esrally/mechanic/launcher.py (outdated, resolved)
Contributor

@dliappis dliappis left a comment

I left some comments on making the Docker Launcher more robust.

esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (outdated, resolved)
Contributor

@dliappis dliappis left a comment

Thanks for iterating! I left some comments, mainly to discuss whether we can rid ourselves of the Docker library dependency.
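
One way to avoid a Docker Python library is to shell out to the `docker-compose` CLI. The sketch below is an assumption about how that could look, not necessarily what the PR ended up doing; `compose_cmd` and `container_ids` are hypothetical helpers.

```python
import subprocess


def compose_cmd(compose_file, *args):
    # Build a docker-compose command line for the given compose file.
    return ["docker-compose", "-f", compose_file] + list(args)


def container_ids(compose_file, run=subprocess.check_output):
    # `docker-compose ps -q` prints one container ID per line.
    out = run(compose_cmd(compose_file, "ps", "-q"))
    return [line.strip() for line in out.decode("utf-8").splitlines() if line.strip()]
```

The `run` parameter defaults to `subprocess.check_output` and can be replaced by a fake in tests, so no Docker installation is needed to verify the parsing.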

esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (resolved)
esrally/mechanic/launcher.py (outdated, resolved)
@drawlerr drawlerr added the ":Benchmark Candidate Management" (Anything affecting how Rally sets up Elasticsearch) and "enhancement" (Improves the status quo) labels Jun 18, 2019
@drawlerr drawlerr requested a review from dliappis June 18, 2019 20:23
Contributor

@dliappis dliappis left a comment

Thanks for iterating! I also spent some time testing various scenarios and this is looking good.

I left a comment about what I think is a bug that prevents DockerLauncher.stop() from working correctly.

esrally/mechanic/launcher.py (outdated, resolved)
esrally/mechanic/launcher.py (outdated, resolved)
@dliappis dliappis added this to the 1.3.0 milestone Jun 20, 2019
@drawlerr drawlerr changed the title from "WIP - Implement ES daemon-mode in process launcher" to "Implement ES daemon-mode in process launcher" Jun 20, 2019
@drawlerr drawlerr dismissed danielmitterdorfer’s stale review June 20, 2019 16:23

All comments addressed, but Daniel is unavailable to approve.

@drawlerr drawlerr merged commit 7f6db41 into elastic:master Jun 20, 2019
@drawlerr drawlerr deleted the es-as-daemon branch June 20, 2019 16:26
dliappis added a commit that referenced this pull request Jun 21, 2019
Contributor

dliappis commented Jun 21, 2019

This needs to be re-opened (or a new one opened instead) as nightlies fail with:

esrally.exceptions.RallyError: (Daemon startup failed with exit code[1], 'Traceback (most recent call last):
  File "/var/lib/jenkins/src/rally/esrally/mechanic/mechanic.py", line 576, in receiveMsg_StartNodes
    nodes = self.mechanic.start_engine()
  File "/var/lib/jenkins/src/rally/esrally/mechanic/mechanic.py", line 712, in start_engine
    self.nodes = self.launcher.start(node_configs)
  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 282, in start
    return [self._start_node(node_configuration, node_count_on_host) for node_configuration in node_configurations]
  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 282, in <listcomp>
    return [self._start_node(node_configuration, node_count_on_host) for node_configuration in node_configurations]
  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 311, in _start_node
    node_pid = self._start_process(binary_path, env)
  File "/var/lib/jenkins/src/rally/esrally/mechanic/launcher.py", line 361, in _start_process
    raise exceptions.LaunchError(msg)
esrally.exceptions.LaunchError: ('Daemon startup failed with exit code[1]', None)
')
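
The error above carries only the exit code, with `None` as the cause. As a hedged illustration (not the fix that was applied afterwards), a launcher could capture the daemon command's output and include it in the error. `run_daemon_command` is a hypothetical name, and the local `LaunchError` class merely stands in for `esrally.exceptions.LaunchError`.

```python
import subprocess
import sys


class LaunchError(Exception):
    """Stand-in for esrally.exceptions.LaunchError."""


def run_daemon_command(cmd, env=None):
    # Run the (daemonizing) command and wait for the parent process to return.
    proc = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          stdin=subprocess.DEVNULL, universal_newlines=True)
    if proc.returncode != 0:
        # Include the captured output so the failure cause is visible in the error.
        raise LaunchError("Daemon startup failed with exit code[%d]: %s"
                          % (proc.returncode, proc.stdout.strip()))
    return proc.stdout


if __name__ == "__main__":
    # A child process that exits with status 1 triggers the error path.
    try:
        run_daemon_command([sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"])
    except LaunchError as e:
        print(e)
```

With the output attached, a nightly failure like the one reported here would show why the daemon refused to start instead of just the exit code.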

drawlerr pushed a commit to drawlerr/rally that referenced this pull request Jun 27, 2019
@drawlerr drawlerr removed the WiP label Jun 27, 2019

Successfully merging this pull request may close these issues.

sysstats.py crashed if no valid brand string in cpuinfo on an eMAG aarch64 system
3 participants