Introduce rally-tracks compatibility testing #1564

Merged
michaelbaamonde merged 12 commits into elastic:master on Aug 30, 2022

Conversation

@michaelbaamonde (Contributor) commented Aug 22, 2022

This PR adds the infrastructure necessary to run rally-tracks integration tests (see elastic/rally-tracks#289) from within the Rally repository, both locally and, more importantly, as a PR check via CI.

We modify and extend the pytest-rally plugin so that its behavior and defaults make sense when run from within the Rally repo as opposed to a track repo. This enables Rally developers and CI jobs to test Rally changes against arbitrary revisions of local track repositories, using arbitrary versions of Elasticsearch.

By default, the check ensures that changes made to Rally do not break the `master` branch of rally-tracks. It does this by running whatever tests are contained in the `it` subdirectory of `~/.rally/benchmarks/tracks/default`. The benchmark candidate defaults to a source build of the `main` branch of ES (i.e. `--revision=current` in Rally terms).

To run these tests, execute `pytest it/track_repo_compatibility/` from the root of the Rally repo after running `make install`, or run `make rally-tracks-compat` to run them within a tox environment.

Default behavior

Here's an example of the default behavior. For brevity of output, it is limited to a single test, with extra logging enabled to show more clearly what the pytest-rally plugin is doing:

pytest it/track_repo_compatibility --log-cli-level=INFO -k metricbeat

===================================================================================== test session starts ======================================================================================
platform linux -- Python 3.8.13, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /home/baamonde/code/elastic/rally/.venv/bin/python3
cachedir: .pytest_cache
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rally: track-repository=/home/baamonde/.rally/benchmarks/tracks/default, track-revision=master
rootdir: /home/baamonde/code/elastic/rally, configfile: pyproject.toml
plugins: benchmark-3.2.2, rally-0.0.1, asyncio-0.18.1, anyio-3.6.1, httpserver-1.0.4
asyncio: mode=strict
collecting ...
------------------------------------------------------------------------------------- live log collection --------------------------------------------------------------------------------------
INFO     pytest_rally.rally:rally.py:110 Running command: [esrally list tracks --track-repository="/home/baamonde/.rally/benchmarks/tracks/default" --track-revision="master" --configuration-name="pytest"]
collected 68 items / 67 deselected / 1 selected

test_all_tracks_and_challenges.py::TestTrackRepository::test_autogenerated[metricbeat-append-no-conflicts]
---------------------------------------------------------------------------------------- live log setup ----------------------------------------------------------------------------------------
INFO     pytest_rally.elasticsearch:elasticsearch.py:84 Installing Elasticsearch: [esrally install --quiet --http-port=19200 --node=rally-node --master-nodes=rally-node --car=4gheap,trial-license,x-pack-ml --seed-hosts="127.0.0.1:19300" --revision=current]
INFO     pytest_rally.elasticsearch:elasticsearch.py:93 Starting Elasticsearch: [esrally start --runtime-jdk=bundled --installation-id=a14708f6-d2c0-49e1-aaa2-dd60a2acaf9d --race-id=b50f7204-3f70-4724-bb91-3e0050d0e2e0]
---------------------------------------------------------------------------------------- live log call -----------------------------------------------------------------------------------------
INFO     pytest_rally.rally:rally.py:144 Running command: [esrally race --track="metricbeat" --challenge="append-no-conflicts" --track-repository="/home/baamonde/.rally/benchmarks/tracks/default" --track-revision="master" --configuration-name="pytest" --enable-assertions --kill-running-processes --on-error="abort" --pipeline="benchmark-only" --target-hosts="127.0.0.1:19200" --test-mode]
PASSED                                                                                                                                                                                   [100%]
-------------------------------------------------------------------------------------- live log teardown ---------------------------------------------------------------------------------------
INFO     pytest_rally.rally:rally.py:91 Removing Rally config from [/home/baamonde/.rally/rally-pytest.ini]
INFO     pytest_rally.elasticsearch:elasticsearch.py:104 Stopping Elasticsearch: [esrally stop --installation-id=a14708f6-d2c0-49e1-aaa2-dd60a2acaf9d]


============================================================================== 1 passed, 67 deselected in 33.14s ===============================================================================

You can see that we've applied the defaults for the --track-repository and --track-revision options in the first few lines of the output:

rally: track-repository=/home/baamonde/.rally/benchmarks/tracks/default, track-revision=master

And that these make their way into the esrally commands that pytest-rally generates:

INFO     pytest_rally.rally:rally.py:144 Running command: [esrally race --track="metricbeat" --challenge="append-no-conflicts" --track-repository="/home/baamonde/.rally/benchmarks/tracks/default" --track-revision="master" --configuration-name="pytest" --enable-assertions --kill-running-processes --on-error="abort" --pipeline="benchmark-only" --target-hosts="127.0.0.1:19200" --test-mode]
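
To make the mapping from options to command line concrete, here is a minimal sketch that reproduces the format of the logged `esrally race` command. It is not the pytest-rally implementation, which is not shown in this PR; the function name `build_race_command` and its parameters are hypothetical, and the fixed flags are copied from the log line above.

```python
def build_race_command(track, challenge, track_repository, track_revision="master"):
    """Assemble an `esrally race` command line in the format seen in the log.

    Hypothetical helper for illustration only; the fixed flags below are
    copied verbatim from the logged command, not from the plugin source.
    """
    parts = [
        "esrally race",
        f'--track="{track}"',
        f'--challenge="{challenge}"',
        # These two values come from the --track-repository and
        # --track-revision pytest options (or their defaults).
        f'--track-repository="{track_repository}"',
        f'--track-revision="{track_revision}"',
        '--configuration-name="pytest"',
        "--enable-assertions",
        "--kill-running-processes",
        '--on-error="abort"',
        '--pipeline="benchmark-only"',
        '--target-hosts="127.0.0.1:19200"',
        "--test-mode",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_race_command(
            "metricbeat",
            "append-no-conflicts",
            "/home/baamonde/.rally/benchmarks/tracks/default",
        )
    )
```

Only the repository path and revision vary between runs in this sketch; everything else is held constant, which matches what the two logged race commands in this description have in common.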

Overriding defaults

For local development, overriding these defaults could be useful. For example, if you're working on changes to both Rally and rally-tracks locally, and want to test with a released version of ES, you could do something like this:

pytest it/track_repo_compatibility/ \
    --log-cli-level=INFO \
    --track-repository=/home/baamonde/code/elastic/rally-tracks \
    --track-revision=ci \
    --distribution-version=8.3.2 \
    -k metricbeat

The pytest session header will reflect these changes:

rally: track-repository=/home/baamonde/code/elastic/rally-tracks, track-revision=ci

The plugin will pass along the --distribution-version to install the ES fixture:

INFO     pytest_rally.elasticsearch:elasticsearch.py:84 Installing Elasticsearch: [esrally install --quiet --http-port=19200 --node=rally-node --master-nodes=rally-node --car=4gheap,trial-license,x-pack-ml --seed-hosts="127.0.0.1:19300" --distribution-version=8.3.2]

And race commands will provide the correct --track-repository and --track-revision:

INFO     pytest_rally.rally:rally.py:144 Running command: [esrally race --track="metricbeat" --challenge="append-no-conflicts" --track-repository="/home/baamonde/code/elastic/rally-tracks" --track-revision="ci" --configuration-name="pytest" --enable-assertions --kill-running-processes --on-error="abort" --pipeline="benchmark-only" --target-hosts="127.0.0.1:19200" --test-mode]
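
The two logged install commands differ only in their last flag: `--revision=current` by default, `--distribution-version=8.3.2` when overridden. A minimal sketch of that selection follows, assuming the plugin simply swaps one flag for the other. The function name `build_install_command` is hypothetical and the remaining flags are copied from the log output, not from the plugin source.

```python
def build_install_command(distribution_version=None, revision="current"):
    """Assemble an `esrally install` command line in the format seen in the log.

    Hypothetical helper for illustration only. If a distribution version is
    given, it replaces the default source build (--revision=current).
    """
    if distribution_version:
        candidate = f"--distribution-version={distribution_version}"
    else:
        candidate = f"--revision={revision}"

    parts = [
        "esrally install",
        "--quiet",
        "--http-port=19200",
        "--node=rally-node",
        "--master-nodes=rally-node",
        "--car=4gheap,trial-license,x-pack-ml",
        '--seed-hosts="127.0.0.1:19300"',
        candidate,
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_install_command())          # source build of ES
    print(build_install_command("8.3.2"))   # released distribution
```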

Mike Baamonde added 7 commits August 22, 2022 16:43
This commit modifies and extends `pytest-rally` so that it can be invoked from
within the Rally repo. This enables Rally developers and CI jobs to test
Rally changes against arbitrary revisions of local track repositories,
using arbitrary versions of Elasticsearch.

The actual contents of the tests live in the track repository. The
plugin will run any tests found in the `it` subdirectory of the provided track
repository by default, including those that are auto-generated by
`pytest-rally`.

By default, tests will be run against the `master` branch of the track
repository checked out in `$RALLY_HOME/benchmarks/tracks/default` (typically
`rally-tracks`), using a build of the `main` branch of Elasticsearch.

To run tests with these defaults, run the following from the root of this
repository:

`pytest it/track_repo_compatibility`

Here is an example invocation that overrides these defaults:

```
pytest it/track_repo_compatibility \
     --track-repository=/path/to/repo \
     --track-revision=some-branch \
     --distribution-version=8.3.2
```
@michaelbaamonde michaelbaamonde marked this pull request as ready for review August 23, 2022 18:03
@@ -67,6 +67,8 @@ install-user: venv-create
install: install-user
# Also install development dependencies
. $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install -e .[develop]
. $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install git+https://github.com/elastic/pytest-rally.git
Contributor Author

This is a stop-gap until we cut a release of pytest-rally. I cannot manage to get pip and/or hatchling to behave if I attempt to install this from source as an optional develop dependency declared in pyproject.toml, even if tool.hatch.metadata.allow-direct-references is set to true. It's a rabbit hole for another day.

Member

I think it's a bug in pip. In any case, I think we should upload pytest-rally to PyPI since it's not expected to change much and the resulting experience would be nicer.

Contributor Author

@michaelbaamonde michaelbaamonde Aug 24, 2022

Yeah, I looked into it a bit and it does seem to be a bug in pip. Definitely agree on uploading the plugin to PyPI once it's stable.

@@ -47,3 +48,8 @@ commands =

whitelist_externals =
pytest

[testenv:rally-tracks-compat]
deps = pytest-rally @ git+https://github.com/elastic/pytest-rally.git

default=RALLY_TRACKS_DIR,
help=("Path to a local track repository\n" f"(default: {RALLY_TRACKS_DIR})"),
)
group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`"))
Member

Adding that comma and running black will make this call more consistent with the others and thus easier to read:

Suggested change
group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`"))
group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`"),)
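
For reference, here is a sketch of how the option registration might look once the trailing comma is in place and black has exploded the call to one argument per line. The group name `"rally"` and the surrounding `pytest_addoption` hook body are assumptions, since the diff hunk does not show them; the option names, defaults, and help strings are taken from the lines above.

```python
import os

RALLY_HOME = os.getenv("RALLY_HOME", os.path.expanduser("~"))
RALLY_CONFIG_DIR = os.path.join(RALLY_HOME, ".rally")
RALLY_TRACKS_DIR = os.path.join(RALLY_CONFIG_DIR, "benchmarks", "tracks", "default")


def pytest_addoption(parser):
    # Assumption: the options are registered under a group named "rally".
    group = parser.getgroup("rally")
    group.addoption(
        "--track-repository",
        action="store",
        default=RALLY_TRACKS_DIR,
        help=("Path to a local track repository\n" f"(default: {RALLY_TRACKS_DIR})"),
    )
    # With the trailing comma, black keeps this call in the same
    # one-argument-per-line layout as the call above.
    group.addoption(
        "--track-revision",
        action="store",
        default="master",
        help=("Track repository revision to test\n" "default: `master`"),
    )
```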

if repo == RALLY_TRACKS_DIR:
try:
# this will perform the initial clone of rally-tracks
subprocess.run(shlex.split("esrally list tracks"), text=True, capture_output=True, check=True)
Member

Relying on a central but undocumented feature of esrally list tracks sounds off. Have you considered using track.GitTrackRepository or even esrally.utils.git.RallyRepository directly by supplying it the few mandatory params it needs? This may require moving some default values out of esrally/rally.py though, which could make this approach too burdensome in practice.

it/track_repo_compatibility/conftest.py (outdated)

RALLY_HOME = os.getenv("RALLY_HOME", os.path.expanduser("~"))
RALLY_CONFIG_DIR = os.path.join(RALLY_HOME, ".rally")
RALLY_TRACKS_DIR = os.path.join(RALLY_CONFIG_DIR, "benchmarks", "tracks", "default")
Member

@pquentin pquentin Aug 24, 2022

Should we consider another location in order to not affect and be affected by changes and checkouts in the default tracks git repository?

Contributor Author

I was originally intending to have these tests mimic a clean installation of Rally. By default, if a user installs Rally, it will clone rally-tracks into ~/.rally/benchmarks/tracks/default the first time an esrally command is run that requires a track repository to be available on disk.

I think this makes sense for CI, but I can see the argument for instead explicitly cloning into a distinct, non-default location for these tests, especially locally.

This commit implements the following control flow:

  • If --track-repository is provided directly, we make sure that the directory actually exists on disk, raising an exception if it doesn't.
  • If --track-repository isn't provided, we default to ~/.rally/benchmarks/tracks/rally-tracks-compat, cloning https://github.com/elastic/rally-tracks there if the directory doesn't yet exist.

Re: your suggestions here, shelling out to git does seem easier for cloning, so I went with that.
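
The control flow described in the two bullets above can be sketched roughly as follows. This is a simplified illustration, not the code in this commit: the function names, the exception type, and the injectable `clone` parameter are hypothetical, and the default directory ignores `RALLY_HOME` for brevity.

```python
import os
import subprocess

RALLY_TRACKS_URL = "https://github.com/elastic/rally-tracks"
# Simplification: the real default may honor RALLY_HOME; here it is ~ only.
DEFAULT_COMPAT_DIR = os.path.join(
    os.path.expanduser("~"), ".rally", "benchmarks", "tracks", "rally-tracks-compat"
)


def git_clone(url, destination):
    """Shell out to git to clone `url` into `destination`."""
    subprocess.run(
        ["git", "clone", url, destination],
        check=True,
        capture_output=True,
        text=True,
    )


def resolve_track_repository(track_repository=None, default_dir=DEFAULT_COMPAT_DIR, clone=git_clone):
    """Return the track repository path to test against.

    - An explicitly provided path must already exist on disk.
    - Otherwise, fall back to `default_dir`, cloning rally-tracks there
      if the directory does not exist yet.
    """
    if track_repository is not None:
        if not os.path.isdir(track_repository):
            raise FileNotFoundError(f"Track repository [{track_repository}] does not exist")
        return track_repository

    if not os.path.isdir(default_dir):
        clone(RALLY_TRACKS_URL, default_dir)
    return default_dir
```

Passing `clone` as a parameter is only there so the fallback branch can be exercised without network access; a real fixture could call `git` directly.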

I considered making other things configurable (the remote repo URL, an option to perform an initial clone into a custom track repository path, etc.) but I think this approach serves CI and local development well enough in their default cases.

What do you think?

Contributor

On one hand, I don't want to conflict with people's local configurations. On the other hand, I don't want additional checkouts present on my local machine just to isolate testing properly.

I propose creating a `rally-compat.ini` with a header comment, `# Automatically managed - do not touch`, and running compatibility tests via `--configuration-name compat`. This would specify the canonical upstream `default.url = https://github.com/elastic/rally-tracks` (as well as our teams and source URLs) in a place that doesn't need to be touched by users and would better avoid users shooting themselves in the foot. Is this feasible?

Contributor

Mike and I discussed this offline. This is not a huge concern compared to the trappy nature of trying to isolate it perfectly. I agree with him that we could just move forward with this approach as-is.

Member

Yeah I also think the rally-tracks-compat checkout is the best compromise here. Thanks!

@michaelbaamonde
Contributor Author

Thanks for the review @pquentin! I believe I've addressed your comments, so this is ready for another look whenever you're free.

Member

@pquentin pquentin left a comment

Thanks, this looks great now! Haven't tested the CI configuration though.

@michaelbaamonde
Contributor Author

> Haven't tested the CI configuration though

It's a little cumbersome (especially for jobs triggered by PRs) unless you're conversant in the local Jenkins workflow we use internally, which maybe you are already. We can sync on this if you'd like.

Contributor

@DJRickyB DJRickyB left a comment

LGTM

@michaelbaamonde michaelbaamonde merged commit 0c86871 into elastic:master Aug 30, 2022
@pquentin pquentin added the :misc Changes that don't affect users directly: linter fixes, test improvements, etc. label Nov 2, 2022
@pquentin pquentin added this to the 2.7.0 milestone Nov 2, 2022