-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove the check for internet connection #1517
Remove the check for internet connection #1517
Conversation
For context, passing @dliappis @danielmitterdorfer What do you think? |
@pquentin I am good with the direction highlighted here, however, I have a comment. In this PR we are removing the informative message that we had so far, "Could not detect a working Internet connection". If internet is down, when using e.g.
I think it's a shame if we aren't helpful here to at least point to the user that there is an offline mode. We know from experience that not everyone spends the time to read the docs and IMHO we are missing a chance to be more user friendly here. On another note, if we are to run with the reproduction mentioned in #1506 (comment) i.e. using an invalid |
8f3fcf0
to
39a2715
Compare
Goog catch! I digged into it. The reason the message disappeared is because apparently, git doesn't respect irinatruong at eli in ~/.rally/benchmarks/tracks/default on master
> HTTP_PROXY=http://invalid git fetch origin master
From https://github.com/elastic/rally-tracks
* branch master -> FETCH_HEAD
irinatruong at eli in ~/.rally/benchmarks/tracks/default on master
> http_proxy=http://invalid git fetch origin master
From https://github.com/elastic/rally-tracks
* branch master -> FETCH_HEAD
irinatruong at eli in ~/.rally/benchmarks/tracks/default on master
> git -c 'http.proxy=http://invalid' fetch origin master
fatal: unable to access 'https://github.com/elastic/rally-tracks/': Could not resolve proxy: invalid I addressed it, and the warning is back now: > http_proxy=http://invalid esrally list tracks
____ ____
/ __ \____ _/ / /_ __
/ /_/ / __ `/ / / / / /
/ _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
/____/
[WARNING] Could not update tracks. Continuing with your locally available state. There doesn't seem to be an issue tracker for https://github.com/git/git Because of this, I was unable to find out if not respecting HTTP_PROXY is a known issue / recent bug. |
39a2715
to
de0d818
Compare
de0d818
to
33cdee5
Compare
So, I was completely wrong there. Git does the right thing, I wasn't! I should have set Going to revert the |
…p.proxy' instead." This reverts commit 33cdee5.
Also, rally should really support |
Thanks for iterating! I am not sure that the latest commits have changed the behavior when specifying a proxy; I checked out the latest state of this PR and tried setting all proxy vars (for completeness sake):
and all Rally commands still seem to bypass the invalid proxy:
Should things fail? e.g. curl does:
|
@dliappis Did you intend to run |
@pquentin good point. I checked again and it seems to provide the right output, sorry for the noise here.
I have one final concern, which I raised in my first comment whether we could provide a breadcrumb to our docs about the offline mode when internet access has failed. I realize that this can happen due to a genuinely offline scenario, but also due to a bad proxy setting, but e.g. in a situation where there is no internet right now we don't mention anything about the existence of an offline mode -- which we used to prior to this PR:
The out of the box experience (when there is no state) shows:
Would it make sense to mention the offline mode: https://esrally.readthedocs.io/en/stable/offline.html?highlight=offline#offline-usage when the latter (could not clone) occurs? It's not a huge issue, but it could help users find their way to the docs about how to run offline. |
Yes, I agree that we need to provide more informative messages in those common cases. It could be enough to report the actual error from git/urllib3 instead of burying that in the logs. |
Adding |
This is now ready for review. One integration test is still failing:
Not sure what's up with that, as it passes locally. Looking into. |
Tests are now passing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks good and works well. I left a few questions.
The experience when listing tracks is great, but it's a bit less so when there's missing track data that can't be downloaded:
esrally race --track=http_logs --test-mode --distribution-version=8.2.0
____ ____
/ __ \____ _/ / /_ __
/ /_/ / __ `/ / / / / /
/ _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
/____/
[WARNING] Could not determine a socket address. Are you running without any network? Switching to degraded mode.
[INFO] Race id is [2d09ad2c-c738-497e-9436-296a4da0ae06]
[WARNING] Could not update tracks. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[WARNING] Could not update teams. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[INFO] Preparing for race ...
[WARNING] Could not update teams. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[WARNING] Could not update tracks. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[WARNING] Could not update tracks. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[WARNING] Could not update tracks. Continuing with your locally available state. Original error: Could not fetch source tree from [origin] You may need to specify --offline if running without Internet connection.
[ERROR] Cannot race. Error in task executor
HTTPSConnectionPool(host='rally-tracks.elastic.co', port=443): Max retries exceeded with url: /http_logs/documents-181998-1k.json.bz2 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff7f3aceb50>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Getting further help:
*********************
* Check the log files in /home/q/.rally/logs for errors.
* Read the documentation at https://esrally.readthedocs.io/en/latest/.
* Ask a question on the forum at https://discuss.elastic.co/tags/c/elastic-stack/elasticsearch/rally.
* Raise an issue at https://github.com/elastic/rally/issues and include the log files in /home/q/.rally/logs.
--------------------------------
[INFO] FAILURE (took 23 seconds)
--------------------------------
I think it's still acceptable, but I'm surprised by all the apparent retries and wonder if we want to attach MSG_NO_CONNECTION
for the actual error.
docs/configuration.rst
Outdated
@@ -158,7 +157,7 @@ If the configuration is correct, git will clone this repository. You can delete | |||
|
|||
To verify that Rally will connect via the proxy server you can check the log file. If the proxy server is configured successfully, Rally will log the following line on startup:: | |||
|
|||
Rally connects via proxy URL [http://proxy.acme.org:3128/] to the Internet (picked up from the environment variable [http_proxy]). | |||
Connecting via proxy URL [http://proxy.acme.org:3128/] to the Internet (picked up from the env variable [http_proxy]). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd prefer if we used "environment" actually in the logs instead of env
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah. I updated the doc to match the actual message, but I guess I can update the message!
esrally/utils/net.py
Outdated
@@ -250,31 +263,16 @@ def download(url, local_path, expected_size_in_bytes=None, progress_indicator=No | |||
|
|||
|
|||
def retrieve_content_as_string(url): | |||
with __http().request("GET", url, timeout=urllib3.Timeout(connect=45, read=240)) as response: | |||
with __http(url).request("GET", url, timeout=urllib3.Timeout(connect=45, read=240)) as response: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the url twice is weird, can we maybe change __http
to _request(method, url, **kwargs)
? Can be a follow-up
(I don't think there's any reason to have name mangling here: we should probably switch all double underscores to single underscores.)
esrally/utils/net.py
Outdated
if not env_var == "http_proxy": | ||
# http_proxy can only be lowercase | ||
# see https://curl.se/mail/archive-2001-12/0034.html | ||
proxy_url = os.getenv(env_var.upper()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why curl decided to forbid HTTP_PROXY
20 years ago, but I don't understand why this applies to us? It's not like running Rally in a CGI script makes any sense.
I think allowing any case for all proxy variables is fine.
it/proxy_test.py
Outdated
# authenticated proxy access is allowed | ||
assert_log_line_present(fresh_log_file, "Detected a working Internet connection") | ||
env["https_proxy"] = http_proxy.authenticated_url | ||
lines = process.run_subprocess_with_output(it.esrally_command_line_for(cfg, "list tracks"), env=env) | ||
output = "\n".join(lines) | ||
assert "[WARNING] Could not update tracks." not in output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the authenticated URL supposed to allow updating tracks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm asserting that there's no warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I totally missed that. Would it be more obvious with pytest.mark.parametrize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean having one test to test both authenticated and anonymous url? Or which variable would be parametrized? I added some comments to clarify what's going on.
it/basic_test.py
Outdated
assert expected in "\n".join(output) | ||
cmd = it.esrally_command_line_for(cfg, "list tracks") | ||
with tempfile.TemporaryDirectory() as tmpdir: | ||
env = dict(os.environ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I found os.environ.copy()
more explicit.
Out of curiosity, is there any reason to have moved the logic from run_subprocess_with_output
to those integration tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the env_vars
was something that I added in one of the previous PRs, and it did NOT represent the full env. It was like "I have these additional variables, please add/update them in the environment". I decided that passing in a complete env is more explicit, less confusing, and consistent with other subprocess api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it's better! Leaving unresolved for dict(os.environ)
How did you simulate that? |
I removed my |
Yes, the multiple calls to fetch the git repo is not pretty. This happens because we call
Not sure where the 3rd call comes from though, and TBH, I don't completely understand the logic with actors. Do you know what is going on there? Seems complicated to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! This is great, thanks for iterating. Can you please make sure merging will close the relevant issues? "address" is not a supported verb: https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
esrally/exceptions.py
Outdated
super().__init__(message, cause) | ||
super().__init__(message) | ||
self.message = message | ||
self.cause = cause | ||
|
||
def __repr__(self): | ||
return self.message | ||
|
||
def __str__(self): | ||
return self.message | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this is really cool! This will solve a LOT of issues we had with unprintable exceptions that I never figured out. (I wonder why we're not using cause by the way)
(To be clear, I don't think the various git calls are an issue.) |
…e-check-connection
Remove the check for Rally being online. Let it fail when trying to update the repositories instead.
Closes #1506.
Alternative to #1514.