Revert "Revert "[Client] chunked get requests (#22455)"" #23261

Merged
merged 2 commits into from
Mar 17, 2022

Conversation

ckw017
Member

@ckw017 ckw017 commented Mar 16, 2022

Why are these changes needed?

Reverts #22455

This PR was originally reverted because it coincided with timeouts in test_remote_package_uri. I suspect the original PR wasn't the cause of the flakiness: the only parts of that test it could affect are in the codepath for synchronous gets, and it seems unlikely that breaking synchronous gets would break only a single test.

Tested this revert multiple times against CI in #22713 and was unable to reproduce flakiness.

To be extra careful, added two changes (diff can be seen here):

  • Tighten the error handling in the proxier to cover the case where an exception occurs while yielding from the raylet server stub
  • Exit early in _get_object_iterator once all chunks are yielded
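
For readers unfamiliar with the pattern, the two changes above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `Chunk`, `iter_object_chunks`, `proxy_stream`, and `get_object` are hypothetical names, and the chunk fields (`chunk_id`, `total_chunks`, `error`) are assumptions about what a chunked response might carry.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Chunk:
    # Hypothetical stand-in for one chunked get response (not Ray's real message type).
    chunk_id: int
    total_chunks: int
    data: bytes
    error: Optional[str] = None


def iter_object_chunks(stream: Iterable[Chunk]) -> Iterator[Chunk]:
    """Yield chunks from `stream`, stopping as soon as the last chunk is seen."""
    for chunk in stream:
        yield chunk
        if chunk.chunk_id == chunk.total_chunks - 1:
            # Exit early: never pull from the stream again once every chunk
            # has been yielded, so a lingering stream cannot block the caller.
            return


def proxy_stream(upstream: Iterable[Chunk]) -> Iterator[Chunk]:
    """Forward chunks, turning a failure raised mid-stream into an error chunk."""
    try:
        # The try block covers the whole iteration, not just stream creation,
        # so exceptions raised while yielding from upstream are also caught.
        for chunk in upstream:
            yield chunk
    except Exception as exc:
        yield Chunk(chunk_id=-1, total_chunks=0, data=b"", error=str(exc))


def get_object(stream: Iterable[Chunk]) -> bytes:
    """Reassemble the payload from chunks, raising if an error chunk arrives."""
    parts = []
    for chunk in iter_object_chunks(stream):
        if chunk.error is not None:
            raise RuntimeError(chunk.error)
        parts.append(chunk.data)
    return b"".join(parts)
```

In this sketch the early return means the consumer never blocks waiting for the stream to close on its own, and the wide `try` block means a failure partway through the stream surfaces to the client as an error rather than as a hang.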

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ckw017 ckw017 changed the title Revert "Revert "[Client] chunked get requests"" Revert "Revert "[Client] chunked get requests (#22455)"" Mar 17, 2022
@ckw017 ckw017 marked this pull request as ready for review March 17, 2022 16:50
@ckw017 ckw017 requested a review from fishbone March 17, 2022 16:50
@suquark suquark merged commit 6416c65 into ray-project:master Mar 17, 2022
@jjyao
Collaborator

jjyao commented Mar 17, 2022

Just curious, if this PR was not the root cause, do we know the actual cause of the flakiness?

@ckw017
Member Author

ckw017 commented Mar 18, 2022

We'll find out soon whether this PR really was the cause of the flakiness. If it turns out it wasn't, I suspect the cause was some kind of temporary degradation/outage with GitHub, since the params for the timed-out test had "https://github.com/shrekris-anyscale/test_module/archive/HEAD.zip" as the remote URI.

@mwtian
Member

mwtian commented Mar 19, 2022

test_runtime_env_working_dir_remote_uri looks to have the same flakiness before and after the PR.

scv119 pushed a commit that referenced this pull request Mar 30, 2022
* revert revertchunkedgets

* exit early if all chunks received, tighter exception handler for stream in proxy
5 participants