Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core]Add RAY_REDIS_ADDRESS environment to specify external address. #20966

Merged
merged 7 commits into from
Dec 31, 2021

Conversation

MissiontoMars
Copy link

@MissiontoMars MissiontoMars commented Dec 8, 2021

Why are these changes needed?

Support RAY_REDIS_ADDRESS environment variable option when ray start.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Member

@mwtian mwtian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

python/ray/scripts/scripts.py Outdated Show resolved Hide resolved
python/ray/scripts/scripts.py Outdated Show resolved Hide resolved
# Test starting Ray with --external-kv-storage-config.
check_call_ray([
"start", "--head", "--external-kv-storage-config",
"redis:127.0.0.1:6379", "--port", "0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use redis:// ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also think redis:// is better. @mwtian

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted with Yi Cheng this afternoon. I think the format we choose should support the following settings:

  1. multiple Redis ip:port.
  2. Redis password, together with (1) above.
  3. non-redis backend in future, with potentially more complex configs.

With redis://, would it be --external-kv-storage-config=redis://123:4?password=xxx,redis://124:5?password=yyy,redis://125:6?password=zzz?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's in my mind is that if we are using redis:// kind of format, it'll be easier for future extension and also you can pass parameter in the URL and use urlparser in py to decode them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm We need to support passing a separate password for each redis host. Currently only one same password used for multiple external redis.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my understanding, the --redis-password option may be removed, if password not specified in cli the default password used.
e.g. For --external-kv-storage-config=redis://host1:port1,redis://host2:port2?password=yyy:
the default password used for host1, and password=yyy used for host2.

Copy link
Member

@mwtian mwtian Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Sorry for not being clear at the beginning. Ideally we can move all redis password and redis shard settings to be under this flag. How about not supporting the --redis-password flag as default value?

Please let us know if you see other Redis configurations we should support here.

Copy link
Author

@MissiontoMars MissiontoMars Dec 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm After thinking it over, I doubt whether we really need to set multiple different passwords for multiple redis? I'm afraid in most of the time, users will set the same password for multiple redis. Maybe there be some over design here.
Currently, only one password required although multiple redis provided. See #13170 @iycheng @mwtian

@MissiontoMars MissiontoMars changed the title [WIP]Add external kv storage config option. [Core]Add external kv storage config option. Dec 10, 2021
@fishbone
Copy link
Contributor

Could we change the flag to --gcs-kv-storage-config ? We later might have storage for other components, for example, object-storage-config/ray-storage-config. it'll be good to make it explicit.

besides, I don't think we need external there, it can be memory too. later we might be able to set the maximum memory limit for this.

@mwtian
Copy link
Member

mwtian commented Dec 10, 2021

Could we change the flag to --gcs-kv-storage-config ? We later might have storage for other components, for example, object-storage-config/ray-storage-config. it'll be good to make it explicit.

besides, I don't think we need external there, it can be memory too. later we might be able to set the maximum memory limit for this.

I'm ok with this alternative, or --gcs-metadata-storage-config. My concern is whether we should hard code gcs into the flag name. If in future KV and / or pubsub are moved to separate processes, or GCS no longer exists, the gcs name would become misleading.

@fishbone
Copy link
Contributor

Could we change the flag to --gcs-kv-storage-config ? We later might have storage for other components, for example, object-storage-config/ray-storage-config. it'll be good to make it explicit.
besides, I don't think we need external there, it can be memory too. later we might be able to set the maximum memory limit for this.

I'm ok with this alternative, or --gcs-metadata-storage-config. My concern is whether we should hard code gcs into the flag name. If in future KV and / or pubsub are moved to separate processes, or GCS no longer exists, the gcs name would become misleading.

Yep, we can name it with function prefix maybe, --ray-metadata-storage-config ?

@fishbone fishbone added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Dec 14, 2021
@MissiontoMars MissiontoMars changed the title [Core]Add external kv storage config option. [Core]Add RAY_REDIS_ADDRESS environment to specify external address. Dec 20, 2021
python/ray/scripts/scripts.py Outdated Show resolved Hide resolved
@mwtian
Copy link
Member

mwtian commented Dec 22, 2021

cc @iycheng

@mwtian mwtian removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Dec 22, 2021
@mwtian mwtian requested a review from fishbone December 23, 2021 05:01
Copy link
Contributor

@jovany-wang jovany-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we need update the doc page as well? But anyway it could be updated in another PR.
@mwtian CC

@jovany-wang
Copy link
Contributor

Java#testIsolationInTheSameNamespaces, tests:test_runtime_env also failed on other commits.
tune:test_trial_scheduler_pbt is not related.

retrying boostraup_env...

@jovany-wang jovany-wang added the testing topics about testing label Dec 31, 2021
@jovany-wang jovany-wang self-assigned this Dec 31, 2021
@jovany-wang
Copy link
Contributor

Java#testIsolationInTheSameNamespaces, tests:test_runtime_env also failed on other commits. tune:test_trial_scheduler_pbt is not related.

retrying boostraup_env...


boostraup_env succeeded. merging this.

@jovany-wang jovany-wang merged commit 412cd6b into ray-project:master Dec 31, 2021
krfricke added a commit that referenced this pull request Jan 27, 2022
The WandbLoggingCallback is run on the driver side, with the experiment directory was the cwd. Using resume=True will pick up state from other trials (as the file name is global), and thus lead to warning messages. Thus, we should default to resume=False when using the callback.
This PR also incorporates changes from #20966.

Co-authored by: Queimo <[email protected]>
Co-authored by: Karim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing topics about testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants