Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change fork to thread #20996

Closed
wants to merge 410 commits into from
Closed

Change fork to thread #20996

wants to merge 410 commits into from

Conversation

Queimo
Copy link

@Queimo Queimo commented Dec 9, 2021

I use the latest version of wandb 0.12.7 and "fork" does not exist, only "thread" and "spawn", so I think "thread" is the right choice, worked for me, at least.

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

krfricke and others added 29 commits January 3, 2022 14:36
When a list with mixed types is passed to tune.choice, they will be coerced to a single dtype during sampling (due to numpy.choice converting to an array internally). This behaviour is unintentional and surprising. This PR fixes this issue.
Expands the `to_torch` method for Datasets with:
* An ability to choose to output a list/dict of feature tensors instead of just one (through setting `feature_columns` to be a list of lists or a dict of lists)
* An ability to choose whether the label should be unsqueezed or not
* An ability to pass `None` as the label (for prediction).

Furthermore, this changes how the `feature_column_dtypes` argument works. Previously, it took a list of dtypes for each feature. However, as the tensor was concatenated in the end, only one dtype mattered (the biggest one). Now, this argument expects a single dtype which will be applied to the features tensor (or a list/dict if `feature_columns` is a list of list/dict of lists).

Unit tests for all cases are included.

Co-authored-by: matthewdeng <[email protected]>
…#21260)

If we use `os.environ` to set environment variables in tests, then our tests become coupled. By using `monkeypatch`, we can safely set environment variables while ensuring our tests remain decoupled. 

For more information, see the [monkeypatching documentation](https://docs.pytest.org/en/6.2.x/monkeypatch.html#monkeypatching-environment-variables).
…t#21262)

Inheriting from `abc.ABC` is more readable than setting the meta class to `abc.ABCMeta`.

Relevant snippet from the Python 3.4 release notes:
> New class ABC has ABCMeta as its meta class. Using ABC as a base class has essentially the same effect as specifying metaclass=abc.ABCMeta, but is simpler to type and easier to read. (Contributed by Bruno Dupuis in bpo-16049.)

Co-authored-by: Richard Liaw <[email protected]>
Co-authored-by: Matthew Deng <[email protected]>
…ct#21347)

* fix

* fix test failure

* Update release/nightly_tests/dataset/ray_sgd_training.py

Co-authored-by: matthewdeng <[email protected]>
…0965)

Only `resource_avaialbe` and `resource_total` are used in raylet, so let's clear the rest before broadcasting.
…1295)

This PR refactors several components to support switching to GCS address bootstrapping later:
- Treat address from `ray.init()` and `ray` CLI as bootstrap address instead of assuming it is Redis address.
- Ray client servers support `--address` flag instead of `--redis-address`.
- A few other miscellaneous cleanup.

Also, add a test for starting non-head node with `ray start`.
Recent changes in the anyscale API rendered the current e2e script incompatible. This PR resolves these subtle API changes.
…roject#21377)

This will start repro docker containers with SYS_PTRACE capabilities to enable debugging e.g. via py-spy.
Additionally, default instance name tags for instance re-use will be generated using the buildkite build id and job id.
We use `trial.checkpoint` to restore a perturbed trial. Currently trial.checkpoint is looking at both in-memory and persistent checkpoints to find the most recent one. The definition of "the most recent one" is based on iteration. This may no longer be a valid assumption in PBT case, considering `trial_low_quantile` may have an iter=2_persistent_checkpoint as well as a iter=1_in_memory_checkpoint (perturbed from `trial_upper_quantile`).
Fixes outdated `start_training` definitions and calls in Train logging callbacks & abstract classes.
…idation` (ray-project#21374)

These two tests pass without issues on my Windows machine. Rest time out or fail.
…y-project#21232)

After this change in GCS bootstrapping mode, Redis no longer starts and `address` is treated as the GCS address of the Ray cluster.

Co-authored-by: Yi Cheng <[email protected]>
Co-authored-by: Yi Cheng <[email protected]>
…r messages. (ray-project#21219)

This PR is added to handle this comment; ray-project#20903 (comment)

The PR 
- Unifies the multiple actor died error to a single schema. (cannot unify runtime env or creation task exception)
- Improve each of actor error message to include more metadata.
- Include actor information to actor death cause.
`PublisherClient` is a more reasonable name than `SubscriberClient` since XClient means ‘client used to access X’, like GcsClient.

Besides, in the current codebase we already called this client `publisher_client`(line 329/333), while the actual class name is `SubscriberClient`, this is inconsistent.
https://github.com/ray-project/ray/blob/a8d7897a56d8dc2d2af550adee8cd5399c39dfb9/src/ray/pubsub/subscriber.cc#L326-L339
Alex Wu and others added 3 commits January 25, 2022 14:43
This PR moves the sdk to its own folder, then includes everything in `import ray.autoscaler.sdk` in ray's import path. 

Note: that there were circular dependencies in naively doing this because the ray core now uses constants that were defined in the autoscaler for internal kv operations (and the autoscaler similarly calls into the ray core). The solution was to move those internal kv keys into ray core constants so the imports flow (more) one way.

Co-authored-by: Alex Wu <[email protected]>
There are some auto-generated streaming files, which are not removed. This PR removes them totally.

Co-authored-by: 林濯 <[email protected]>
@krfricke
Copy link
Contributor

Generally yes, but the linter did not pass (you'd have to run ./scripts/format.sh).

I had to make some changes to wandb today and incorporated your changes in the PR at #21892 - I'll add you as an coauthor to the commit and will close this once #21892 is merged, ok? No further action required on your part :-)
Thanks for contributing!

# Conflicts:
#	python/ray/tune/integration/wandb.py
@Queimo
Copy link
Author

Queimo commented Jan 26, 2022

Ah I see, yes that's great! Sorry for the extra work, this was actually my first PR.

@krfricke krfricke closed this Jan 26, 2022
@krfricke
Copy link
Contributor

I've also messed up the git merge here just now, so let's use #21892 :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.