25 Sep 00:07

ericl

6da7eff

ray-0.7.5

Ray 0.7.5 Release Notes

Ray API

Objects created with ray.put() are now reference counted. #5590
Add internal pin_object_data() API. #5637
Initial support for pickle5. #5611
Warm up Ray on ray.init(). #5685
The redis_address argument of ray.init is now named address. #5602

Core

Progress towards a common C++ core worker. #5516, #5272, #5566, #5664
Fix log monitor stall with many log files. #5569
Print warnings when tasks are unschedulable. #5555
Take into account resource queue lengths when autoscaling. #5702, #5684

Tune

TF2.0 TensorBoard support. #5547, #5631
tune.function() is now deprecated. #5601

RLlib

Enhancements for TF eager support. #5625, #5683, #5705
Fix DDPG regression. #5626

Other Libraries

Complete rewrite of experimental serving library. #5562
Progress toward Ray projects APIs. #5525, #5632, #5706
Add TF SGD implementation for training. #5440
Many documentation improvements and bugfixes.

05 Sep 23:11

pcmoritz

ray-0.7.4

dcff263

ray-0.7.4

Ray 0.7.4 Release Notes

Highlights

There were many documentation improvements (#5391, #5389, #5175). As we continue to improve the documentation we value your feedback through the “Doc suggestion?” link at the top of the documentation. Notable improvements:
- We’ve added guides for best practices using TensorFlow and PyTorch.
- We’ve revamped the Walkthrough page for Ray users, providing a better experience for beginners.
- We’ve revamped guides for using Actors and inspecting internal state.
Ray now supports memory limits to ensure memory-intensive applications run predictably and reliably. You can activate them through the ray.remote decorator:
```
@ray.remote(
    memory=2000 * 1024 * 1024,
    object_store_memory=200 * 1024 * 1024)
class SomeActor(object):
    def __init__(self, a, b):
        pass
```
You can set limits for the heap and the object store; see the documentation.
There is now preliminary support for projects; see the project documentation. Projects allow you to package your code and easily share it with others, ensuring a reproducible cluster setup. To get started, you can run:
```
# Create a new project.
ray project create <project-name>
# Launch a session for the project in the current directory.
ray session start
# Open a console for the given session.
ray session attach
# Stop the given session and all of its worker nodes.
ray session stop
```
Check out the examples. This is an actively developed new feature so we appreciate your feedback!

Breaking change: The redis_address parameter was renamed to address (#5412, #5602) and the former will be removed in the future.
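
A rename like this is often staged by accepting both keyword names for a while and warning on the old one. The sketch below illustrates that pattern in plain Python. The `connect` function and the `localhost:6379` address are hypothetical placeholders; this is not Ray's actual `ray.init` implementation.

```python
import warnings


def connect(address=None, redis_address=None):
    """Hypothetical stand-in for an init function; not Ray's real code."""
    if redis_address is not None:
        if address is not None:
            raise ValueError("Pass only one of address and redis_address.")
        # Warn on the old keyword, then fall back to its value.
        warnings.warn(
            "redis_address is deprecated; use address instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        address = redis_address
    return address


# New-style call: no warning is emitted.
new_style = connect(address="localhost:6379")

# Old-style call: still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = connect(redis_address="localhost:6379")
```

Callers that already pass `address` are unaffected, while callers still using `redis_address` keep working and see a warning until the old keyword is removed.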

Core

Move Java bindings on top of the core worker #5370
Improve log file discoverability #5580
Clean up and improve error messages #5368, #5351

RLlib

Support custom action space distributions #5164
Add TensorFlow eager support #5436
Add autoregressive KL #5469
Autoregressive Action Distributions #5304
Implement MADDPG agent #5348
Port Soft Actor-Critic on Model v2 API #5328
More examples: Add CARLA community example #5333 and rock paper scissors multi-agent example #5336
Moved RLlib to top level directory #5324

Tune

Experimental Implementation of the BOHB algorithm #5382
Breaking change: Nested dictionary results are now flattened for CSV writing: {"a": {"b": 1}} => {"a/b": 1} #5346
Add Logger for MLFlow #5438
TensorBoard support for TensorFlow 2.0 #5547
Added examples for XGBoost and LightGBM #5500
HyperOptSearch now has warmstarting #5372
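
The flattening of nested results for CSV writing mentioned in the list above can be sketched in a few lines of plain Python. The `flatten_dict` helper is a hypothetical illustration of the `{"a": {"b": 1}} => {"a/b": 1}` transformation, not Tune's own code.

```python
def flatten_dict(nested, delimiter="/"):
    """Flatten nested dicts by joining keys with a delimiter."""
    flat = {}
    for key, value in nested.items():
        if isinstance(value, dict):
            # Recurse, then prefix every child key with the parent key.
            for sub_key, sub_value in flatten_dict(value, delimiter).items():
                flat[f"{key}{delimiter}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


result = flatten_dict({"a": {"b": 1}, "c": 2})
print(result)  # {'a/b': 1, 'c': 2}
```

Code that reads the CSV output by column name would need to look up `a/b` rather than index into a nested `a` entry.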

Other Libraries

SGD: Tune interface for PyTorch MultiNode SGD #5350
Serving: The old version of ray.serve was deprecated #5541
Autoscaler: Fix ssh control path limit #5476
Dev experience: Ray CI tracker online at https://ray-travis-tracker.herokuapp.com/

Various fixes: Fix log monitor issues (#4382, #5221, #5569) and clean up the top-level ray directory (#5404).

Thanks

We thank the following contributors for their amazing contributions:

@jon-chuang, @lufol, @adamochayon, @idthanm, @RehanSD, @ericl, @michaelzhiluo, @nflu, @pengzhenghao, @hartikainen, @wsjeon, @raulchen, @TomVeniat, @layssi, @jovany-wang, @llan-ml, @ConeyLiu, @mitchellstern, @gregSchwartz18, @jiangzihao2009, @jichan3751, @mhgump, @zhijunfu, @micafan, @simon-mo, @richardliaw, @stephanie-wang, @edoakes, @akharitonov, @mawright, @robertnishihara, @lisadunlap, @flying-mojo, @pcmoritz, @jredondopizarro, @gehring, @holli, @kfstorm

04 Aug 02:37

simon-mo

ray-0.7.3

e4854b1

ray-0.7.3

Ray 0.7.3 Release Note

Highlights

The RLlib ModelV2 API is ready to use. It improves support for Keras and RNN models and allows object-oriented reuse of variables. The ModelV1 API is deprecated. No migration is needed.

ray.experimental.sgd.pytorch.PyTorchTrainer is ready for early adopters. Check out the documentation. We welcome your feedback!

model_creator = lambda config: YourPyTorchModel()
data_creator = lambda config: (YourTrainingSet(), YourValidationSet())

trainer = PyTorchTrainer(
    model_creator,
    data_creator,
    optimizer_creator=utils.sgd_mse_optimizer,
    config={"lr": 1e-4},
    num_replicas=2,
    resources_per_replica=Resources(num_gpus=1),
    batch_size=16,
    backend="auto")

for i in range(NUM_EPOCHS):
    trainer.train()

You can query all the clients that have performed ray.init to connect to the current cluster with ray.jobs(). #5076

>>> ray.jobs()
[{'JobID': '02000000',
  'NodeManagerAddress': '10.99.88.77',
  'DriverPid': 74949,
  'StartTime': 1564168784,
  'StopTime': 1564168798},
 {'JobID': '01000000',
  'NodeManagerAddress': '10.99.88.77',
  'DriverPid': 74871,
  'StartTime': 1564168742}]
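
Records like these are plain dictionaries, so they can be post-processed without any extra API. The sketch below copies the sample output above and assumes that `StartTime` and `StopTime` are Unix timestamps in seconds and that a record without `StopTime` belongs to a driver that is still connected. The `summarize` helper is hypothetical.

```python
# Sample records copied from the ray.jobs() output shown above.
jobs = [
    {"JobID": "02000000", "NodeManagerAddress": "10.99.88.77",
     "DriverPid": 74949, "StartTime": 1564168784, "StopTime": 1564168798},
    {"JobID": "01000000", "NodeManagerAddress": "10.99.88.77",
     "DriverPid": 74871, "StartTime": 1564168742},
]


def summarize(job_records):
    """Split records into finished jobs (with duration) and running ones."""
    finished, running = {}, []
    for job in job_records:
        if "StopTime" in job:
            # Duration in seconds, under the Unix-timestamp assumption.
            finished[job["JobID"]] = job["StopTime"] - job["StartTime"]
        else:
            running.append(job["JobID"])
    return finished, running


finished, running = summarize(jobs)
```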

Core

Improved memory storage handling. #5143, #5216, #4893
Improved workflow:
- Debugging tool local_mode now behaves more consistently. #5060
- Improved KeyboardInterrupt exception handling; the stack trace is reduced from 115 lines to 22 lines. #5237
Ray core:
- Experimental direct actor call. #5140, #5184
- Improvements to the core worker, the module shared between Python and Java. #5079, #5034, #5062
- GCS (global control store) was refactored. #5058, #5050

RLlib

Finished port of all major RLlib algorithms to builder pattern #5277, #5258, #5249
learner_queue_timeout can be configured for async sample optimizer. #5270
reproducible_seed can be used for reproducible experiments. #5197
Added entropy coefficient decay to IMPALA, APPO and PPO #5043

Tune

Breaking: ExperimentAnalysis is now returned by default from tune.run. To obtain a list of trials, use analysis.trials. #5115
Breaking: Syncing behavior between head and workers can now be customized (sync_to_driver). Syncing behavior (upload_dir) between cluster and cloud is now separately customizable (sync_to_cloud). This changes the structure of the uploaded directory - now local_dir is synced with upload_dir. #4450
Introduce Analysis and ExperimentAnalysis objects. Analysis object will now return all trials in a folder; ExperimentAnalysis is a subclass that returns all trials of an experiment. #5115
Add missing argument tune.run(keep_checkpoints_num=...). Enables only keeping the last N checkpoints. #5117
Trials on failed nodes will be prioritized in processing. #5053
Trial Checkpointing is now more flexible. #4728
Add system performance tracking for gpu, ram, vram, cpu usage statistics - toggle with tune.run(log_sys_usage=True). #4924
Experiment checkpointing is now less frequent and can be controlled with tune.run(global_checkpoint_period=...). #4859
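
The "keep only the last N checkpoints" behavior behind `keep_checkpoints_num` in the list above can be pictured as a sliding window. The `CheckpointWindow` class and the `checkpoint_<step>` names below are hypothetical and only illustrate the idea; this is not how Tune implements it.

```python
from collections import deque


class CheckpointWindow:
    """Hypothetical helper: remember only the newest `keep_num` checkpoints."""

    def __init__(self, keep_num):
        self._kept = deque()
        self._keep_num = keep_num

    def add(self, path):
        """Record a new checkpoint and return the paths that should be deleted."""
        self._kept.append(path)
        evicted = []
        while len(self._kept) > self._keep_num:
            # Oldest checkpoints fall out of the window first.
            evicted.append(self._kept.popleft())
        return evicted

    def kept(self):
        return list(self._kept)


window = CheckpointWindow(keep_num=2)
deleted = []
for step in range(1, 5):
    deleted.extend(window.add(f"checkpoint_{step}"))
```

After four checkpoints with a window of two, the two oldest have been returned for deletion and the two newest remain.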

Autoscaler

Add a request_cores function for manual autoscaling. You can now manually request resources for the autoscaler. #4754
Local cluster:
- More readable example yaml with comments. #5290
- Multiple cluster names are supported. #4864
Improved logging with AWS NodeProvider. The create_instance call is now logged. #4998

Other Libraries

SGD:
- Example for Training. #5292
- Deprecate old distributed SGD implementation. #5160
Kubernetes: Ray namespace added for k8s. #4111
Dev experience: Add linting pre-push hook. #5154

Thanks

We thank the following contributors for their amazing contributions:

@joneswong, @1beb, @richardliaw, @pcmoritz, @raulchen, @stephanie-wang, @jiangzihao2009, @LorenzoCevolani, @kfstorm, @pschafhalter, @micafan, @simon-mo, @vipulharsh, @haje01, @ls-daniel, @hartikainen, @stefanpantic, @edoakes, @llan-ml, @alex-petrenko, @ztangent, @gravitywp, @MQQ, @Dulex123, @morgangiraud, @antoine-galataud, @robertnishihara, @qxcv, @vakker, @jovany-wang, @zhijunfu, @ericl

03 Jul 05:57

simon-mo

ray-0.7.2

6e6cbb6

ray-0.7.2

Core

Improvements
- Continue moving the worker code to C++. #5031, #4966, #4922, #4899, #5032, #4996, #4875
- Add a hash table data structure to the Redis modules. #4911
- Use gRPC for communication between node managers. #4968, #5023, #5024
Python
- @ray.remote now inherits the function docstring. #4985
- Remove typing module from setup.py install_requirements. #4971
Java
- Allow users to set JVM options at actor creation time. #4970
Internal
- Refactor IDs: DriverID -> JobID, change all ID functions to camel case. #4964, #4896
- Improve organization of directory structure. #4898
Performance
- Get task object dependencies in parallel from object store. #4775
- Flush lineage cache on task submission instead of execution. #4942
- Remove debug check for uncommitted lineage. #5038
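
The docstring-inheritance item in the Python list above is the kind of behavior that `functools.wraps` provides for ordinary decorators. The sketch below shows that general technique with a hypothetical `remote_like` decorator; it is not Ray's implementation of `@ray.remote`.

```python
import functools


def remote_like(func):
    """Hypothetical decorator standing in for a remote-function wrapper."""

    @functools.wraps(func)  # copies __doc__, __name__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@remote_like
def add(x, y):
    """Return the sum of x and y."""
    return x + y
```

With the metadata copied over, `help(add)` and documentation tools show the original function's docstring instead of the wrapper's.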

Tune

Add directional metrics for components. #4120, #4915
Disallow setting resources_per_trial when it is already configured. #4880
Make PBT Quantile fraction configurable. #4912

RLlib

Add QMIX mixer parameters to optimizer param list. #5014
Allow Torch policies access to full action input dict in extra_action_out_fn. #4894
Allow access to batches prior to postprocessing. #4871
Throw error if sample_async is used with PyTorch for A3C. #5000
Patterns & User Experience
- Rename PolicyEvaluator => RolloutWorker. #4820
- Port remainder of algorithms to build_trainer() pattern. #4920
- Port DQN to build_tf_policy() pattern. #4823
Documentation
- Add docs on how to use TF eager execution. #4927
- Add preprocessing example to offline documentation. #4950

Other Libraries

Add support for distributed training with PyTorch. #4797, #4933
Autoscaler will kill workers on exception. #4997
Fix handling of non-integral timeout values in signal.receive. #5002

Thanks

We thank the following contributors for their amazing contributions: @jiangzihao2009, @raulchen, @ericl, @hershg, @kfstorm, @kiddyboots216, @jovany-wang, @pschafhalter, @richardliaw, @robertnishihara, @stephanie-wang, @simon-mo, @zhijunfu, @ls-daniel, @ajgokhale, @rueberger, @suquark, @guoyuhong, @pcmoritz, @hartikainen, @timonbimon, @TianhongDai

23 Jun 21:35

richardliaw

ray-0.7.1

bc3b6ef

ray-0.7.1

Core

Change global state API. #4857
- ray.global_state.client_table() -> ray.nodes()
- ray.global_state.task_table() -> ray.tasks()
- ray.global_state.object_table() -> ray.objects()
- ray.global_state.chrome_tracing_dump() -> ray.timeline()
- ray.global_state.cluster_resources() -> ray.cluster_resources()
- ray.global_state.available_resources() -> ray.available_resources()
Export remote functions lazily. #4898
Begin moving worker code to C++. #4875, #4899, #4898
Upgrade arrow to latest master. #4858
Upload wheels to S3 under <branch-name>/<commit-id>. #4949
Add hash table to Redis-Module. #4911
Initial support for distributed training with PyTorch. #4797
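
The global state renames listed above lend themselves to a mechanical search-and-replace. The sketch below is a hypothetical migration helper built from that list. It uses plain text replacement, so it only rewrites calls written with empty parentheses exactly as shown, and calls that pass arguments would need manual attention.

```python
# Old call -> new call, as listed in the release notes above.
RENAMES = {
    "ray.global_state.client_table()": "ray.nodes()",
    "ray.global_state.task_table()": "ray.tasks()",
    "ray.global_state.object_table()": "ray.objects()",
    "ray.global_state.chrome_tracing_dump()": "ray.timeline()",
    "ray.global_state.cluster_resources()": "ray.cluster_resources()",
    "ray.global_state.available_resources()": "ray.available_resources()",
}


def migrate(source):
    """Rewrite old global state calls in a string of source code."""
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source


snippet = (
    "nodes = ray.global_state.client_table()\n"
    "ray.global_state.chrome_tracing_dump()"
)
migrated = migrate(snippet)
```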

Tune

Disallow setting resources_per_trial when it is already configured. #4880
Initial experiment tracking support. #4362

RLlib

Begin deprecating Python 2 support in RLlib. #4832
TensorFlow 2 compatibility. #4802
Allow Torch policies access to full action input dict in extra_action_out_fn. #4894
Allow access to batches prior to postprocessing. #4871
Port algorithms to build_trainer() pattern. #4823
Rename PolicyEvaluator -> RolloutWorker. #4820
Rename PolicyGraph -> Policy, move from evaluation/ to policy/. #4819
Support continuous action distributions in IMPALA/APPO. #4771

(Revision: 6/23/2019 - Accidentally included commits that were not part of the release.)

18 May 22:13

devin-petersohn

ray-0.7.0

1490a98

ray-0.7.0

Core

Backend bug fixes. #4766, #4763, #4605
Add experimental API for creating resources at runtime. #3742

Tune

Post-Experiment Tools. #4351
Add Ax to Tune. #4731
Tune bug fixes. #4733, #4659, #4747

RLlib

Remove dependency on TensorFlow. #4764
TD3/DDPG improvements and MuJoCo benchmarks. #4694
Evaluation mode implementation for rllib.Trainer class. #4647
Replace ray.get() with ray_get_and_free() to automatically free object store memory. #4586
RLlib bug fixes. #4736, #4735, #4652, #4630

Autoscaler

Add an aggressive autoscaling flag. #4285
Autoscaler bug fixes. #4782, #4653

19 Apr 05:47

devin-petersohn

ray-0.6.6

618147f

ray-0.6.6

Core

Add delete_creating_tasks option for internal.free() #4588

Tune

Add filter flag for Tune CLI. #4337
Better handling of tune.function in global checkpoint. #4519
Add compatibility to nevergrad 0.2.0+. #4529
Add --columns flag for CLI. #4564
Add checkpoint eraser. #4490
Fix checkpointing for Gym types. #4619

RLlib

Report sampler performance metrics. #4427
Ensure stats are consistently reported across all algos. #4445
Cleanup TFPolicyGraph. #4478
Make batch timeout for remote workers tunable. #4435
Fix inconsistent weight assignment operations in DQNPolicyGraph. #4504
Add support for LR schedule to DQN/APEX. #4473
Add option for RNN state and value estimates to span episodes. #4429
Create a combination of ExternalEnv and MultiAgentEnv, called ExternalMultiAgentEnv. #4200
Support prev_state/prev_action in rollout and fix multiagent. #4565
Support torch device and distributions. #4553

Java

TestNG outputs more verbose error messages. #4507
Implement GcsClient. #4601
Avoid unnecessary memory copy and add a benchmark. #4611

Autoscaler

Add support for separate docker containers on head and worker nodes. #4537
Add an aggressive autoscaling flag. #4285

25 Mar 21:18

robertnishihara

ray-0.6.5

01747b1

ray-0.6.5

Core

Build system fully converted to Bazel. #4284, #4280, #4281
Introduce a set data structure in the GCS. #4199
Make all arguments to _remote() optional. #4305
Improve object transfer latency by setting TCP_NODELAY on all TCP connections. #4318
Add beginning of experimental serving module. #4095
Remove Jupyter notebook based UI. #4301
Add ray timeline command-line command for dumping a Chrome trace. #4239
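
For readers unfamiliar with the TCP_NODELAY option mentioned in the list above, the sketch below shows how it is set on a socket using Python's standard library. It illustrates the socket option in general and does not reproduce Ray's own code.

```python
import socket

# Create a TCP socket; no connection is needed to set the option.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle's algorithm so small writes are sent without being
# held back to coalesce with later data.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

The option trades a little bandwidth efficiency for lower latency on small messages, which is consistent with the latency goal stated in the release note.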

Tune

Add custom field for serializations. #4237
Begin adding Tune CLI. #3983, #4321, #4322
Add optimization to reuse actors. #4218
Add warnings if the Tune event loop gets clogged. #4353
Switch preferred API from tune.run_experiments to tune.run. #4234
Make the logging from the function API consistent and predictable. #4011

RLlib

Breaking: Flip sign of entropy coefficient in A2C and Impala. #4374
Add option to continue training even if some workers crash. #4376
Add asynchronous remote workers. #4253
Add callback accessor for raw observations. #4212

Java

Improve single-process mode. #4245, #4265
Package native dependencies into jar. #4367
Initial support for calling Python functions from Java. #4166

Autoscaler

Restore error messages for setup errors. #4388

Known Issues

Object broadcasts on large clusters are inefficient. #2945

06 Mar 01:03

pcmoritz

ray-0.6.4

fa8c07d

ray-0.6.4

Breaking

Removed redirect_output and redirect_worker_output from ray.init, removed deprecated _submit method. #4025
Move TensorFlowVariables to ray.experimental.tf_utils. #4145

Core

Stream worker logging statements to driver by default. #3892
Added experimental ray signaling mechanism, see the documentation. #3624
Make Bazel the default build system. #3898
Preliminary experimental streaming API for Python. #4126
Added web dashboard for monitoring node resource usage. #4066
Improved propagation of backend errors to user. #4039
Many improvements for the Java frontend. #3687, #3978, #4014, #3943, #3839, #4038, #4039, #4063, #4100, #4179, #4178
Support for dataclass serialization. #3964
Implement actor checkpointing. #3839
First steps toward cross-language invocations. #3675
Better defaults for Redis memory usage. #4152

Tune

Breaking: Introduce ability to turn off default logging. Deprecates custom_loggers. #4104
Support custom resources. #2979
Add initial parameter suggestions for HyperOpt. #3944
Add scipy-optimize to Tune. #3924
Add Nevergrad. #3985
Add number of trials to the trial runner logger. #4068
Support RESTful API for the webserver. #4080
Local mode support. #4138
Dynamic resources for trials. #3974

RLlib

Basic infrastructure for off-policy estimation. #3941
Add simplex action space and Dirichlet action distribution. #4070
Exploration with parameter space noise. #4048
Custom supervised loss API. #4083
Add torch policy gradient implementation. #3857

Autoscaler and Cluster Setup

Add docker run option (e.g. to support nvidia-docker). #3921

Modin

Upgrade Modin to 0.3.1, see the release notes. #4058

Known Issues

Object broadcasts on large clusters are inefficient. #2945
IMPALA is broken #4329

06 Mar 00:11

stephanie-wang

ray-0.6.3

d2b6db3

ray-0.6.3

Core

Initial work on porting the build system to Bazel. #3918, #3806, #3867, #3842
Allow starting Ray processes inside valgrind, gdb, tmux. #3824, #3847
Stability improvements and bug fixes. #3861, #3962, #3958, #3855, #3736, #3822, #3821, #3925
Convert Python C extensions to Cython. #3541
ray start can now be used to start Java workers. #3838, #3852
Enable LZ4 compression in pyarrow build. #3931
Update Redis to version 5.0.3. #3886
Use one memory-mapped file for Plasma store. #3871

Tune

Support for BayesOpt. #3864
Support for SigOpt. #3844
Support executing infinite recovery retries for a trial. #3901
Support export_formats option to export policy graphs. #3868
Cluster and logging improvements. #3906

RLlib

Support for Asynchronous Proximal Policy Optimization (APPO). #3779
Support for MARWIL. #3635
Support for evaluation option in DQN. #3835
Bug fixes. #3865, #3810, #3938
Annotations for API stability. #3808

Autoscaler and Cluster Setup

Faster cluster launch and update. #3720
Bug fixes. #3916, #3860, #3937, #3782, #3969
Kubernetes configuration improvements. #3875, #3909

Modin

Update Modin to 0.3.0. #3936
- Modin 0.3.0 release notes

Known Issues

Object broadcasts on large clusters are inefficient. #2945

Releases: ray-project/ray
