Add Ray to Cluster computing #1170

Merged
merged 1 commit into from
Feb 18, 2019
jmargeta
Contributor

What is this Python project?

Ray is a flexible, high-performance distributed execution framework.

  • Achieves parallelism in Python with a simple and consistent API.
  • Particularly suited for machine learning (agnostic to the machine learning library of choice).
  • Forms the base of Ray's own distributed libraries for deep and reinforcement learning, processing of Pandas dataframes, and hyperparameter search.
  • Uses Plasma as its object store, which allows large NumPy arrays (or objects serializable with Apache Arrow) to be shared efficiently between processes, avoiding unnecessary data copies and requiring only minimal deserialization.

What's the difference between this Python project and similar ones?

  • Lower overhead and latency than similar projects, thanks to bottom-up scheduling
  • Actors, which allow mutable state to be shared between tasks
  • Similar to Dask; for more details, see the comparison in ray-project/ray#642 (Comparison to dask)

--

Anyone who agrees with this pull request could vote for it by adding a 👍 to it, and usually, the maintainer will merge it when votes reach 20.

# What is this Python project?
Ray is a flexible, high-performance distributed execution framework. It achieves parallelism in Python with a simple and consistent API.
Ray is particularly suited for machine learning and forms the base of libraries for deep and reinforcement learning, distributed processing of Pandas dataframes, and hyperparameter search.

# What's the difference between this Python project and similar ones?
 - Similar to Dask; see the comparison in ray-project/ray#642
 - Allows large NumPy arrays (or objects serializable with Arrow) to be shared efficiently between processes, without copying the data and with only minimal deserialization
 - Achieves lower latency with bottom-up scheduling
@pcmoritz

This looks great, shall we merge this?

@DMITRIY1988

A decent buddy usually lives in peace, and life is decent too, and if he tosses me some money, I'll whip up an even cooler rhyme #1170

@robertnishihara

@vinta, would you be willing to take a look at this? I think this would be a great addition to the cluster computing section.

@vinta vinta merged commit a10314a into vinta:master Feb 18, 2019

5 participants