[RFC] Usage Data Collection in Ray #20857

Closed
richardliaw opened this issue Dec 2, 2021 · 9 comments
Labels
RFC RFC issues stale The issue is stale. It will be closed within 7 days unless there are further conversation

Comments

@richardliaw
Contributor

richardliaw commented Dec 2, 2021

Hi all!

I'd like to get your thoughts on the proposal below, which will impact a large number of Ray users.

The Ray team is proposing to enable the collection of usage statistics data in Ray, which will be used to understand how to improve Ray.

Here is the proposal. You will always be able to disable usage data collection and request deletion of usage data, as described in the proposal [link].

As the Ray project and community have grown, the surface area of Ray has increased a lot, and it’s been increasingly difficult for us to understand how to best spend our time and efforts to improve Ray. We are proposing a lightweight usage statistics and data collection mechanism to help us discover and address the most pressing issues.

While many will appreciate the actionable insights this data provides, we also recognize that not everyone wants to send usage data. Again, we will make sure that you can always disable collection and/or request usage data deletion as described in the proposal [link].

Please feel free to give your feedback on this issue or in the #data-collection-feedback channel on the Ray Slack!

Update: We'll keep this RFC open for 2 weeks, from 12/2 to 12/16.

@ArturNiederfahrenhorst
Contributor

I support this kind of data collection in open source projects :)

@davidbuniat

I support it as well. Analyzing edge cases of instability and getting actionable insights will be extremely helpful to make Ray more stable.

@tgaddair
Contributor

tgaddair commented Dec 2, 2021

This will be very informative for better understanding the community. Looking forward to the blog post summarizing the results :).

@dmatrix
Contributor

dmatrix commented Dec 2, 2021

In general, non-intrusive data collection, done in good faith and with the goal of improving the platform and its libraries, is a good idea. Openly sharing the method and intent with the community is a good idea too!

@ArturNiederfahrenhorst
Contributor

> I support it as well. Analyzing edge cases of instability and getting actionable insights will be extremely helpful to make Ray more stable.

Absolutely. I stumbled across 1 or 2 errors that I never reported when I was using RLlib a lot a year ago. I never felt like I had the time to write up an issue because I didn't understand how to reproduce them with minimal code.

@richardliaw richardliaw pinned this issue Dec 2, 2021
@stefanbschneider
Member

Same here, I support and see great potential in collecting usage data - and I agree it's crucial that this remains transparent and optional. Looking forward to the resulting insights and improvements!

@richardliaw
Contributor Author

This proposal has been up for 1 month and has not received any pushback. Thanks all for your input; we'll be proceeding with implementation now!

@ericl ericl added the RFC RFC issues label Mar 5, 2022
@amogkam amogkam unpinned this issue Apr 16, 2022
@rkooo567
Contributor

rkooo567 commented May 31, 2022

Please check https://docs.ray.io/en/master/cluster/usage-stats.html#usage-stats-collection for more details!

@stale

stale bot commented Sep 30, 2022

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public Slack channel.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Sep 30, 2022

8 participants