[Perf] Add a CPU-based training workload #2116

Merged
merged 2 commits into from
May 3, 2024

kevin85421
Member

@kevin85421 kevin85421 commented May 3, 2024

Why are these changes needed?

Our scalability test requires some CPU-based workloads. This PR trains an MNIST model on CPU. I was also able to run the RayJob successfully on my devbox.

Screenshot 2024-05-03 at 10 48 05 AM

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 kevin85421 marked this pull request as ready for review May 3, 2024 19:59
@kevin85421
Member Author

cc @andrewsykim



if __name__ == "__main__":
    train_fashion_mnist(num_workers=4)
Collaborator

Thoughts on making num_workers configurable with an environment variable? That would make it easier to tweak in the perf tests.

Collaborator

Same with CPUs for resources_per_worker
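
A minimal sketch of what the two suggestions above could look like, not the change that was eventually made. The environment variable names `NUM_WORKERS` and `CPUS_PER_WORKER`, the helper names, and the default of 1 CPU per worker are all assumptions for illustration; only the default of 4 workers comes from the snippet under review. The sketch deliberately avoids importing Ray, so it only builds the values that a caller could pass on to the training function.

```python
import os


def read_positive_int(name: str, default: int) -> int:
    """Read a positive integer from the environment, or return the default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError if the value is not an integer
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


def scaling_settings_from_env() -> dict:
    """Collect worker count and per-worker CPUs from the environment.

    NUM_WORKERS and CPUS_PER_WORKER are hypothetical variable names.
    The default of 4 workers matches the hard-coded value in the PR;
    the default of 1 CPU per worker is an assumption.
    """
    return {
        "num_workers": read_positive_int("NUM_WORKERS", 4),
        "resources_per_worker": {
            "CPU": read_positive_int("CPUS_PER_WORKER", 1)
        },
    }


if __name__ == "__main__":
    # Print the resolved settings so a perf test can log what it ran with.
    print(scaling_settings_from_env())
```

With something like this in place, a perf test could change the worker count by setting the variable in the job's environment instead of editing the script.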

Member Author

Good idea

Member Author

I will ask others to work on it.

@andrewsykim
Collaborator

LGTM! Tested this locally and confirmed it works. Thanks @kevin85421!

@kevin85421 kevin85421 merged commit c099de4 into ray-project:master May 3, 2024
24 checks passed
3 participants