Update long running stress tests and add actor death test. #4275
Test FAILed.
Do the RLlib tests still run? I think they are configured assuming a certain number of cores.
@ericl yep, the tests all still seem to run. I've been running all 7 of them for 5 hours and the RLlib/Tune ones are still going strong (on m5.xlarge).
Looks good then, this is a nice cost savings.
Test FAILed.
This converts @stephanie-wang's actor fault tolerance test to a long running workload. This also changes the instance type to m5.xlarge (4 vCPUs).