-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] add tmpfs mounting support in rayci #42247
Conversation
5b21cca
to
dd8698e
Compare
interestingly, it seems that it will make |
dd8698e
to
fccd37a
Compare
my guess is that there is an event queue or something, and |
fccd37a
to
ebe3e92
Compare
Are there any downside for using tempfs universally for all linux tests instead of having that as an option? |
@@ -75,6 +75,16 @@ steps: | |||
--test_tag_filters=mem_pressure -- //python/ray/tests/... | |||
job_env: corebuild | |||
|
|||
- label: ":ray: core: out of disk tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing the redis version of the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -247,6 +247,10 @@ def foo(): | |||
except ray.exceptions.RayTaskError as e: | |||
assert isinstance(e.cause, ray.exceptions.OutOfDiskError) | |||
|
|||
# Give it some time for events to appear. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can wrap the validation logic in a wait_for_condition
like this:
ray/python/ray/tests/test_state_api.py
Lines 2307 to 2316 in 583e6fb
def verify(): | |
events = list_cluster_events() | |
print(events) | |
assert len(events) == 1 | |
assert ( | |
"Error: No available node types can fulfill " "resource request" | |
) in events[0]["message"] | |
return True | |
wait_for_condition(verify) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that feels worse. it is mixing result checking and waiting for result coming back, and treating wrong results as acceptable.
waiting for 5 secs is not too bad here for now. will leave to the core team to refactor down the road.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is changed to wait for 2 seconds now fwiw.
in theory, no, but it does not work.. many more tests will fail, and I am not really interested on investigating and fixing all of those. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ebe3e92
to
9475efb
Compare
the redis test fails somehow :( |
9475efb
to
f5ec951
Compare
Signed-off-by: Lonnie Liu <[email protected]>
f5ec951
to
f0bc974
Compare
changed test size, and decreased the wait. seems that redis tests run slower. |
Signed-off-by: Lonnie Liu <[email protected]>
cherrypick #42247 pure test code change. no code change. Signed-off-by: Lonnie Liu <[email protected]>
and run out of disk test with it.
on newer version of docker uses newer version of overlayfs, which calculates free disk space differently and will fail the test. using tmpfs will make the test pass again.