[Ray component: Core] Respect the placement group when num_gpus=0, num_cpus=0 #27931

Open
JiahaoYao opened this issue Aug 17, 2022 · 5 comments
Labels
api-bug Bug in which APIs behavior is wrong bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core core-placement-group P2 Important issue, but not time-critical size:medium usability
Milestone

Comments

@JiahaoYao
Contributor

What happened + What you expected to happen

Goal: Ray remote actors and tasks should respect the placement group bundle index even when they request no resources from Ray.

Observation: Currently, a Ray actor does not respect the placement group when no resources are specified. To make it respect the placement group, one has to manually request a tiny nonzero amount of CPU, e.g. num_cpus=1e-4.
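
A minimal sketch of that workaround, for illustration only. The helpers `pinned_options` and `run_pinned` and the default `min_cpus=1e-3` are hypothetical and not part of Ray; the sketch only assembles the same keyword arguments the reproduction script below passes to `.options()`, plus a tiny nonzero CPU request, which according to the observation above is what makes the bundle index take effect.

```python
def pinned_options(pg, bundle_index, min_cpus=1e-3):
    """Return keyword arguments for `.options()` that pin work to one bundle.

    Hypothetical helper, not part of Ray. `min_cpus` is a tiny nonzero CPU
    request; this issue reports that a request of exactly 0 is not pinned.
    """
    if min_cpus <= 0:
        raise ValueError("min_cpus must be > 0 for the workaround to apply")
    return {
        "placement_group": pg,
        "placement_group_bundle_index": bundle_index,
        "num_cpus": min_cpus,
    }


def run_pinned(f, pg, bundle_index, n=10):
    """Launch `n` copies of the Ray remote function `f` in one bundle."""
    import ray  # imported lazily so pinned_options() works without Ray

    opts = pinned_options(pg, bundle_index)
    return ray.get([f.options(**opts).remote() for _ in range(n)])
```

With the placement group from the reproduction script, `run_pinned(f, pg, 0)` and `run_pinned(f, pg, 1)` would be expected to run on different nodes. The CPU request is kept far below the 1 CPU reserved per bundle so that ten concurrent tasks still fit.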

@rkooo567

Versions / Dependencies

Ray 2.0

Reproduction script

import time

import ray
from icecream import ic
from ray._private.services import get_node_ip_address
from ray.util.placement_group import placement_group

# Connect to the running Ray cluster.
ray.init("auto")

# Two identical bundles; STRICT_SPREAD places each bundle on a different node.
bundle1 = {"GPU": 1, "CPU": 1}
bundle2 = {"GPU": 1, "CPU": 1}
pg = placement_group([bundle1, bundle2], strategy="STRICT_SPREAD")

# Wait until the placement group has been created.
ray.get(pg.ready())

ic(ray._private.state.state._available_resources_per_node())


@ray.remote(num_gpus=0, num_cpus=0)
def f():
    print(get_node_ip_address())
    time.sleep(10)
    return True


# Ten tasks pinned to bundle 1, then ten tasks pinned to bundle 0.
ray.get([f.options(placement_group=pg, placement_group_bundle_index=1).remote() for i in range(10)])
ray.get([f.options(placement_group=pg, placement_group_bundle_index=0).remote() for i in range(10)])

Result:

With num_cpus=0, both batches print the same IP address.

@ray.remote(num_gpus=0, num_cpus=1e-3)
def f():
    import time
    print(get_node_ip_address())
    time.sleep(10)
    return True

ray.get([f.options(placement_group=pg, placement_group_bundle_index=1).remote() for i in range(10)])
ray.get([f.options(placement_group=pg, placement_group_bundle_index=0).remote() for i in range(10)])

Result:

With num_cpus=1e-3, the two batches print different IP addresses.
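
The comparison above is done by eye on printed output. A small sketch of how it could be checked programmatically, assuming `f` is changed to return `get_node_ip_address()` instead of printing it (that change and the helper name `bundles_respected` are assumptions, not part of the original script). Since the placement group uses STRICT_SPREAD, each bundle sits on its own node, so a correctly pinned batch should report exactly one IP and the two batches should report different ones.

```python
def bundles_respected(ips_bundle_a, ips_bundle_b):
    """Return True if each batch ran on exactly one node and the nodes differ.

    Hypothetical helper: each argument is the list of node IPs returned by
    the tasks submitted with one placement_group_bundle_index.
    """
    nodes_a, nodes_b = set(ips_bundle_a), set(ips_bundle_b)
    return len(nodes_a) == 1 and len(nodes_b) == 1 and nodes_a != nodes_b


# Placeholder IPs for illustration only, not output from a real cluster.
same_node = bundles_respected(["10.0.0.1"] * 10, ["10.0.0.1"] * 10)
different_nodes = bundles_respected(["10.0.0.1"] * 10, ["10.0.0.2"] * 10)
```

Here `same_node` corresponds to the num_cpus=0 behaviour reported in this issue, and `different_nodes` to the num_cpus=1e-3 behaviour.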

Issue Severity

No response

@JiahaoYao JiahaoYao added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Aug 17, 2022
@rkooo567 rkooo567 added this to the Core Backlog milestone Aug 17, 2022
@rkooo567 rkooo567 added P1 Issue that should be fixed within a few weeks usability and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Aug 29, 2022
@richardliaw richardliaw added the core Issues that should be addressed in Ray Core label Oct 7, 2022
@hora-anyscale
Contributor

Per Triage Sync: @rkooo567 can you confirm if this was fixed?

@rkooo567
Contributor

This is not fixed.

@hora-anyscale hora-anyscale added P2 Important issue, but not time-critical and removed P1 Issue that should be fixed within a few weeks labels Nov 1, 2022
@rkooo567 rkooo567 added P1 Issue that should be fixed within a few weeks Ray 2.3 size:medium and removed P2 Important issue, but not time-critical labels Dec 7, 2022
@rkooo567
Contributor

rkooo567 commented Dec 8, 2022

Should be fixed by 2.3

@cadedaniel
Member

cadedaniel commented Dec 13, 2022

This will fix an issue we observed when measuring dataset preprocessing https://anyscaleteam.slack.com/archives/C041D0LT8ET/p1670628383549839


@rkooo567 rkooo567 added the api-bug Bug in which APIs behavior is wrong label Mar 24, 2023
@rkooo567 rkooo567 removed the Ray 2.5 label Apr 8, 2023
@fishbone fishbone linked a pull request Apr 8, 2023 that will close this issue
8 tasks
@jeremi-eta

Hi, is this issue fixed? And if so, in which version of Ray? Thanks

@jjyao jjyao added P2 Important issue, but not time-critical and removed P1 Issue that should be fixed within a few weeks labels Oct 30, 2024
9 participants