Tidy custom resources #336

Conversation

@DmitriGekhtman (Collaborator) commented Jul 3, 2022

Why are these changes needed?

Background

Closes #167

The ray start command has a resources option which allows users to specify custom resource capacities for a Ray node.
Custom resources serve as hints to the Ray scheduler and autoscaler: If a Ray node has custom resource {"BLAH": 10}, you can schedule 10 tasks decorated with @ray.remote(resources={"BLAH": 1}). Attempting to schedule an 11th such task will cause the autoscaler to bring up another Ray node to accommodate the request.

One way to use the resources field in the context of KubeRay is described in #167:

A use case for this might be to deploy a heterogeneous cluster with multiple worker groups, where each worker group uses a different image packaged with different third-party utilities. Some tasks that require specific utilities could then be marked as requiring the corresponding resource, and would only execute on the workers that provide it.

Problem

The value of resources must be supplied as a json string, which makes it awkward to specify correctly in a yaml file:

rayStartParams:
   resources: "'{\"customRes\": 1, \"anotherOne\": 2}'"
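Two layers of quoting are at work here: the JSON object is wrapped in single quotes so it reaches ray start as one argument, and the double quotes inside the JSON must then be escaped for the YAML string. A minimal Python sketch of how such a value could be generated (the helper name to_ray_start_resources is ours for illustration, not part of KubeRay):

```python
import json

def to_ray_start_resources(resources: dict) -> str:
    """Render a resource map as the string expected by
    rayStartParams["resources"]: a JSON object wrapped in
    single quotes so ray start receives it as one argument."""
    return "'" + json.dumps(resources) + "'"

print(to_ray_start_resources({"customRes": 1, "anotherOne": 2}))
# '{"customRes": 1, "anotherOne": 2}'
```

In a YAML file, the entire result must additionally be double-quoted with the inner double quotes backslash-escaped, which is exactly what makes the format awkward to write by hand.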

Solution

To provide a more convenient interface, this PR introduces an optional map field rayCustomResources.

Before:

rayStartParams:
   resources: "'{\"customRes\": 1, \"anotherOne\": 2}'"

After:

rayCustomResources:
    customRes: 1
    anotherOne: 2

Properties of the change in this PR

  • Purely cosmetic. Simply allows specifying Ray custom resources as a map rather than as a carefully quoted/escaped JSON string.
    • We can consider deeper integration with K8s extended resources later.
  • Backwards-compatible -- specifying rayStartParams["resources"] will continue to work.

An upcoming Ray PR would update the autoscaler to read the newly introduced field when determining node resource capacities.

Note on CPU/GPU/memory

The resources option for ray start does not support the keys CPU, GPU, or memory. These fields are inferred from the container spec and can be overridden with the optional ray start parameters num-cpus, num-gpus, memory.

Attempting to supply the keys CPU, GPU, or memory to resources will cause ray start to fail with an error message telling the user not to do that.
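For illustration, overriding the inferred capacities would look something like this sketch (the values are hypothetical; rayStartParams values are strings, and memory is given in bytes):

```yaml
rayStartParams:
   num-cpus: "2"
   num-gpus: "1"
   memory: "4000000000"
```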

I considered adding CRD validation to exclude these keys. However, Kubernetes does not support "patternProperties" in CRDs -- you can't use a regex to restrict the keys of a map/object. A potential workaround would be to use a list of key, value pairs and add validation for key. However, that would make the interface and the code more complicated.

We already do some validation of rayStartParams in the operator code.
TODO in a follow-up PR: Validate absence of CPU, GPU, memory keys in the same spot, after #334 is merged.
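The planned validation could look something like the following sketch. This is purely illustrative (the operator itself is written in Go, and the function name here is ours); the disallowed-key set matches the field's doc comment below.

```python
# Keys reserved by ray start; users should set num-cpus / num-gpus /
# memory in rayStartParams instead of listing them as custom resources.
DISALLOWED_KEYS = {"CPU", "GPU", "memory", "object_store_memory"}

def validate_ray_custom_resources(resources: dict) -> None:
    """Raise ValueError if the custom resource map contains a key
    that ray start reserves for built-in resource types."""
    bad = DISALLOWED_KEYS & set(resources)
    if bad:
        raise ValueError(
            f"rayCustomResources may not contain {sorted(bad)}; "
            "use rayStartParams num-cpus/num-gpus/memory instead."
        )
```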

Autoscaler

A follow-up Ray PR would have the autoscaler consider the rayCustomResources field when determining resource capacities.

Related issue number

Closes #167

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@@ -329,8 +331,6 @@ github.com/mitchellh/go-homedir v1.0.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrk
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/go-testing-interface v1.0.0/go.mod h1:kRemZodwjscx+RGhAo8eIhFbs2+BFgRtFPeD/KE+zxI=
github.com/mitchellh/gox v0.4.0/go.mod h1:Sd9lOJ0+aimLBi73mGofS1ycjY8lL3uZM3JPS42BGNg=
github.com/mitchellh/hashstructure/v2 v2.0.2 h1:vGKWl0YJqUNxE8d+h8f6NJLcCJrgbhC4NcD46KavDd4=
github.com/mitchellh/hashstructure/v2 v2.0.2/go.mod h1:MG3aRVU/N29oo/V/IhBX8GR/zz4kQkprJgF2EVszyDE=
Collaborator Author

Looks like this one was unused and got removed by go mod tidy.

// capacities of Ray pods in this group.
// Equivalent to RayStartParams.resources, which must be specified as a json string.
// Disallowed keys: "GPU", "CPU", "memory", "object_store_memory". Instead, use rayStartParams["num-cpus"] etc.
RayCustomResources map[string]int64 `json:"rayCustomResources,omitempty"`
Collaborator

I am thinking that in real-world cases we normally need to claim the resource twice (at the Kubernetes level and at the Ray level). For example, if I need an inference chip like aws.amazon.com/neuron: 1:

Kubernetes-level configuration

resources:
    limits:
       aws.amazon.com/neuron: 1

Ray-level configuration

rayCustomResources:
     aws.amazon.com/neuron: 1

Is that expected? I am not sure whether the resource names are always the same between Kubernetes and Ray. If they are, we could probably use the Kubernetes resources as the source of truth and build the Ray resource configuration from them. If not, I think the separation is necessary. (I kind of forget our conclusion from the last community meeting.)

@DmitriGekhtman (Collaborator Author) commented Jul 4, 2022

For now, I'd prefer to keep the configuration separate -- this PR isn't introducing new behavior, just cleaning up the existing interface.
rayCustomResources is an alias for rayStartParams["resources"]: you use a map instead of a triply-quoted JSON string.

For the use case described in #167 (multiple image types), there is no need to set up a Kubernetes extended resource.
The Ray resource annotation is just used as a hint to the Ray scheduler to schedule certain tasks on workers using certain images.
For use-cases like this, it's too heavy-weight to require the user to patch nodes with extended resource capacities (or run a daemonset to do the same).

In the other direction, if your Ray pod already uses K8s extended resources, it could be useful for Ray to automatically detect those as "custom resources". Right now, we do that only with "/gpu" extended resources, but we might want to generalize the behavior. We could make this generalization later, if that's desired by users.

@DmitriGekhtman (Collaborator Author) commented Jul 5, 2022

Summarizing my opinion:

  • If a pod uses a K8s extended resource, it may be useful to automatically advertise it to Ray as a "custom resource". (But we're not doing that in this PR.)

  • Advertising a "custom resource" to Ray should not require labelling K8s nodes with extended resources.

@Jeffwan (Collaborator) commented Jul 5, 2022

The summary looks good to me. We can do it stage by stage.

@Jeffwan (Collaborator) commented Jul 5, 2022

The change looks good to me. Please rebase onto the latest master.

@akanso (Collaborator) commented Jul 5, 2022

@DmitriGekhtman do we have a use-case where the RayCustomResource is not a physical HW on the node (like a TPU or NPU)?

@DmitriGekhtman (Collaborator Author)

@DmitriGekhtman do we have a use-case where the RayCustomResource is not a physical HW on the node (like a TPU or NPU)?

Yeah, this one #167.

@DmitriGekhtman (Collaborator Author) commented Jul 5, 2022

Paraphrasing @pcmoritz from offline:
As we're a bit unsure where we're going with resource configuration, it may be best to leave things as they are and simply document the format "'{\"customRes\": 1, \"anotherOne\": 2}'".

I think this probably makes sense -- the configuration is a little awkward, but the functionality is already present.

@DmitriGekhtman (Collaborator Author)
@DmitriGekhtman do we have a use-case where the RayCustomResource is not a physical HW on the node (like a TPU or NPU)?

Basically, any time you have multiple worker groups and you want certain tasks to be scheduled on a specific worker group.
The reason for scheduling on a given node group may or may not be related to physical hardware.

@DmitriGekhtman (Collaborator Author)
Will close for now and consider reopening if there are further complaints about the format.

@DmitriGekhtman (Collaborator Author)
@akanso as a related item, we can consider automatically advertising K8s extended resources to Ray by inserting them into the relevant rayStartParams entry.

Successfully merging this pull request may close these issues.

[Feature] Users should be able to define custom resources in worker groups