[Serve] Track user-configured options in Serve deployments #28313

Merged
merged 43 commits into from
Sep 29, 2022

@shrekris-anyscale commented Sep 6, 2022

Signed-off-by: Shreyas Krishnaswamy [email protected]

Why are these changes needed?

Background
Serve sets deployment config defaults directly via keyword arguments in the @serve.deployment decorator. This makes it impossible to track whether config options were set by default or by the user. Distinguishing between the two would let serve build print only the user-configured options, instead of all options (including defaults).

Changes
This change adds a layer of indirection in the @serve.deployment decorator. It sets all the decorator config options to an enum element, DEFAULT.VALUE, by default. This element indicates that the config setting was set by default, not by the user. The Serve DeploymentSchema is also updated to use DEFAULT.VALUE as its default.

This change also introduces a new Set called user_configured_option_names to the DeploymentConfig, which tracks the group of options configured by the user. When Serve is schematized or serialized, it can use the names in this set to indicate which options are default and which ones are user-configured.

serve build now prints only user-configured values (as well as deployment names) to the Serve config file.
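
The mechanism described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Serve's actual decorator: make_config, the two option names shown, and the fallback values are hypothetical stand-ins chosen for the example.

```python
from enum import Enum, auto
from typing import Any, Dict, Optional, Set, TypeVar, Union


class DEFAULT(Enum):
    VALUE = auto()


T = TypeVar("T")
Default = Union[DEFAULT, T]


def make_config(
    num_replicas: Default[Optional[int]] = DEFAULT.VALUE,
    max_concurrent_queries: Default[int] = DEFAULT.VALUE,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the decorator body; not Serve's real code."""
    passed = {
        "num_replicas": num_replicas,
        "max_concurrent_queries": max_concurrent_queries,
    }
    # Anything still equal to the sentinel was not supplied by the caller.
    user_configured_option_names: Set[str] = {
        name for name, value in passed.items() if value is not DEFAULT.VALUE
    }
    # Illustrative fallback values, assumed for this sketch only.
    fallbacks = {"num_replicas": 1, "max_concurrent_queries": 100}
    resolved = {
        name: fallbacks[name] if value is DEFAULT.VALUE else value
        for name, value in passed.items()
    }
    return {
        "options": resolved,
        "user_configured_option_names": user_configured_option_names,
    }


config = make_config(num_replicas=3)
print(config["options"])
print(sorted(config["user_configured_option_names"]))
```

Only num_replicas ends up in the set of user-configured names here, which is the information serve build would need to skip the defaulted option.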

None is no longer the default value for deployment options. The only options that can be set to None (or null in the Serve config) are:

  • route_prefix: None means the deployment should not be exposed over HTTP.
  • num_replicas: None is allowed if an autoscaling_config is provided.
  • autoscaling_config: None is allowed if a num_replicas is provided.
  • user_config: None indicates that there's no user_config to pass into reconfigure.
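
Because None is now a meaningful value rather than the default, a sentinel default lets the code tell three cases apart. The sketch below illustrates this for route_prefix; describe_route_prefix is a hypothetical helper written for this example and is not part of Serve.

```python
from enum import Enum, auto
from typing import Optional, Union


class DEFAULT(Enum):
    VALUE = auto()


def describe_route_prefix(
    route_prefix: Union[DEFAULT, Optional[str]] = DEFAULT.VALUE,
) -> str:
    """Hypothetical helper showing the three distinguishable cases."""
    if route_prefix is DEFAULT.VALUE:
        # The caller passed nothing, so a default applies.
        return "not set by the user"
    if route_prefix is None:
        # The caller explicitly asked for no HTTP exposure.
        return "explicitly None"
    return f"exposed at {route_prefix}"


print(describe_route_prefix())
print(describe_route_prefix(None))
print(describe_route_prefix("/api"))
```

With None as the default, the first two calls would be indistinguishable.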

Follow-up Changes

  • serve build should offer a flag that lets users print all options, including the defaults. This is especially useful for users who want to explicitly track all option settings. It may also be useful to always print the route_prefixes for Serve deployments if they're not None, to easily see which deployment is the driver.
  • Plumb the Serve codebase to actually make certain options non-optional (e.g. ray_actor_options). This change only modifies the thin wrappers over the Deployment object (i.e. the @serve.deployment decorator and the Serve schemas) but not the Deployment object itself.

Related issue number

N/A

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@shrekris-anyscale changed the title from "[WIP] [Serve] Use DEFAULT.VALUE as default config values for deployments" to "[WIP] [Serve] Track user-configured options in Serve deployments" Sep 13, 2022
@shrekris-anyscale changed the title from "[WIP] [Serve] Track user-configured options in Serve deployments" to "[Serve] Track user-configured options in Serve deployments" Sep 14, 2022
init_kwargs: Default[Dict[Any, Any]] = DEFAULT.VALUE,
route_prefix: Default[Union[str, None]] = DEFAULT.VALUE,
ray_actor_options: Default[Dict] = DEFAULT.VALUE,
user_config: Default[Optional[Any]] = DEFAULT.VALUE,
Contributor

Please correct me if I am wrong: DEFAULT.VALUE is 1, so what does it mean to set every attribute to 1 here?

Contributor Author

It’s not quite setting every attribute to 1 here. DEFAULT.VALUE is a Python enum element with an arbitrary value (in this case 1). It’s not an alias for 1. It could be set to any other arbitrary value (e.g. 2, “hello”, etc.) and the code here would still be the same.

I’m using DEFAULT.VALUE as the default because the user will never pass that in as a value for a deployment option (since it’s meaningless to the user and it would require them to import the enum from the Serve codebase). So in the decorator body, if any attributes are set to DEFAULT.VALUE, we know that the user didn’t set them. That lets us populate user_configured_options correctly.

Contributor

In [3]: from enum import Enum, auto

In [4]: class DEFAULT(Enum):
   ...:     VALUE = auto()
   ...:

In [5]: DEFAULT.VALUE
Out[5]: <DEFAULT.VALUE: 1>

In [6]: DEFAULT.VALUE == 1
Out[6]: False

In [7]: DEFAULT.VALUE == DEFAULT.VALUE
Out[7]: True

@@ -49,6 +49,14 @@ py_test(
deps = [":serve_lib"],
)

py_test(
name = "test_deployment",
Contributor

more descriptive naming please

Contributor Author

I renamed it to test_deployment_class. Let me know if you have other suggestions. I'm trying to differentiate these tests, which are about the Deployment class and how it manipulates data, and the Deployment objects that actually run on the Ray cluster.

@@ -88,7 +88,7 @@ def default(self, obj):
         if isinstance(obj, DeploymentSchema):
             return {
                 DAGNODE_TYPE_KEY: "DeploymentSchema",
-                "schema": obj.dict(),
+                "schema": obj.dict(exclude_defaults=True),
Contributor

Add a comment on why we need this flag.

Contributor Author

Sounds good, I added a comment.
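
To illustrate why the defaults have to be left out before serialization, here is a small standard-library sketch. It does not call pydantic; the dict filter is a simplified stand-in for what exclude_defaults achieves in this setup (every default is DEFAULT.VALUE), and the schema contents are hypothetical.

```python
import json
from enum import Enum, auto


class DEFAULT(Enum):
    VALUE = auto()


# Hypothetical schema contents: two user-set fields, one left at the default.
schema = {"name": "Model", "num_replicas": 2, "user_config": DEFAULT.VALUE}

try:
    json.dumps(schema)
    dumped_with_defaults = True
except TypeError:
    # The json module cannot encode plain Enum members.
    dumped_with_defaults = False

# Stand-in for exclude_defaults: drop every field still set to the sentinel.
without_defaults = {k: v for k, v in schema.items() if v is not DEFAULT.VALUE}
encoded = json.dumps(without_defaults)

print(dumped_with_defaults)
print(encoded)
```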

ignore_none: When True, any valid keywords with value None
are ignored, and their values stay default. Invalid keywords
still raise a TypeError.
ignore_default: When True, any valid keywords with value
Contributor

This flag appears to be unused.

Contributor Author

Good catch. I removed the argument from the docstring.

@@ -88,6 +88,8 @@ message DeploymentConfig {
AutoscalingConfig autoscaling_config = 10;

string version = 11;

repeated string user_configured_options = 12;
Contributor

why is this repeated string type?

Contributor Author

@shrekris-anyscale Sep 20, 2022

In the protobuf, user_configured_options is a list of strings. Each string is a deployment option that was manually tuned by the user.

I used repeated string because it seemed to best match a list of strings. Is there another datatype I should use?

Contributor

Each string is a deployment option that was manually tuned by the user.

Ahhh, these are option keys, not the values. Can you change the variable name so it implies that?

Contributor Author

Good idea, I renamed user_configured_options to user_configured_option_names.
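
For readers following the thread: the field stores option names only, so a Python set of names maps naturally onto a repeated string. The PR excerpt does not show the conversion code, so the two helpers below are hypothetical and only sketch one way the round trip could look.

```python
from typing import List, Set


def to_repeated_field(option_names: Set[str]) -> List[str]:
    """Hypothetical helper: a set of names becomes a list for a repeated string."""
    # Sorting is an assumption made here to keep the output deterministic.
    return sorted(option_names)


def from_repeated_field(values: List[str]) -> Set[str]:
    """Hypothetical helper: rebuild the set when reading the message back."""
    return set(values)


names = {"route_prefix", "num_replicas"}
wire = to_repeated_field(names)
print(wire)
print(from_repeated_field(wire) == names)
```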

@@ -88,7 +88,10 @@ def default(self, obj):
         if isinstance(obj, DeploymentSchema):
             return {
                 DAGNODE_TYPE_KEY: "DeploymentSchema",
-                "schema": obj.dict(),
+                # The schema's default values are Python enums that aren't
+                # JSON-serializable. exclude_defaults omits these, so the
Contributor

Suggested change
-# JSON-serializable. exclude_defaults omits these, so the
+# JSON-serializable by design. exclude_defaults omits these, so the

Comment on lines 325 to 330
_internal: If True, this function:
1. Won't log deprecation warnings
2. Won't update this deployment's config's
user_configured_options.
Should only be True when used internally by Serve.
Should be False when called by users.
Contributor

make these complete sentences in one section please.

Comment on lines +46 to +48
# Type alias: objects that can be DEFAULT.VALUE have type Default[T]
T = TypeVar("T")
Default = Union[DEFAULT, T]
Contributor

This is so cool, learned something new :D.

from typing import TypeVar, Union
from enum import Enum, auto

class DEFAULT(Enum):
    VALUE = auto()

T = TypeVar("T")
Default = Union[DEFAULT, T]

def func(name: Default[str] = DEFAULT.VALUE):
    print(name)

func()
func("a")
func(1)

$ mypy app.py
app.py:15: error: Argument 1 to "func" has incompatible type "int"; expected "Union[DEFAULT, str]"
Found 1 error in 1 file (checked 1 source file)

Comment on lines 485 to 486
# Remove extraneous newline
config_str = config_str[:-1]
Contributor

Is this really necessary? If so, please use config_str.rstrip("\n") so it's more readable.

Contributor Author

Without that line, the file ends with two newlines. Ideally, it should end with a single newline.

I've replaced that snippet with:

# Ensure file ends with only one newline
config_str = config_str.rstrip("\n") + "\n"
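
A quick comparison of the two approaches, as a sketch: ensure_single_trailing_newline is a hypothetical wrapper around the replacement snippet, and the sample strings are made up for illustration.

```python
def ensure_single_trailing_newline(config_str: str) -> str:
    """Hypothetical wrapper around the replacement snippet above."""
    return config_str.rstrip("\n") + "\n"


two_newlines = "num_replicas: 2\n\n"
no_newline = "num_replicas: 2"

# The rstrip version normalizes both inputs to exactly one trailing newline.
print(repr(ensure_single_trailing_newline(two_newlines)))
print(repr(ensure_single_trailing_newline(no_newline)))

# Slicing with [:-1] only works when the input ends with exactly two newlines;
# otherwise it silently drops a real character.
print(repr(no_newline[:-1]))
```

The rstrip form is correct regardless of how many trailing newlines the YAML dump produces, which is why it is safer than the original slice.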

# Create list of all user-configured options from keyword args
user_configured_option_names = [
option
for option, value in locals().items()
Contributor

Maybe add a comment that we should not have any local variables here because of the use of locals()? Or Deployment could return the user-configured attributes.

Contributor Author

Ah good idea. We can still have local variables, but they should be defined after this list. Creating that list should be the first thing that happens in the function.

Contributor

Yeah, the order is difficult to guarantee and audit, better to have a way to predefine user-configured attributes.

Contributor Author

I think unit tests should be able to catch any issues with the ordering since any local variables that are defined before this list will show up as a user-configured attribute and cause the unit tests to fail. What do you mean by predefine user-configured attributes?

Contributor

Deployment can provide the user_configured_option_names attributes, you can filter the attributes from locals().
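
The ordering concern in this thread can be reproduced in a few lines. The sketch below is illustrative only: fragile, robust, and OPTION_NAMES are hypothetical names, and filtering by a predefined tuple of option names is one reading of the reviewer's suggestion, not necessarily what was merged.

```python
from enum import Enum, auto
from typing import Any, List


class DEFAULT(Enum):
    VALUE = auto()


# Hypothetical predefined list of valid option names.
OPTION_NAMES = ("num_replicas", "route_prefix")


def fragile(
    num_replicas: Any = DEFAULT.VALUE, route_prefix: Any = DEFAULT.VALUE
) -> List[str]:
    helper = "defined too early"  # a local created before the scan
    snapshot = dict(locals())
    # Every local that is not the sentinel is reported, including helper.
    return sorted(n for n, v in snapshot.items() if v is not DEFAULT.VALUE)


def robust(
    num_replicas: Any = DEFAULT.VALUE, route_prefix: Any = DEFAULT.VALUE
) -> List[str]:
    helper = "defined too early"  # harmless here
    snapshot = dict(locals())
    # Only names from the predefined list are considered.
    return sorted(n for n in OPTION_NAMES if snapshot[n] is not DEFAULT.VALUE)


print(fragile(num_replicas=2))
print(robust(num_replicas=2))
```

In the fragile version the stray local is misreported as a user-configured option, which is the failure the unit tests mentioned above would need to catch. The robust version does not depend on statement order.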

@shrekris-anyscale
Contributor Author

The remaining test failures are unrelated:

Screen Shot 2022-09-29 at 3 31 30 PM

@simon-mo simon-mo merged commit 18b38c5 into ray-project:master Sep 29, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022