-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[autoscaler] Add support for EC2 launch templates. #17236
Conversation
Opening this PR with required changes for EC2 launch template support and an example autoscaler config for review. Will update with unit tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fairly good (and quite a reasonably sized PR :) )!
def _configure_from_launch_template(config): | ||
for node_type in config["available_node_types"].values(): | ||
config = _configure_node_type_from_launch_template(config, node_type) | ||
return config | ||
|
||
|
||
def _configure_node_type_from_launch_template(config, node_type): | ||
node_cfg = node_type["node_config"] | ||
if "LaunchTemplate" not in node_cfg: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add docstrings & type hints for these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added docstrings and type hints for both this PR and the related changes in #14080. The new unit test also exercises the E2E happy-path for these changes in config.py and node_provider.py.
template_version = kwargs.pop("Version", "$Default") | ||
kwargs["Versions"] = [template_version] if template_version else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be setdefault? (as in kwargs.setdefault("Versions", ...)
). I'm wondering if we should avoid adding a field Version
(no s) in the LaunchTemplate
dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! My thinking here is that we should keep the launch template autoscaler config model and behavior consistent with the boto3 ec2.create_instances
API (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances) that this LaunchTemplate
is ultimately forwarded to inside of node_provider.py:
The ec2.create_instances
input launch template model is:
LaunchTemplate={
'LaunchTemplateId': 'string',
'LaunchTemplateName': 'string',
'Version': 'string'
}
Since this API only takes a singular Version
argument, and since we in turn require only a single input version to describe for correct E2E behavior, I wanted to keep our internal dependency on the Versions
argument used by ec2.describe_launch_template_versions
hidden from the end-user writing the autoscaler config. This also gives clients one less opportunity to shoot themselves in the foot by trying to specify more than 1 launch template version in their config.
However, one usability enhancement I caught on this line yesterday is that kwargs.pop("Version", "$Default")
should be changed to str(kwargs.pop("Version", "$Default"))
so that users can simply specify an integer version number like Version: 2
in their autoscaler config YAML (as-written, they would be forced to specify integer strings like Version: "2"
). 🙂
def _configure_from_launch_template(config: Dict[str, Any]) -> Dict[str, Any]: | ||
""" | ||
Merges any launch template data referenced by the node config of all | ||
available node type's into their parent node config. Any parameters | ||
specified in node config override the same parameters in the launch | ||
template, in compliance with the behavior of the ec2.create_instances | ||
API. The config to bootstrap is modified in place. | ||
|
||
Args: | ||
config (Dict[str, Any]): config to bootstrap | ||
Returns: | ||
config (Dict[str, Any]): The input config with all launch template | ||
data merged into the node config of all available node types. If no | ||
launch template data is found, then the config is returned | ||
unchanged. | ||
Raises: | ||
ValueError: If no launch template is found for any launch | ||
template [name|id] and version, or more than one launch template is | ||
found. | ||
""" | ||
|
||
# iterate over sorted node types to support deterministic unit test stubs | ||
for _, node_type in sorted(config["available_node_types"].items()): | ||
config = _configure_node_type_from_launch_template(config, node_type) | ||
return config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have this not modify the config in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Note that this also required patching a couple areas that depend on bootstrap_aws
to invalidate their assumption of in-place config modification. I've also introduced a copy.deepcopy(config)
line at the top of bootstrap_aws
to both prevent in-place modification of the input config to bootstrap by any existing code path, and to make this behavior more immediately clear to readers.
Cleaning up in-place modification of each individual section of the copied config (e.g. provider config, node type config, node config, etc.) seems like a good thing to target across subsequent PRs.
def _configure_node_type_from_launch_template( | ||
config: Dict[str, Any], node_type: Dict[str, Any]) -> Dict[str, Any]: | ||
""" | ||
Merges any launch template data referenced by the given node type's | ||
node config into the parent node config. Any parameters specified in | ||
node config override the same parameters in the launch template. The | ||
config to bootstrap is modified in place. | ||
|
||
Note that this merge is simply a bidirectional dictionary update, from | ||
the node config to the launch template data, and from the launch | ||
template data to the node config. Thus, the final result captures the | ||
relative complement of launch template data with respect to node config, | ||
and allows all subsequent config bootstrapping code paths to act as | ||
if the complement was explicitly specified in the user's node config. A | ||
deep merge of nested elements like tag specifications isn't required | ||
here, since the AWSNodeProvider's ec2.create_instances call will do this | ||
for us after it fetches the referenced launch template data. | ||
|
||
Args: | ||
config (Dict[str, Any]): config to bootstrap | ||
node_type (Dict[str, Any]): node type config to bootstrap | ||
Returns: | ||
config (Dict[str, Any]): The input config with all launch template | ||
data merged into the node config of the input node type. If no | ||
launch template data is found, then the config is returned | ||
unchanged. | ||
Raises: | ||
ValueError: If no launch template is found for the given launch | ||
template [name|id] and version, or more than one launch template is | ||
found. | ||
""" | ||
node_cfg = node_type["node_config"] | ||
if "LaunchTemplate" not in node_cfg: | ||
return config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the modification of node_type
not in-place and return the modified node_type
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
""" | ||
Merges user-provided node config tag specifications into a base | ||
list of node provider tag specifications. The base list of | ||
node provider tag specs is modified in-place. | ||
|
||
This allows users to add tags and override values of existing | ||
tags with their own, and only applies to the resource type | ||
"instance". All other resource types are appended to the list of | ||
tag specs. | ||
|
||
Args: | ||
tag_specs (List[Dict[str, Any]]): base node provider tag specs | ||
user_tag_specs (List[Dict[str, Any]]): user's node config tag specs | ||
""" | ||
|
||
for user_tag_spec in user_tag_specs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this a separate function!
Modification in place seems fine here :)
""" | ||
Applies default updates made by AWSNodeProvider to node_cfg during node | ||
creation. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mention that this is only used in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only outstanding thing to change here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Windows Failure is not related ( |
These changes add support for EC2 Launch Templates as part of a user's AWS Autoscaler node config. Any parameters specified in node_config override the same parameters in the launch template, in compliance with the behavior of EC2's create_instances API. These changes were previously proposed in: #9336.
Related issue number
Closes #9334
Resolves milestone [8] of #8420.
Checks
scripts/format.sh
to lint the changes in this PR.