pubsys: added EC2 image validation #2983

Closed

mjsterckx
Contributor

Added a validate-ami subcommand to pubsys to validate EC2 images, given a config file with regions and paths to files containing image ids.

Description of changes:

Added a validate_ami mod to the pubsys crate. This mod adds a subcommand validate-ami with the following signature:

pubsys-validate-ami 0.1.0
Validates EC2 images

USAGE:
    pubsys --infra-config-path <infra-config-path> validate-ami [FLAGS] [OPTIONS] --validation-config-path <validation-config-path>

FLAGS:
        --json       If this argument is given, print the validation results summary as a JSON object instead of a
                     plaintext table
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --validation-config-path <validation-config-path>    File holding the validation configuration
        --write-results-path <write-results-path>
            Optional path where the validation results should be written

        --write-results-filter <write-results-filter>...
            Optional filter to only write validation results with these statuses to the above path The available
            statuses are: `Correct`, `Incorrect`, `Missing`
        --log-level <log-level>
            How much detail to log; from least to most: ERROR, WARN, INFO, DEBUG, TRACE [default: INFO]
  • validation-config-path is the path to the file containing the validation configuration. This file should look like this:
{
    "validation_regions": [ "us-west-2", "us-east-1" ],
    "expected_metadata_lists": [ "./canary/ami/ami_lists/1.11.0.json", "./canary/ami/ami_lists/1.11.1.json", "./canary/ami/ami_lists/1.12.0.json", "./canary/ami/ami_lists/latest.json" ]
}

Each expected_metadata_list should have the following structure:

{
  "us-west-2": {
    "ami-12345678": {
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/arm64/1.12.0-6ef1139f/image_id": "ami-12345678",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/arm64/1.12.0-6ef1139f/image_version": "1.12.0-abcdefgh",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/arm64/1.12.0/image_id": "ami-12345678",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/arm64/1.12.0/image_version": "1.12.0-abcdefgh"
    },
    "ami-082d59d8979d777a6": {
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/x86_64/1.12.0-6ef1139f/image_id": "ami-87654321",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/x86_64/1.12.0-6ef1139f/image_version": "1.12.0-hgfedcba",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/x86_64/1.12.0/image_id": "ami-87654321",
      "/aws/service/bottlerocket/aws-ecs-1-nvidia/x86_64/1.12.0/image_version": "1.12.0-hgfedcba"
    },
    ...
  },
  "us-east-1": {
    ...
  }
}

The parameters inside the image-id objects are irrelevant, but this same file structure is used for SSM validation (#2969) so this allows both validations to use the same input files.
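
As a rough illustration of how the region -> image id -> parameters structure can be reduced to what AMI validation needs, the sketch below merges several already-parsed metadata lists into a set of expected image ids per region. The `MetadataList` alias and the `expected_image_ids` function are hypothetical names, not taken from the PR, and skipping regions that are absent from `validation_regions` is an assumption about the intended behavior.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory form of one expected metadata list:
/// region -> image id -> (parameter name -> parameter value).
type MetadataList = BTreeMap<String, BTreeMap<String, BTreeMap<String, String>>>;

/// Merge several metadata lists into region -> set of expected image ids.
/// The parameter maps are ignored, and regions that are not listed in
/// `validation_regions` are skipped (an assumption, not confirmed by the PR).
fn expected_image_ids(
    lists: &[MetadataList],
    validation_regions: &[&str],
) -> BTreeMap<String, BTreeSet<String>> {
    let mut expected: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for list in lists {
        for (region, images) in list {
            if !validation_regions.contains(&region.as_str()) {
                continue;
            }
            // Only the keys (image ids) matter for AMI validation.
            expected
                .entry(region.clone())
                .or_default()
                .extend(images.keys().cloned());
        }
    }
    expected
}
```

Using a set per region means an image id that appears in more than one list (for example in both a versioned list and latest.json) is only validated once.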

  • write-results-path is the path to the file where the validation results will be written. The file will look like this:
[
  {
    "image_id": "ami-0f4c91eb881453540",
    "expected_value": {
      "ImageId": "ami-0f4c91eb881453540",
      "Public": true,
      "EnaSupport": true,
      "SriovNetSupport": "simple"
    },
    "actual_value": {
      "ImageId": "ami-0f4c91eb881453540",
      "Public": true,
      "EnaSupport": true,
      "SriovNetSupport": "simple"
    },
    "region": "us-east-1",
    "status": "Correct"
  },
  ...
]
  • write-results-filter is a list of statuses that limits which validation results are written to the above file. For example, if the list contains Correct and Incorrect, only the validation results with those statuses are written to the file and Missing validation results are omitted.
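
The filtering described for write-results-filter could look roughly like the sketch below. The `Status` and `ValidationResult` types are simplified stand-ins for the PR's own types, and treating an absent filter as "write everything" is an assumption based on the option being optional.

```rust
/// Simplified stand-in for the validation status (names follow the CLI help).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Correct,
    Incorrect,
    Missing,
}

/// Simplified stand-in for one validation result.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ValidationResult {
    image_id: String,
    region: String,
    status: Status,
}

/// Keep only the results whose status is listed in `filter`.
/// Assumption: when no filter is given, every result is kept.
fn filter_results<'a>(
    results: &'a [ValidationResult],
    filter: Option<&[Status]>,
) -> Vec<&'a ValidationResult> {
    results
        .iter()
        .filter(|result| match filter {
            Some(statuses) => statuses.contains(&result.status),
            None => true,
        })
        .collect()
}
```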

The command outputs a tabled summary of the validation results. This table will look like this (unless the --json flag is passed, in which case a JSON object will be printed):

+----------------+---------+-----------+---------+------------+
| String         | correct | incorrect | missing | accessible |
+----------------+---------+-----------+---------+------------+
| ap-northeast-3 | 448     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| eu-west-2      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ap-northeast-1 | 558     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| eu-west-1      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ca-central-1   | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ap-southeast-1 | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ap-south-1     | 553     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| us-east-1      | 558     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| sa-east-1      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| us-east-2      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| us-west-1      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| eu-west-3      | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ap-southeast-2 | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| eu-north-1     | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| us-west-2      | 558     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| eu-central-1   | 558     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+
| ap-northeast-2 | 548     | 0         | 0       | true       |
+----------------+---------+-----------+---------+------------+

The meaning of the different columns is this:

  • correct: the expected values of the validated image fields are equal to the values of the retrieved image
  • incorrect: the expected value of the image is different from the retrieved value
  • missing: the image was expected in that region but not retrieved
  • accessible: images were successfully retrieved from that region. If an invalid region was given, this would say false and all other columns in that row would show -1
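
One way to read the column definitions above is as a per-region fold over result statuses. This is a minimal sketch with hypothetical names (`RegionSummary`, `summarize`); passing `None` for a region that could not be queried is an assumption made for illustration, while the -1 values follow the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Correct,
    Incorrect,
    Missing,
}

/// One row of the summary table (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct RegionSummary {
    correct: i64,
    incorrect: i64,
    missing: i64,
    accessible: bool,
}

/// Build the summary row for one region.
/// `None` models a region that could not be queried: accessible is false
/// and every count is reported as -1.
fn summarize(statuses: Option<&[Status]>) -> RegionSummary {
    match statuses {
        None => RegionSummary { correct: -1, incorrect: -1, missing: -1, accessible: false },
        Some(statuses) => {
            // Count how many results carry the wanted status.
            let count = |wanted: Status| statuses.iter().filter(|s| **s == wanted).count() as i64;
            RegionSummary {
                correct: count(Status::Correct),
                incorrect: count(Status::Incorrect),
                missing: count(Status::Missing),
                accessible: true,
            }
        }
    }
}
```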

The validation will check the following 3 fields with their expected values:

  • Public: true
  • EnaSupport: true
  • SriovNetSupport: simple
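
Since the three checked fields have fixed expected values, the expected side of the comparison can be derived from the image id alone. The sketch below assumes a struct shaped like the JSON objects in the results file; the type name `ImageDef` appears in the reviewed code, but the Rust field names and the `expected` constructor here are guesses for illustration.

```rust
/// Sketch of an image definition; fields mirror the JSON keys ImageId,
/// Public, EnaSupport, and SriovNetSupport (Rust field names are assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
struct ImageDef {
    image_id: String,
    public: bool,
    ena_support: bool,
    sriov_net_support: String,
}

impl ImageDef {
    /// Build the expected definition for an image id using the three
    /// fixed expectations listed above.
    fn expected(image_id: &str) -> Self {
        ImageDef {
            image_id: image_id.to_string(),
            public: true,
            ena_support: true,
            sriov_net_support: "simple".to_string(),
        }
    }
}
```

With `PartialEq` derived, a retrieved image that is private or lacks ENA support simply compares unequal to `ImageDef::expected(id)`.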

Testing done:

  • Unit tests
  • Compiled a list of Bottlerocket image ids for each version in each validation region based on the public SSM parameters and used this as input for the command. Results are shown in the table above (each image validated correctly).

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@mjsterckx mjsterckx requested review from cbgbt and webern April 6, 2023 01:52
#[derive(Debug, Deserialize)]
pub(crate) struct ValidationConfig {
    /// Vec of paths to JSON files containing expected metadata (image ids and SSM parameters)
    expected_metadata_lists: Vec<PathBuf>,
Contributor

Slightly confused. These are paths to additional configuration files? If so, be specific about whether absolute paths are required. Or, if relative paths are allowed, what are they relative to? i.e. would paths relative to this aggregating config file work?

Contributor Author

They can be relative paths, but they are resolved relative to the caller's working directory, not to the config file. I'll add that to the comments.
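
To make the difference between the two resolution strategies concrete, here is a small sketch using only `std::path`. Both function names are hypothetical. The first mirrors the behavior described in the reply (relative to the caller's working directory); the second shows the alternative the reviewer asked about (relative to the aggregating config file), which this PR does not claim to implement.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `path` against the caller's current working directory.
/// `Path::join` returns `path` unchanged when it is already absolute.
fn resolve_from_cwd(path: &Path) -> io::Result<PathBuf> {
    Ok(env::current_dir()?.join(path))
}

/// Alternative: resolve `path` against the directory that contains the
/// validation config file.
fn resolve_from_config(config_path: &Path, path: &Path) -> PathBuf {
    // A bare file name has an empty parent, which joins to `path` itself.
    let base = config_path.parent().unwrap_or_else(|| Path::new(""));
    base.join(path)
}
```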

/// Structure of the validation configuration file
#[derive(Debug, Deserialize)]
pub(crate) struct ValidationConfig {
    /// Vec of paths to JSON files containing expected metadata (image ids and SSM parameters)
Contributor

Is this comment a paste error? Are metadata, image_ids and parameters the right nouns here?

Contributor Author

This is correct, because the input files are the same as for the SSM parameter validation. The parameters don't necessarily have to be there, but the file should follow the structure of region->image_id->{}.

let mut images = HashMap::new();

// Send the request
let mut get_future = client
Contributor

This is applicable to the implementation in your previous PR as well. I can't quite tell from reading this if things are happening truly in parallel? Can you tell from your tests how the performance will scale to 30 regions with 100s of AMIs in each?

I think the key question is whether the send() function is returning an unordered collection of futures.

Contributor Author

The requests are sent completely in parallel. These image requests are fast so the difference isn't really noticeable. For the SSM parameters, before they were parallelized, the total duration was over 3 minutes. After parallelizing, it took 1 minute (which is how long it takes to get the parameters from the region with the highest latency, even if that region is the only one to be queried).

If there were 30 regions with 100s of AMIs, the runtime would be the same as for a single region with the highest latency of those 30.
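
The scaling claim (total runtime tracks the slowest region rather than the sum of all regions) can be illustrated with a toy model. The sketch below is not the PR's implementation: the PR uses async requests through the AWS SDK, whereas this uses OS threads from the standard library and `sleep` as a stand-in for network latency. The function name and the latency values are made up.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Simulate one request per region, all started before any is awaited.
/// Each tuple is (region name, simulated latency in milliseconds).
/// Returns the regions in input order plus the total wall-clock time.
fn query_regions(latencies_ms: &[(&str, u64)]) -> (Vec<String>, Duration) {
    let start = Instant::now();
    let done = thread::scope(|scope| {
        // Spawn every worker first so the waits overlap.
        let handles: Vec<_> = latencies_ms
            .iter()
            .map(|&(region, ms)| {
                scope.spawn(move || {
                    thread::sleep(Duration::from_millis(ms));
                    region.to_string()
                })
            })
            .collect();
        // Joining in order still costs only about the longest single wait.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<String>>()
    });
    (done, start.elapsed())
}
```

In this model the elapsed time is bounded below by the largest latency and, on an otherwise idle machine, stays close to it regardless of how many regions are added.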

/// Represent the possible status of an EC2 image validation
#[derive(Debug, Eq, Hash, PartialEq, Serialize, Deserialize)]
pub(crate) enum AmiValidationResultStatus {
    /// The expected value was equal to the actual value
Contributor

In the case of AMIs can we be more specific than value? In this case do we actually mean:

/// The AMI was found and is public

pub(crate) image_id: String,

/// The expected value of the image
pub(crate) expected_value: ImageDef,
Contributor

This naming seems to be a holdover from SSM? Should this field be called expected_image_def? Should the next field be called actual_image_def?

// Determine the validation status based on equality, presence, and absence of expected and
// actual image values
let status = match (&expected_value, &actual_value) {
    (expected_value, Some(actual_value)) if actual_value.eq(expected_value) => {
Contributor

slightly more idiomatic to use the operator

Suggested change
- (expected_value, Some(actual_value)) if actual_value.eq(expected_value) => {
+ (expected_value, Some(actual_value)) if actual_value == expected_value => {
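
For context, a complete version of such a match using the `==` operator might look like the sketch below. Only the first arm is visible in the reviewed snippet; the `Incorrect` and `Missing` arms are inferred from the status names in the PR description, and the generic `validation_status` helper is a hypothetical simplification.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Status {
    Correct,
    Incorrect,
    Missing,
}

/// Decide the status from an expected value and an optionally retrieved one.
fn validation_status<T: PartialEq>(expected: &T, actual: Option<&T>) -> Status {
    match actual {
        // `==` works on references here because `&T: PartialEq<&T>`.
        Some(actual) if actual == expected => Status::Correct,
        // Retrieved, but at least one field differs.
        Some(_) => Status::Incorrect,
        // Expected, but nothing was retrieved.
        None => Status::Missing,
    }
}
```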

Added a `validate-ami` subcommand to pubsys to validate EC2 images,
given a config file with regions and paths to files containing image
ids.
@mjsterckx
Contributor Author

^ Rebased after #2969 got merged.

@mjsterckx
Contributor Author

Closing this PR to address @bcressey's comments on #2987. Will create a new PR with the change to how the subcommand takes its input.

@mjsterckx mjsterckx closed this Apr 11, 2023