Audit analytic to checks all bundles of a catalog #66

camilamacedo86 · 2021-02-23T19:42:58Z

This ep is a product of the design of an audit command that could be used by the PE team, external users, and internal product teams to compare their operators against various audit criteria.

The audit tool should be used to examine a given catalog's set of operators and produce a report that is readable by end-users but also be exported to json or other formats that can be processed.

Some current audit criteria ideas:

PDB api deprecation
v1beta1 CRD deprecation
annotations defining supportability characteristics such disconnected, multi-arch etc.
channel naming compliance
capability levels - maybe this could be also run within a scorecard test

The additional checks required to address all criteria not implemented in the validators is not part of this EP and has been proposed in other ones such as #65.

estroz

A lot of good stuff here conceptually. A few major details need discussion:

Much of this EP discusses the marriage of operator-sdk and opm functionality, but I don't see concrete dicsussion of how that will occur. Problems:
- opm is intended for catalog maintainers while operator-sdk is for individual operator authors who likely won't be touching a production catalog. This, amongst technical reasons, is why opm is the canonical location for index operations.
- Scorecard is currently an OSDK subproject, meaning it would need to be split into its own first-class project before opm could use it.
Running scorecard tests on every single bundle in a catalog, which can contain several hundred operators at different versions, would take a long time (hours), so failure modes and parallelism need to be discussed here.
There is currently no official documentation about bundling tests into production bundles, which is what this command would be reading.

I think (3) is the biggest of the above problems. I'd at least like to see an empirical analysis of how many bundles in an existing production catalog (the OCP 4.6 catalog for example) contain scorecard configs.

enhancements/audit-command.md

estroz · 2021-03-01T20:56:20Z

enhancements/audit-command.md

+- Extract the database from the image informed
+- Via sql get all bundles name and paths from the index catalog image
+- Download and extract all bundles files by using the operator bundle path
+- Get the required data for the report via sql from the index, from the operator bundle manifest files and from the project repository


What project repository? Each operator's? That doesn't sound in-scope.

It is described in the proposal. The operator bundle repositories as done indeed in the POC linked here. I will clarify it in this line as well.

Really thank you. clarified 👍

enhancements/audit-command.md

estroz · 2021-03-01T21:25:59Z

enhancements/audit-command.md

+
+### Risks and Mitigations
+
+A risk is the command take to long to be executed. In order to mitigate it, the flags `--wait-time` and customize the checks that should be done as the data that could be used was added: 


--wait-time should be configured on a per-bundle or per-test basis, since the caller would have to calculate wait time * bundles in catalog * number of tests to get a reasonable estimate for an overall wait time.

Based on that I changed the EP and added the following info:

Note that the tool will have a default timeout set up for the command and then, it can be changed by the flag `--wait-time for who is using it and has a better idea over what would be the expected reasonable time for the specific scenario.

The proposal has no intention to address a complexity logic to try to guess a default timeout based on the number of bundles and other possible combination aspects. Beside be possible to try to calculate that based on some logic such as wait time * bundles in catalog * number of tests it would bring complexities which show not required at least in the first moment since users can indeed set any value as please them. See that, it can change a lot since the user might be running the full report or not, might be using the checks or not, etc. So, to keep the simplicity the approach adopted is to reduce the effort required to put it in place.

Addressed in the EP as well. So, I am closing, however, please feel free to re-open if you wish.

I disagree that having an overall timeout is useful, unless you set it to 10 hours or something ridiculous.

For the alpha version, I do not think that we need or should to address this kind of complexity. I'd prefer the timeout as 2 hours by default instead (educated guess). We are allowing users to change it as please them which shows enough.

enhancements/audit-command.md

bparees

before we invest too much energy in this wrapper/helper command, i'd like to see us put our energy into finalizing the actual new validation logic we want to add to the sdk.

open questions on validators for the sdk include:

where will new validation code (such as specific for OCP requirements) live?
will we define a pluggable model for adding validators to the sdk?
do we have a finalized list of the first pass set of new validation rules/requirements for ocp + operatorhub?

Anyone can write some scripts to pull down a catalog, extract the bundle info, and run validation against each bundle, once the validation exists, so i do not see the audit command as being quite as urgent. Is it useful? sure. But i don't want to see us get too distracted by it until we do the prerequisite work.

enhancements/audit-command.md

jmrodri

Here are my comments for my first pass through. I've got more to finish reading.

enhancements/audit-command.md

bparees · 2021-03-04T02:32:55Z

Also, in his comments, he is raising questions about another requirement that has no relation with this one.

i don't think it's 100% true that they have no relation. The proposal to add new (pluggable?) validation commands to the sdk so that we can have OCP version specific validation, will absolutely have an impact on how this command, which will wrapper that one, gets implemented. (And to be clear, i'm referring to openshift/enhancements#661 which, while technically a downstream EP, may lead to new upstream logic being added to support it).

My fundamental example of "until the validation commands are more fully fleshed out, we can't know what it will take to wrap them" is validation logic like this which is proposed in #65 (and similar things that would be needed in the downstream ocp validation logic):

To check if the operator bundle is using the Webhook API version which is no longer supported | error | If `spec.minKubeVersion` is  informed  and it is >= `1.22` then, check if the CSV has the spec `webhookdefinitions.admissionReviewVersions.v1beta1`. (See [here](https://github.com/operator-framework/operator-sdk/blob/v1.4.2/testdata/go/v2/memcached-operator/bundle/manifests/memcached-operator.clusterserviceversion.yaml#L203-L205). Note that in these cases, users should be informed that the `admissionregistration.k8s.io/v1beta1` was deprecated in Kubernetes `1.16` and it is removed in `1.22`.

I think if we are going to be adding validation logic that depends on the kubeversion (which i agree is useful), we also need to add a way for the invoker of the command to specify the kubeversion they want to validate against. Not all operators specify minKubeVersion. And in some cases it may be desirable to set an alternate kubeversion anyway. This gets even more common (and complicated since now we have concepts of min + max versions) when downstream ocp validation logic exists and we want to wrap that into a catalog audit command.

So what will that mean for alpha audit --validation true? Do we end up adding a version parameter there? Declare it's not overrideable? Something else?

And what if i want to run the optional validation logic? Are we going to wire through the --select-optional flag that #65 relies on? Will this command be able to run the downstream validation logic that openshift/enhancements#661 introduces?

These sorts of experience questions, and the general fact that I think it's more urgent for us to deliver on the operatorhub+ocp validation improvements are why I am trying to avoid people investing energy/effort here at this particular moment. The answers to these questions will be more obvious once we get further with the validation extensions.

I'll also reiterate what @dmesser said on the arch call and which I agreed with at the time: The really novel thing that can be added here is true "catalog-level" (or at least package-level) validation which looks at all the bundles of a package holistically and attempts to detect potential error conditions. That is something the sdk cannot do. So to the extent that we have an urgent need to fill a gap, that is the gap.

I realize that if we're going to build a tool that closes that gap, and does some of the other reporting that this EP proposes, also including logic that wraps+invokes sdk validation tools is relatively trivial (the UX/cli argument questions i posed above not withstanding) but that(holistic package level consistency validation) doesn't seem to be the main thrust of the proposal right now.

And fundamentally what i'm trying to raise is a prioritization question, not a "does this have technical value" question. We have limited resources+attention span. I would prefer to see us spend that on adding the validation logic (both the upstream+downstream validation logic) over this, at this current point in time.

camilamacedo86 · 2021-03-04T02:50:34Z

Hi @bparees,

Really thank you for your inputs. And I got them. I need some time to see how we can address all these concerns and feedbacks. I will ping you be sure when it be in that stage. To clarifies that I am flagging this proposal as WIP.

enhancements/audit-command.md

camilamacedo86 · 2021-03-10T13:18:33Z

Hi @jmccormick2001, @estroz, @jmrodri, @bparees, @bentito

It is updated trying to cover all questions and points raised as well. it might good enough for a second round.
I'd suggested we prioritize the reviews on #65 and openshift/enhancements#661. But is important it be also updated for we are able to check the relations.

I am removing the blocker to get this merged since it does not affects SDK and it is also clarified in the EP.

camilamacedo86 · 2021-03-20T21:08:36Z

Hi @jmccormick2001, @bparees,

This proposal is updated really thank you for your suggestions and feedbacks.
Note that the alpha version of the audit will NOT cover the downstream checks (OPC). In this way, in POV the definition of openshift/enhancements#661 is no longer a blocker for it gets merged.

The idea here is we are able to have EP for we start to work in alpha solution and it actually seems like too be covering more details and points that would be mandatory for its first version.

Also, it's updated now describing that the first version will probably be done in its own repo. This proposal does not affect SDK, OPM as any other solution in the org which also is clarified in the EP. Only tries to use what already exists to obtain the required data.

The implementation of the EP https://github.com/operator-framework/enhancements/blob/master/enhancements/operator-hub-check.md has more priority and will be done first as discussed 👍 .

However, I would like to kindly ask for your help here for we are able to get this EP merged. After the above implementation I will probably be working on a reduced version of it ( like a POC to move forward with this idea and we have a getting start point).

In advanced, thank your time and all help and support.

c/c @gallettilance @bentito

bparees

i understand you just want agreement to start what amounts to a PoC here. I think there's enough detail to do that. but i'll definitely want to see more discussion about some of these implementation details, code ownership(sdk vs opm vs something else), interfaces to external tools, and config args before we'd consider promoting something like this to anything more than alpha.

bparees · 2021-03-22T02:23:00Z

enhancements/audit-command.md

+
+9. Will audit command works with downstream and upstream specific checks?
+
+To address specific downstream concerns and checks are out of the scope of this proposal. 


it's ok to defer this question for now since you're also not proposing a permanent home (opm, sdk, something else) for the implementation of this tool. But before we pick a final home for it, we will need to revisit this question.

The idea so far is the audit tool have its own repo. After we implement that if we decide that it should be moved to any other place than by sure we can revisit it. See that was clarified in the first open question.

enhancements/audit-command.md

bparees · 2021-03-22T02:46:15Z

enhancements/audit-command.md

+  --list-optional-validators List all optional validators available. When set, no report will be execute 
+  `--optional-values`          string map with key and values which will be informed to the validators e.g`"k8s=1.21"` 
+  --checks-only                     If set, disable the data report. It is useful for who would like to audit the index only runnning the checks and validations. 
+  --head-only                If set, audit only the operator bundles which are the head of the channel. (default all bundles of the index catalog image will be used)


how about a --filter flag that takes a regex for selecting which packages or csvnames to check?

Added. See:

To address this requirement a user will be able to use the flag --filter where the tool will only get the operator bundles where the package name is like %name%.

Why not accepted regex?

See that some personas who would like to use it might not be developers then, I am trying here to make it simple.

See that some personas who would like to use it might not be developers then, I am trying here to make it simple.

regex is still the right answer here. Regex works very well even if you do not know regex because if you just pass a fixed string like foo it will still match *foo*. But it leaves the power of regex available for someone who knows it and needs its flexibility.

I added the regex option. However, for a POC/alpha version that seems fine, we use simple SQL. Note that the regex might be so simple to be used with I am not sure if it is supportable for the stack used. Then, that requires a little more effort to be accomplished.

enhancements/audit-command.md

bparees · 2021-03-22T02:58:53Z

enhancements/audit-command.md

+
+By using this flag, the audit command will filter the operator bundles in the first step to download and extract just this ones. 
+
+To know the operator bundles which are the latest head we will get only the entries which has values in the `operatorbundle` table for both `csv` and `bundle` column. 


elsewhere you've tried to indicate loose coupling to the index format, but this expresses a very tight coupling that will break once the index moves away from sqllite.

I think you need to identify channel heads the same way opm does, which i believe is based on semver sorting of the version value, with +buildid being a tiebreaker on otherwise identical semvers?

I added the info.

enhancements/audit-command.md

camilamacedo86 · 2021-04-05T21:46:00Z

/hold

I have been working on its POC and I will probably perform changes here soon. See: https://github.com/camilamacedo86/audit

bparees · 2021-04-22T03:21:08Z

/approve

openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2021

camilamacedo86 changed the title ~~WIP - add audit command to opm~~ Add audit command to audit all bundles of an catalog Mar 1, 2021

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 1, 2021

camilamacedo86 assigned exdx and harishsurf Mar 1, 2021

camilamacedo86 requested review from anik120, bparees, gallettilance, jmccormick2001 and kevinrizza March 1, 2021 17:42

estroz previously requested changes Mar 1, 2021

View reviewed changes

This comment has been minimized.

Sign in to view

bparees reviewed Mar 2, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

This comment has been minimized.

Sign in to view

camilamacedo86 requested review from estroz and bparees March 3, 2021 00:27

camilamacedo86 changed the title ~~Add audit command to audit all bundles of an catalog~~ audit command to audit all bundles of a catalog Mar 3, 2021

jmccormick2001 suggested changes Mar 3, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

enhancements/audit-command.md Outdated Show resolved Hide resolved

enhancements/audit-command.md Outdated Show resolved Hide resolved

enhancements/audit-command.md Show resolved Hide resolved

jmrodri reviewed Mar 3, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

jmrodri reviewed Mar 3, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

enhancements/audit-command.md Outdated Show resolved Hide resolved

jmrodri reviewed Mar 3, 2021

View reviewed changes

This comment has been minimized.

Sign in to view

camilamacedo86 changed the title ~~audit command to audit all bundles of a catalog~~ WIP: audit command to audit all bundles of a catalog Mar 4, 2021

openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 4, 2021

camilamacedo86 changed the title ~~WIP: audit command to audit all bundles of a catalog~~ WIP: audit analytic to checks all bundles of a catalog Mar 4, 2021

camilamacedo86 commented Mar 8, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

camilamacedo86 requested review from jmccormick2001 and jmrodri March 8, 2021 00:17

This comment has been minimized.

Sign in to view

camilamacedo86 commented Mar 8, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

rashmigottipati reviewed Mar 8, 2021

View reviewed changes

enhancements/audit-command.md Outdated Show resolved Hide resolved

camilamacedo86 changed the title ~~WIP: audit analytic to checks all bundles of a catalog~~ Audit analytic to checks all bundles of a catalog Mar 10, 2021

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 10, 2021

camilamacedo86 requested review from rashmigottipati and bentito March 10, 2021 13:15

bparees reviewed Mar 22, 2021

View reviewed changes

jmccormick2001 approved these changes Mar 22, 2021

View reviewed changes

enhancements/audit-command.md Show resolved Hide resolved

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 5, 2021

add audit command

67eb06a

camilamacedo86 merged commit 5023fab into operator-framework:master Apr 22, 2021

camilamacedo86 deleted the audit-command branch April 22, 2021 06:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Audit analytic to checks all bundles of a catalog #66

Audit analytic to checks all bundles of a catalog #66

camilamacedo86 commented Feb 23, 2021 •

edited

Loading

estroz left a comment

estroz Mar 1, 2021

camilamacedo86 Mar 2, 2021

camilamacedo86 Mar 2, 2021

estroz Mar 1, 2021

camilamacedo86 Mar 2, 2021 •

edited

Loading

camilamacedo86 Mar 2, 2021

estroz Mar 3, 2021

camilamacedo86 Mar 7, 2021 •

edited

Loading

This comment has been minimized.

bparees left a comment

This comment has been minimized.

jmrodri left a comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

bparees commented Mar 4, 2021 •

edited

Loading

camilamacedo86 commented Mar 4, 2021

This comment has been minimized.

camilamacedo86 commented Mar 10, 2021

camilamacedo86 commented Mar 20, 2021 •

edited

Loading

bparees left a comment

bparees Mar 22, 2021

camilamacedo86 Mar 22, 2021 •

edited

Loading

bparees Mar 22, 2021

camilamacedo86 Mar 22, 2021

bparees Mar 22, 2021

camilamacedo86 Apr 21, 2021

bparees Mar 22, 2021

camilamacedo86 Mar 22, 2021

camilamacedo86 commented Apr 5, 2021

bparees commented Apr 22, 2021


		### Risks and Mitigations

		A risk is the command take to long to be executed. In order to mitigate it, the flags `--wait-time` and customize the checks that should be done as the data that could be used was added:


		9. Will audit command works with downstream and upstream specific checks?

		To address specific downstream concerns and checks are out of the scope of this proposal.


		By using this flag, the audit command will filter the operator bundles in the first step to download and extract just this ones.

		To know the operator bundles which are the latest head we will get only the entries which has values in the `operatorbundle` table for both `csv` and `bundle` column.

Audit analytic to checks all bundles of a catalog #66

Audit analytic to checks all bundles of a catalog #66

Conversation

camilamacedo86 commented Feb 23, 2021 • edited Loading

estroz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

camilamacedo86 Mar 2, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

camilamacedo86 Mar 7, 2021 • edited Loading

Choose a reason for hiding this comment

This comment has been minimized.

bparees left a comment

Choose a reason for hiding this comment

This comment has been minimized.

jmrodri left a comment

Choose a reason for hiding this comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

bparees commented Mar 4, 2021 • edited Loading

camilamacedo86 commented Mar 4, 2021

This comment has been minimized.

camilamacedo86 commented Mar 10, 2021

camilamacedo86 commented Mar 20, 2021 • edited Loading

bparees left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

camilamacedo86 Mar 22, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

camilamacedo86 commented Apr 5, 2021

bparees commented Apr 22, 2021

camilamacedo86 commented Feb 23, 2021 •

edited

Loading

camilamacedo86 Mar 2, 2021 •

edited

Loading

camilamacedo86 Mar 7, 2021 •

edited

Loading

bparees commented Mar 4, 2021 •

edited

Loading

camilamacedo86 commented Mar 20, 2021 •

edited

Loading

camilamacedo86 Mar 22, 2021 •

edited

Loading