Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APIGW Normalize Status Conditions #16994

Merged
merged 28 commits into from
Apr 24, 2023
Merged

Conversation

jm96441n
Copy link
Member

@jm96441n jm96441n commented Apr 13, 2023

Description

We want to normalize status conditions for gateways and routes to more closely match the k8s spec. This PR also includes validations for combinations of status/type/reason. We deviate from the spec in a few places on routes and this is called out in the code.

Testing & Reproduction steps

Run the tests

Links

k8s spec
k8s shared types

PR Checklist

  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

@jm96441n jm96441n added pr/no-changelog PR does not need a corresponding .changelog entry pr/no-docs PR does not include docs and should not trigger reminder for cherrypicking them. backport/1.15 This release series is no longer active on CE. Use backport/ent/1.15. labels Apr 13, 2023
@@ -840,151 +829,132 @@ func newGatewayMeta(gateway *structs.APIGatewayConfigEntry, bound structs.Config
}).initialize()
}

// gatewayConditionGenerator is a simple struct used for isolating
// the status conditions that we generate for our components
type gatewayConditionGenerator struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this struct was no longer needed because we encapsulated creating the conditions in the structs package along with validations

if err := checkRouteConditionReason(name, status, reason); err != nil {
// note we panic here because an invalid combination is a programmer error
// this should never actually be hit
panic(err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we panic here and in the gateway constructor because an error here is 100% programmer error

}
}

func TestNewGatewayInvalidCombinationsCausePanic(t *testing.T) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debating making this more of a property based test and randomizing the status/reason/type used but figured that was a bit overkill for this set of tests

@jm96441n jm96441n marked this pull request as ready for review April 13, 2023 17:54
now: pointerTo(time.Now().UTC()),
}
// gatewayAccepted marks the APIGateway as valid.
func gatewayAccepted() structs.Condition {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I reorganized the order of these methods to group all the condition types together, I also debated moving these into the config_entry_status file since they're pretty much just using all types/values from that package, curious if anyone has any opinions on it

Message: "gateway references invalid certificates",
LastTransitionTime: g.now,
}
func invalidCertificates() structs.Condition {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now this and invalidCertificate fail for the same Condition/Reason but have different messages

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so @andrewstucki this is now the only one (along with invalidCertificate ) to have a different type/reason (so ResolvedRefs type and InvalidCertifcateRef for the reason vs Accepted type and InvalidCertificate for the reason). Do you have an opinion on whether this makes sense or should be reverted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm thinking that it makes sense to move invalidCertificate to a ResolvedRefs status along with keeping it scoped to an individual listener. I'd be ok potentially dropping this entire status, but maybe in a separate PR after discussing a bit more. Specifically here's the issue:

  1. I believe we want Accepted to be a rolled up condition as to whether a Gateway/Route is invalid or not
  2. We actually have a check that a Gateway has Accepted == True in order for it to actually bind any routes (meaning changing this will change this behavior):
    // check to make sure we're not binding to an invalid gateway
    if !g.Gateway.Status.MatchesConditionStatus(gatewayAccepted()) {
    return false, fmt.Errorf("failed to bind route to gateway %s: gateway has not been accepted", g.Gateway.Name)
    }
    // check to make sure we're not binding to an invalid route
    status := route.GetStatus()
    if !status.MatchesConditionStatus(routeAccepted()) {
    return false, fmt.Errorf("failed to bind route to gateway %s: route has not been accepted", g.Gateway.Name)
    }

What this currently means is that if any of our certificate refs are invalid across any listener, we actually say that the Gateway can't bind a route, that's probably wrong and should probably change and be scoped to a given listener (which the method that this check is contained in actually has access to).

If we want to fix it, rather than mark the entire Gateway as not Accepted, we could potentially mark each Listener individually as Accepted and scope the above binding check to a listener. That said, that would potentially defeat the purpose of considering Accepted as a rolled-up status since I can't see how else an entire Gateway would ever be Accepted == False.

So, the TLDR; on my initial thoughts are:

keep this Accepted, switch invalidCertificate to ResolvedRefs, at least for this PR

but I'm definitely open to discussion.

@mikemorris any thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense to me, if anything we can make changes in a future PR once we get a better feel for how this should look

Copy link
Contributor

@andrewstucki andrewstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I like the better strong typing, but I think we need to clean up and have only a subset of these specified. We don't need to dupe the entire upstream spec because a number of these conditions/reasons are things that we shouldn't ever allow to be written in the first place.

agent/structs/config_entry_status.go Outdated Show resolved Hide resolved
agent/structs/config_entry_status.go Outdated Show resolved Hide resolved
agent/structs/config_entry_status.go Outdated Show resolved Hide resolved
agent/structs/config_entry_status.go Outdated Show resolved Hide resolved
agent/structs/config_entry_status.go Outdated Show resolved Hide resolved
@jm96441n jm96441n force-pushed the jm96441n/normalize-status-conditions branch from f1bfee0 to 519acc4 Compare April 18, 2023 17:55
@@ -14,6 +14,7 @@ require (
github.com/hashicorp/serf v0.10.1
github.com/mitchellh/mapstructure v1.4.1
github.com/stretchr/testify v1.7.0
golang.org/x/exp v0.0.0-20230321023759-10a507213a29
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so running go mod tidy here to get exp slices package caused a cascading need of changes to other go.mod files that import this module which is why this change seems larger than it is

Copy link
Contributor

@andrewstucki andrewstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the changes earlier and I'm glad we have some better type-safety here now!

@jm96441n jm96441n merged commit e47f321 into main Apr 24, 2023
@jm96441n jm96441n deleted the jm96441n/normalize-status-conditions branch April 24, 2023 20:22
@nathancoleman nathancoleman added backport/1.15 This release series is no longer active on CE. Use backport/ent/1.15. and removed backport/1.15 This release series is no longer active on CE. Use backport/ent/1.15. labels Jun 22, 2023
nathancoleman pushed a commit that referenced this pull request Jun 23, 2023
* normalize status conditions for gateways and routes

* Added tests for checking condition status and panic conditions for
validating combinations, added dummy code for fsm store

* get rid of unneeded gateway condition generator struct

* Remove unused file

* run go mod tidy

* Update tests, add conflicted gateway status

* put back removed status for test

* Fix linting violation, remove custom conflicted status

* Update fsm commands oss

* Fix incorrect combination of type/condition/status

* cleaning up from PR review

* Change "invalidCertificate" to be of accepted status

* Move status condition enums into api package

* Update gateways controller and generated code

* Update conditions in fsm oss tests

* run go mod tidy on consul-container module to fix linting

* Fix type for gateway endpoint test

* go mod tidy from changes to api

* go mod tidy on troubleshoot

* Fix route conflicted reason

* fix route conflict reason rename

* Fix text for gateway conflicted status

* Add valid certificate ref condition setting

* Revert change to resolved refs to be handled in future PR
nathancoleman added a commit that referenced this pull request Jun 23, 2023
…7844)

* APIGW Normalize Status Conditions (#16994)

* normalize status conditions for gateways and routes

* Added tests for checking condition status and panic conditions for
validating combinations, added dummy code for fsm store

* get rid of unneeded gateway condition generator struct

* Remove unused file

* run go mod tidy

* Update tests, add conflicted gateway status

* put back removed status for test

* Fix linting violation, remove custom conflicted status

* Update fsm commands oss

* Fix incorrect combination of type/condition/status

* cleaning up from PR review

* Change "invalidCertificate" to be of accepted status

* Move status condition enums into api package

* Update gateways controller and generated code

* Update conditions in fsm oss tests

* run go mod tidy on consul-container module to fix linting

* Fix type for gateway endpoint test

* go mod tidy from changes to api

* go mod tidy on troubleshoot

* Fix route conflicted reason

* fix route conflict reason rename

* Fix text for gateway conflicted status

* Add valid certificate ref condition setting

* Revert change to resolved refs to be handled in future PR

* Resolve sneaky merge conflicts

---------

Co-authored-by: John Maguire <[email protected]>
Co-authored-by: Nathan Coleman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This release series is no longer active on CE. Use backport/ent/1.15. pr/no-changelog PR does not need a corresponding .changelog entry pr/no-docs PR does not include docs and should not trigger reminder for cherrypicking them.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants