Update AddressType definition to add domain-prefixed strings as an option #1178

Merged
merged 3 commits into from
Jun 7, 2022

Conversation

youngnick
Contributor

What type of PR is this?
/kind cleanup
/kind api-change

What this PR does / why we need it:
This PR changes the AddressType type so that it carries its own validation, rather than placing it on the Gateway's spec.addresses list where the type is used, and updates that validation to allow domain-prefixed strings (example.com/CustomAddressType or similar).

This is a validation change, so technically it's a breaking API change. However, I've made sure that the existing valid value "NamedAddress" is still accepted; it has just moved to Custom support (which it always was anyway). That is why #958 is blocking v0.5.0.

The intent here is to give implementations an option to extend this field safely and compatibly, and then bring other options back for Extended standardization if need be.

#958 talks about having a small GEP, but once I started working on it, I realized that the GEP would be almost nothing but the actual code change (which is itself quite small), so I thought that opening a PR would be quicker than doing the whole GEP process. Happy to take it to a GEP if anyone feels I'm wrong here though.

Which issue(s) this PR fixes:

Fixes #958

Does this PR introduce a user-facing change?:

The "NamedAddress" value for Gateway's spec.addresses[].type field has been deprecated, and support for domain-prefixed values (like example.com/NamedAddress) has been added instead, to better represent the custom nature of this support.
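
To illustrate what the new validation accepts, here is a minimal Go sketch that runs a few sample values through the pattern quoted in the kubebuilder marker in this PR. The helper name matchesAddressTypePattern and the sample values are made up for this example; only the pattern comes from the PR, and the MinLength/MaxLength markers are not modeled here.

```go
package main

import (
	"fmt"
	"regexp"
)

// addressTypePattern copies the Pattern marker quoted in this PR:
// either a plain alphanumeric name, or a domain prefix followed by
// "/" and an alphanumeric name.
var addressTypePattern = regexp.MustCompile(`^([a-zA-Z0-9])+$|^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)

// matchesAddressTypePattern is a hypothetical helper for this sketch;
// it is not part of the Gateway API code base.
func matchesAddressTypePattern(s string) bool {
	return addressTypePattern.MatchString(s)
}

func main() {
	samples := []string{
		"IPAddress",                 // plain name: accepted
		"NamedAddress",              // plain name: accepted
		"example.com/NamedAddress",  // domain-prefixed: accepted
		"example.com/",              // nothing after the slash: rejected
		"Named-Address",             // hyphen without a domain prefix: rejected
		"example.com/Named_Address", // underscore in the name: rejected
	}
	for _, s := range samples {
		fmt.Printf("%-30q %v\n", s, matchesAddressTypePattern(s))
	}
}
```

Note that this pattern alone accepts any alphanumeric name, which is the point the review discussion in this PR picks up.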

@k8s-ci-robot added labels on Jun 1, 2022:

  • kind/cleanup (Categorizes issue or PR as related to cleaning up code, process, or technical debt.)
  • kind/api-change (Categorizes issue or PR as related to adding, removing, or otherwise changing an API.)
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.)

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2022
@@ -450,7 +450,6 @@ type GatewayAddress struct {
// Type of the address.
//
// +optional
// +kubebuilder:validation:Enum=IPAddress;Hostname;NamedAddress
Contributor Author

I've removed this validation because it's moved to the underlying type.

Member

@robscott left a comment

Thanks @youngnick! This mostly LGTM, but I think I'd rather make the validation and spec a bit more strict.

//
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:Pattern=`^([a-zA-Z0-9])+$|^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`
Member

I think we want a more limited regex here, maybe something like this:

^IPAddress$|^Hostname$|^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$

https://regex101.com/r/xr7aYg/1

Contributor Author

I'm reluctant to do this, because it means that adding another constant will mean a validation change, which is technically a breaking API change. I'd rather have this validation here, with the constants, and add additional validation in the webhook if required.

Contributor Author

I've pushed an update that shows what I mean. I think that doing things the way I have here will allow us to add more constants more easily later.

Member

I definitely get the hesitancy to make a validation change, but I think the alternative would result in the following:

  • Lots of non domain-prefixed names being used without being defined centrally here
  • Lots of typos when writing out standard names (hostname instead of Hostname, or one of IPaddress or IPAdress instead of IPAddress)

I think the argument against stricter validation here could also be used to remove all uses of our enum validation throughout the API because we want to leave room for additional values in the future. I think this is the most relevant part of the API convention guidance:

If an API drives behavior that is implemented by external clients (like Ingress or NetworkPolicy), the enum field must explicitly indicate that additional values may be allowed in the future, and define how unrecognized values must be handled by clients. If this was not done in the first release containing the enum field, it is not safe to add new values that can break existing clients.

I think that matches what we're already doing with our other enum fields throughout the API. I'd personally prefer to just give that notice in advance and start with stricter validation here.
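
The typo concern can be made concrete with a small Go sketch that runs the same inputs through both patterns quoted in this thread: the permissive one from the kubebuilder marker and the stricter one suggested in review. The variable names permissive, strict, and domainPrefixed are chosen for this example only.

```go
package main

import (
	"fmt"
	"regexp"
)

// domainPrefixed is the part both patterns in this thread share:
// a DNS-style prefix, a slash, and an alphanumeric name.
const domainPrefixed = `(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+`

var (
	// permissive also accepts any plain alphanumeric name.
	permissive = regexp.MustCompile(`^([a-zA-Z0-9])+$|^` + domainPrefixed + `$`)
	// strict only accepts the two named constants besides prefixed values.
	strict = regexp.MustCompile(`^IPAddress$|^Hostname$|^` + domainPrefixed + `$`)
)

func main() {
	inputs := []string{
		"Hostname",
		"hostname",  // typo: lower-case h
		"IPaddress", // typo: lower-case a
		"IPAdress",  // typo: missing d
		"example.com/CustomAddressType",
	}
	for _, in := range inputs {
		fmt.Printf("%-32q permissive=%-5v strict=%v\n",
			in, permissive.MatchString(in), strict.MatchString(in))
	}
}
```

With the permissive pattern every typo above passes pattern validation, so catching it has to happen somewhere else; with the strict pattern the typos are rejected at the schema level, at the cost of a pattern change whenever a new constant is added.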

Member

Discussed in our community meeting today. This seems to have the best possible balance. All the same validation, but leave the validation of the specific constants to the webhook so adding a constant value is easier in the future. I think my only nit here would be to explicitly call out in the godocs here that we may add additional supported values in the future.

apis/v1alpha2/shared_types.go (outdated)
//
// Values `IPAddress` and `Hostname` have Extended support.
//
// All other values, including domain-prefixed values have Custom support,
Member

Suggested change
-// All other values, including domain-prefixed values have Custom support,
+// Domain-prefixed values have Custom support,

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 6, 2022
Member

@shaneutt left a comment

Looking good, just a couple of small comments for your considerations 👍

apis/v1alpha2/shared_types.go (outdated)
Comment on lines 101 to 124

// validateAddresses validates each listener address
// if there are addresses set. Otherwise, returns no error.
func validateAddresses(addresses []gatewayv1a2.GatewayAddress, path *field.Path) field.ErrorList {
	var errs field.ErrorList
	re := regexp.MustCompile(`^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)

	for i, a := range addresses {
		if a.Type == nil {
			continue
		}
		_, ok := addressTypesValid[*a.Type]
		if !ok {
			// Found something that's not one of the upstream AddressTypes
			// Next, check for a domain-prefixed string
			match := re.Match([]byte(*a.Type))
			if !match {
				errs = append(errs, field.Invalid(path.Index(i).Child("type"), a.Type, "should either be a defined constant or a domain-prefixed string (example.com/Type)"))
			}
		}
	}
	return errs
}
Member

@shaneutt Jun 6, 2022

I would like to suggest a micro-optimization here regarding the regex:

As written, every call to validateAddresses at runtime compiles this regex. Because the pattern is static, it compiles to the same thing each time, so this is runtime work that doesn't need to be done. Instead, we can make this variable a global above the function (and document it) and reuse it.

For some illustration consider the following two programs:

package main

import "regexp"

func main() {
	for i := 0; i < 100000; i++ {
		match()
	}
}

func match() {
	re := regexp.MustCompile(`^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)
	_ = re.MatchString("asdf")
}

package main

import "regexp"

func main() {
	for i := 0; i < 100000; i++ {
		match()
	}
}

var re = regexp.MustCompile(`^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)

func match() {
	_ = re.MatchString("asdf")
}

When compiled to assembly (e.g. go tool compile -S main.go), you'll see that in the first program the regexp.MustCompile call is placed in .match, which is called repeatedly from the loop in .main:

"".main STEXT size=71 args=0x0 locals=0x10 funcid=0x0 align=0x0
        ; etc. (other instructions)
	0x0014 00020 (main.go:5)	XORL	AX, AX
	0x0016 00022 (main.go:6)	JMP	44
	0x0018 00024 (main.go:6)	MOVQ	AX, "".i(SP)
	0x001c 00028 (main.go:7)	PCDATA	$1, $0
	0x001c 00028 (main.go:7)	NOP
	0x0020 00032 (main.go:7)	CALL	"".match(SB)
	0x0025 00037 (main.go:6)	MOVQ	"".i(SP), AX
	0x0029 00041 (main.go:6)	INCQ	AX
	0x002c 00044 (main.go:6)	CMPQ	AX, $100000
	0x0032 00050 (main.go:6)	JLT	24
"".match STEXT size=103 args=0x0 locals=0x70 funcid=0x0 align=0x0
        ; etc. (other instructions)
	rel 33+4 t=7 regexp.MustCompile+0

Conversely, the second program calls it only once, in .init:

"".init STEXT size=86 args=0x0 locals=0x18 funcid=0x0 align=0x0
        ; etc. (other instructions)
	0x0020 00032 (main.go:11)	CALL	regexp.MustCompile(SB)

This is because the Go compiler does not recognize that the input to MustCompile is static, so it makes no attempt to optimize the repeated compilation away.

This may seem fairly negligible (which is why I call it a micro-optimization) but the timing results for these two variations have a fairly large gap:

$ time ./unoptimized

real	0m0.932s
user	0m1.006s
sys	0m0.038s

$ time ./optimized

real	0m0.073s
user	0m0.074s
sys	0m0.000s

Given our expectation that our validation code is going to run repeatedly all over the world for years to come, this seems like as good a place as any to shave off some time (that would otherwise accumulate) for the very low price of moving the variable:

Suggested change
-// validateAddresses validates each listener address
-// if there are addresses set. Otherwise, returns no error.
-func validateAddresses(addresses []gatewayv1a2.GatewayAddress, path *field.Path) field.ErrorList {
-	var errs field.ErrorList
-	re := regexp.MustCompile(`^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)
-	for i, a := range addresses {
-		if a.Type == nil {
-			continue
-		}
-		_, ok := addressTypesValid[*a.Type]
-		if !ok {
-			// Found something that's not one of the upstream AddressTypes
-			// Next, check for a domain-prefixed string
-			match := re.Match([]byte(*a.Type))
-			if !match {
-				errs = append(errs, field.Invalid(path.Index(i).Child("type"), a.Type, "should either be a defined constant or a domain-prefixed string (example.com/Type)"))
-			}
-		}
-	}
-	return errs
-}
+// domainPrefixedStringRegex is a regex used in validation to determine whether
+// a provided string is a domain-prefixed string. Domain-prefixed strings are used
+// to indicate custom (implementation-specific) address types.
+var domainPrefixedStringRegex = regexp.MustCompile(`^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`)
+
+// validateAddresses validates each listener address
+// if there are addresses set. Otherwise, returns no error.
+func validateAddresses(addresses []gatewayv1a2.GatewayAddress, path *field.Path) field.ErrorList {
+	var errs field.ErrorList
+	for i, a := range addresses {
+		if a.Type == nil {
+			continue
+		}
+		_, ok := addressTypesValid[*a.Type]
+		if !ok {
+			// Found something that's not one of the upstream AddressTypes
+			// Next, check for a domain-prefixed string
+			match := domainPrefixedStringRegex.Match([]byte(*a.Type))
+			if !match {
+				errs = append(errs, field.Invalid(path.Index(i).Child("type"), a.Type, "should either be a defined constant or a domain-prefixed string (example.com/Type)"))
+			}
+		}
+	}
+	return errs
+}
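
For anyone who wants to measure the difference on their own machine rather than with time, the comparison from this comment could also be sketched with Go's benchmark support. This is only an illustration: the function names matchCompileEachCall and matchPrecompiled are invented for this example, and testing.Benchmark is used so the sketch runs as a plain program.

```go
package main

import (
	"fmt"
	"regexp"
	"testing"
)

// pattern is the domain-prefixed pattern from the webhook code in this PR.
const pattern = `^([a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\/[a-zA-Z0-9]+$`

// precompiled is built once, during package initialization.
var precompiled = regexp.MustCompile(pattern)

// matchCompileEachCall recompiles the pattern on every call,
// like the original validateAddresses.
func matchCompileEachCall(s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// matchPrecompiled reuses the package-level regexp.
func matchPrecompiled(s string) bool {
	return precompiled.MatchString(s)
}

func main() {
	perCall := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			matchCompileEachCall("example.com/Type")
		}
	})
	shared := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			matchPrecompiled("example.com/Type")
		}
	})
	// Each result prints the iteration count and time per operation.
	fmt.Println("compile per call:", perCall)
	fmt.Println("package-level:   ", shared)
}
```

The absolute numbers will vary by machine and Go version; what matters is the relative gap between the two lines.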

Contributor Author

Thanks, that's a great explanation! To be honest, kind of annoyed at myself I didn't see it. 😄 I'll get that change in now.

Contributor Author

Done. I'll leave the comment unresolved though so others can easily see the explanation.

Member

@shaneutt left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 7, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shaneutt, youngnick

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 3805bf6 into kubernetes-sigs:master Jun 7, 2022
Labels
  • approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/api-change (Categorizes issue or PR as related to adding, removing, or otherwise changing an API.)
  • kind/cleanup (Categorizes issue or PR as related to cleaning up code, process, or technical debt.)
  • lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
  • size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.)
Development

Successfully merging this pull request may close these issues.

AddressType with domain-prefixed strings for "NamedAddress"
4 participants