Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tag header based routing #6036

Merged
merged 3 commits into from
Jun 2, 2020

Conversation

igsong
Copy link
Contributor

@igsong igsong commented Nov 14, 2019

Proposed Changes

Currently, if a tag is attached to a specific traffic route, then the traffic route can be accessible via domain name. (e.g., tag-1-service-name.ns.svc.cluster.local). Since it utilizes a domain name to select a specific traffic and corresponding revision, it is very clearly understandable and easy to use.

With this settings, if a client application make a request to a Knative service and we want to make the application be able to choose a specific traffic route by configuration, the application should change the domain name part in the URI of the request. (e.g., if default URL is http://service-name.ns.svc, the tagged URL must be http://a-tag-service-name.ns.svc)

If the client statically choose a traffic route, it is OK. However, when we want more dynamic setting, it may cause complexity.

Let's assume that following situation:

  1. A service SVC-A calls the other service SVC-B, which also call a service SVC-C when a request is coming from a user
  2. QA analysts want to use the revision for each service, which has the tag testing. Some services may not have a tag testing because they does not have anything to test.
  3. (Of course,)Normal users need to use entire services with default URL scheme.

If we implement above use-case using current domain-name-based traffic route selection, when SVC-A receives a request with http://testing-svc-a.ns.svc, SVC-A needs to call SVC-B with http://testing-svc-b.ns.svc, and SVC-B needs to call SVC-C with http://testing-svc-c.ns.svc.

This way has two problem.

  1. If there is no tag for a service (in above situation, if SVC-B is not tagged with testing), the user request will be failed. Of course, we can always tag all the services without exception to keep consistency, it may cause mis-conception or increase complexity. (e.g., if we always tag a testing tag to all the services even though there are nothing to test, it may be confusing.)
  2. If we use multiple tags and they are dynamically created, when a service receives a request, the URL of subsequent requests needs to be manipulated according to Host header field of the received request. In addition to above example situtation, if we introduce a tag another-testing, when a user calls SVC-A with http://another-testing-svc-a.ns.svc, SVC-A needs to use URL http://another-testing-svc-b.ns.svc for calling SVC-B, instead of http://testing-svc-b.ns.svc. Since this requires tricky string manipulation logics for each application, it is basically not good. Even more, since the combination rule between the tag name and service name is configurable in config-network, it is very tough to require to develop a flexiable URL manipulation logics for every developers.

Therefore, I would like to propose another way to choose a specific traffic routing path, which is using a HTTP header to distinguish a tag name instead of using a domain name. When a user make a request with a default URL, it will follow the default routing rule as current implementation. If a request is coming with a default URL and a specific HTTP header (e.g., Knative-Serving-Tag: testing), it follows a routing path tagged with testing.
On the other hand, if a request has the HTTP header like Knative-Serving-Tag: testing, but the service does not have any routing path tagged with testing,
then the request follows just the original default routing path.

This feature is basically turned off, but if a Knative service or route includes networking.knative.dev/routeWithTagHeader: "true" as an annotation, it is enabled and the header-based routing rules are added in the Knative ingress as welll as the corresponding virtual service of Istio.

Highlight

  • HTTPIngressPath of Knative ingress is extended.
    • Pairs of header name and value (as regex) can be specified into HTTPIngressPath for Knative ingress as the field Headers. All the header name and values should be matched for a request to match with the HTTPIngressPath.
  • If a route object has annotation networking.knative.dev/routeWithTagHeader: "true", Reconciler for route creates additional HTTPIngressPath in Knative ingress object with respect to the tags and the corresponding traffic route entry
  • Reconciler for Knative ingress recognizes the extended format of Knative ingress, and generates Istio virtual service object according to its proper semantics.
  • If a request is coming through http://tag-name-svc-name.ns.svc (tagged domain name), the header Knative-Serving-Tag:tag-name is automatically appended to the request
  • If a request is coming through http://svc-name.ns.svc (default domain name) and it has a header Knative-Serving-Tag:tag-name, then the request is equivalently handled with a request coming though http://tag-name-svc-name.ns.svc.

Release Note


/lint

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Nov 14, 2019
@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/API API objects and controllers area/networking labels Nov 14, 2019
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igsong: 0 warnings.

In response to this:

Proposed Changes

Currently, if a tag is attached to a specific traffic route, then the traffic route can be accessible via domain name. (e.g., tag-1-service-name.ns.svc.cluster.local). Since it utilizes a domain name to select a specific traffic and corresponding revision, it is very clearly understandable and easy to use.

With this settings, if a client application make a request to a Knative service and we want to make the application be able to choose a specific traffic route by configuration, the application should change the domain name part in the URI of the request. (e.g., if default URL is http://service-name.ns.svc, the tagged URL must be http://a-tag-service-name.ns.svc)

If the client statically choose a traffic route, it is OK. However, when we want more dynamic setting, it may cause complexity.

Let's assume that following situation:

  1. A service SVC-A calls the other service SVC-B, which also call a service SVC-C when a request is coming from a user
  2. QA analysts want to use the revision for each service, which has the tag testing. Some services may not have a tag testing because they does not have anything to test.
  3. (Of course,)Normal users need to use entire services with default URL scheme.

If we implement above use-case using current domain-name-based traffic route selection, when SVC-A receives a request with http://testing-svc-a.ns.svc, SVC-A needs to call SVC-B with http://testing-svc-b.ns.svc, and SVC-B needs to call SVC-C with http://testing-svc-c.ns.svc.

This way has two problem.

  1. If there is no tag for a service (in above situation, if SVC-B is not tagged with testing), the user request will be failed. Of course, we can always tag all the services without exception to keep consistency, it may cause mis-conception or increase complexity. (e.g., if we always tag a testing tag to all the services even though there are nothing to test, it may be confusing.)
  2. If we use multiple tags and they are dynamically created, when a service receives a request, the URL of subsequent requests needs to be manipulated according to Host header field of the received request. In addition to above example situtation, if we introduce a tag another-testing, when a user calls SVC-A with http://another-testing-svc-a.ns.svc, SVC-A needs to use URL http://another-testing-svc-b.ns.svc for calling SVC-B, instead of http://testing-svc-b.ns.svc. Since this requires tricky string manipulation logics for each application, it is basically not good. Even more, since the combination rule between the tag name and service name is configurable in config-network, it is very tough to require to develop a flexiable URL manipulation logics for every developers.

Therefore, I would like to propose another way to choose a specific traffic routing path, which is using a HTTP header to distinguish a tag name instead of using a domain name. When a user make a request with a default URL, it will follow the default routing rule as current implementation. If a request is coming with a default URL and a specific HTTP header (e.g., Knative-Serving-Tag: testing), it follows a routing path tagged with testing.
On the other hand, if a request has the HTTP header like Knative-Serving-Tag: testing, but the service does not have any routing path tagged with testing,
then the request follows just the original default routing path.

This feature is basically turned off, but if a Knative service or route includes networking.knative.dev/routeWithTagHeader: "true" as an annotation, it is enabled and the header-based routing rules are added in the Knative ingress as welll as the corresponding virtual service of Istio.

Highlight

  • HTTPIngressPath of Knative ingress is extended.
  • Pairs of header name and value (as regex) can be specified into HTTPIngressPath for Knative ingress as the field Headers. All the header name and values should be matched for a request to match with the HTTPIngressPath.
  • If a route object has annotation networking.knative.dev/routeWithTagHeader: "true", Reconciler for route creates additional HTTPIngressPath in Knative ingress object with respect to the tags and the corresponding traffic route entry
  • Reconciler for Knative ingress recognizes the extended format of Knative ingress, and generates Istio virtual service object according to its proper semantics.

Release Note


/lint

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 14, 2019
@knative-prow-robot
Copy link
Contributor

Hi @igsong. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@igsong igsong changed the title [New Feature] Tag header based routing Tag header based routing Nov 14, 2019
@vagababov
Copy link
Contributor

This seems like a pretty big change if not a feature track.
But there is no issue for this, at least it's not referenced and second, I am not sure if this passed the networking and API WG discussions.

/hold
/assign @tcnghia @dgerd @mattmoor

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@tcnghia
Copy link
Contributor

tcnghia commented Nov 14, 2019

Agreed with @vagababov. This has previously proposed here #4736 and there were concerns #4736 (comment). Please review/discuss this change with the API WG first (cc @dgerd @mattmoor).

@igsong
Copy link
Contributor Author

igsong commented Nov 14, 2019

If the change of ingress Spec. is problematic, I can make it without the change (by using annotation), but I felt the way is little bit tricky because the Ingress spec itself should address all the way of delivery of incoming requests, AFAIK.

@igsong
Copy link
Contributor Author

igsong commented Nov 22, 2019

@dgerd @mattmoor @tcnghia Just reminding this issue. If you have anything that I have to do for making it advance, please let me know.

@dgerd
Copy link

dgerd commented Nov 25, 2019

First of all, thanks for taking the time to put this change together and for the detailed proposal. I definitely see the challenges here. Given the size/significance of it I think it would be great to discuss further in the API Working Group and/or Networking Working Group. If the times don't work for you we can look into other options for discussion. Feel free to add this directly to the agenda in the meeting notes doc.

You have most of the details in your proposal above, but we also have a Feature Track Template which covers a few more areas (operations, observability, test, docs, and exit criteria). If you are able to transfer your ideas to the template I think it would help facilitate better discussion.

A few questions I have immediately are:

If there is no tag for a service (in above situation, if SVC-B is not tagged with testing), the user request will be failed. Of course, we can always tag all the services without exception to keep consistency, it may cause mis-conception or increase complexity. (e.g., if we always tag a testing tag to all the services even though there are nothing to test, it may be confusing.)

How is this solved by header tags? If SVC-A applies Knative-Serving-Tag:testing and SVC-B is not tagged with testing we have two options: (1) fail the request which matches the current behavior (2) follow the percent-based routing on the base domain. If we do the first option then we are where we started, and if we do the second option this seems surprising to me and difficult to debug. What if I accidentally sent Knative-Serving-Tag:test? It would seem like it is working, but producing the wrong behavior.

In addition to above example situtation, if we introduce a tag another-testing, when a user calls SVC-A with http://another-testing-svc-a.ns.svc, SVC-A needs to use URL http://another-testing-svc-b.ns.svc for calling SVC-B

So to rephrase this second problem: you want to forward along the 'tag' that was received, but do not have a good way to know what tag was received without heavy string manipulation and understanding how the domain is configured on your installation. Is that correct? Also it is not clear from the code, but am I correct that if multiple Knative-Serving-Tag headers are provided that the last one wins?

@igsong
Copy link
Contributor Author

igsong commented Dec 3, 2019

How is this solved by header tags? If SVC-A applies Knative-Serving-Tag:testing and SVC-B is not tagged with testing we have two options: (1) fail the request which matches the current behavior (2) follow the percent-based routing on the base domain. If we do the first option then we are where we started, and if we do the second option this seems surprising to me and difficult to debug. What if I accidentally sent Knative-Serving-Tag:test? It would seem like it is working, but producing the wrong behavior.

First of all, what I've intended is the second option.

Of course, as you said, such kind of accident can happen with this setting, but I think that picking the routing path for a specific revision by specifying the corresponding tag is usually used for internal usage of the APIs (e.g., testing, internal version matching...), not for the public usage. Therefore, I thought that such kind of strict error detection is not really required for this kind of features. Rather, at least in our company, I thought that it would be better to give a flexibility which provides that all the systems are integrated and working well even if a service does not have any proper tag for a request.

For details of the reason why we felt the needs of such flexibility, I want to explain about our idea first. Let's say that we have two projects simultaneously: Some teams are working on project-A, the other teams are working on project-B. And also assume that we have API services S1, S2, S3, S4 which have dependency each other, and each service pass the same ‘Knative-Serving-Tag’ header and value as it received when calling subsequent APIs of the other services. If S2 becomes S2' by the project-A, then the teams working on project-A would want to test with the combination of S1-S2'-S3-S4. On the other hand, the teams working on project B would want to test with the combination of S1-S2-S3-S4' if the project B changes S4 to S4'.
Of course we can give totally separated K8S cluster to the teams for each project. However, I thought that we can share a K8S cluster by using tags. If we give a tag 'project-A' to S2' and 'project-B' to S4', and apply this proposed change, project-A and project-B can be tested independently with one cluster by using tag headers project-A and project-B, respectively. Moreover, the cluster can serve normal user traffic as well.

At this point, if we does not permit the flexibility (routing to the default path when there is no matched tag), then we need to maintain the tags for all the service always, and it causes formidable complexity to our teams. (e.g., even if a service does not related to any other project, the service should have all the tags which all the related services have).

In addition, as you said, this may make the debug hard, but to detect the accident cases you told, I think that monitoring by prometheus or kiali may be used and seems enough.

BTW, for users who need strict tag existence checking, I can change the option flags like networking.knative.dev/routeWithTagHeader: "strict" for strict checking and networking.knative.dev/routeWithTagHeader: "loose" for my case.

@igsong
Copy link
Contributor Author

igsong commented Dec 3, 2019

So to rephrase this second problem: you want to forward along the 'tag' that was received, but do not have a good way to know what tag was received without heavy string manipulation and understanding how the domain is configured on your installation. Is that correct?

Yes, exactly.

@igsong
Copy link
Contributor Author

igsong commented Dec 3, 2019

Also it is not clear from the code, but am I correct that if multiple Knative-Serving-Tag headers are provided that the last one wins?

Basically, it depends on the header matching of istio VertualService, and that means it depends on header matching part of https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/route/route.proto#envoy-api-msg-route-routematch.

AFAIK, the header matching is success when there is any header which having matched key and value. So, if the multiple Knative-Serving-Tag are specified in a HTTP request, than it will follow a first rule matching with it, and the precedence of the rules are sorted by https://golang.org/pkg/sort/#Strings in

@igsong
Copy link
Contributor Author

igsong commented Dec 3, 2019

You have most of the details in your proposal above, but we also have a Feature Track Template which covers a few more areas (operations, observability, test, docs, and exit criteria). If you are able to transfer your ideas to the template I think it would help facilitate better discussion.

I've just requested the access permission for the document. Please approve it. :)

@vagababov
Copy link
Contributor

vagababov commented Dec 3, 2019 via email

@igsong
Copy link
Contributor Author

igsong commented Dec 3, 2019

@vagababov Thanks a lot!

@dgerd I'll try to fill up the feature track template in this week. It would be better to have a discussion after that. Thanks.

@igsong
Copy link
Contributor Author

igsong commented Dec 13, 2019

@dgerd I've written the proposal doc (https://drive.google.com/open?id=12t_3NE4EqvW_l0hfVlQcAGKkwkAM56tTn2wN_JtHbSQ). Please check it out.

@dgerd
Copy link

dgerd commented Dec 16, 2019

Thanks for putting the proposal together. I took a pass through it and added some others to weigh in.

@googlebot googlebot added cla: yes Indicates the PR's author has signed the CLA. and removed cla: no Indicates the PR's author has not signed the CLA. labels May 27, 2020
@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 27, 2020
@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/networking/v1alpha1/ingress_types.go Do not exist 100.0%
pkg/network/network.go 82.9% 80.8% -2.1
pkg/reconciler/route/route.go 84.6% 84.7% 0.1

@igsong
Copy link
Contributor Author

igsong commented May 27, 2020

I've split this PR into two: the main feature + ingress change and observability.
The observability part is covered by #8103.

Copy link

@ZhiminXiang ZhiminXiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @igsong for driving this feature! It generally looks good to me. Left several comments.

@@ -221,6 +221,13 @@ type HTTPIngressPath struct {
// If RewriteHost is specified, Splits must not be.
RewriteHost string `json:"rewriteHost,omitempty"`

// Headers is a map from a header name to a header value which has a format of ECMAscript style regex.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we will start with "exact" match, could you update the comment here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

if name == traffic.DefaultTarget {
rule.HTTP.Paths[0].AppendHeaders[network.DefaultRouteHeaderName] = "true"
rule.HTTP.Paths = append(
makeTagBasedRoutingIngressPaths(r.Namespace, targets, names), rule.HTTP.Paths...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful to add some comments about what this ingress path is used for.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

pkg/reconciler/route/resources/ingress.go Outdated Show resolved Hide resolved
makeTagBasedRoutingIngressPaths(r.Namespace, targets, names), rule.HTTP.Paths...)
} else {

rule.HTTP.Paths[0].AppendHeaders[network.TagHeaderName] = name

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be also helpful to explain why we append header here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/networking/v1alpha1/ingress_types.go Do not exist 100.0%
pkg/network/network.go 82.9% 80.8% -2.1
pkg/reconciler/route/route.go 84.6% 84.7% 0.1

@knative-test-reporter-robot

The following jobs failed:

Test name Triggers Retries
pull-knative-serving-integration-tests pull-knative-serving-integration-tests 1/3

Automatically retrying due to test flakiness...
/test pull-knative-serving-integration-tests

@ZhiminXiang
Copy link

/lgtm

@tcnghia Could you please also take a look?

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label May 29, 2020
@tcnghia
Copy link
Contributor

tcnghia commented Jun 1, 2020

Currently we don't have a guidance for what goes in config-features. @JRBANCEL is working on that. Once that is settled I think we may have to move the feature flag there (ideally before we cut the release).

For now let's check this in to unblock progress.

/approve

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: igsong, tcnghia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2020
@tcnghia
Copy link
Contributor

tcnghia commented Jun 1, 2020

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 1, 2020
@igsong
Copy link
Contributor Author

igsong commented Jun 2, 2020

/retest

@ZhiminXiang
Copy link

/test pull-knative-serving-upgrade-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/monitoring area/networking cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.