-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support URI Template based route match #7763
Comments
RFC 6570 is great for creating URIs, but seems mum on actual matching patterns. What would be the plan for turning URI Templates into matchers? I ask because we're in need of a more flexible route-matching, but are trying to avoid exposing the complexity (both in user-friendliness and in runtime speed) of regexes. We're thinking of building something based on "glob" operators:
The gRPC transcoder extends the RFC 6570 syntax so that expansion expressions can take matcher expressions. The matcher expressions are of the form gRPC's We're not wedded to our particular syntax, but do need globbing, so something like gRPC could work for us. What's the best way to move forward? |
http.proto with @mattklein123 @lizan @markdroth @envoyproxy/api-shepherds WDYT? Also tagging @louiscryan @costinm for Istio. |
I talked to @htuch about this offline and in general I'm in favor. I don't feel very strongly about the language so I would go with whatever the simplest thing is that satisfies the known use cases. In terms of Envoy implementation/API details, I think the easiest thing to do is going to be to implement a new regex engine here: envoy/api/envoy/type/matcher/v3/regex.proto Lines 49 to 53 in b23ee3b
Each engine implicitly has a supported language that is self enforced. I think we could just implement a new engine which has a very limited language and offer that as an option. I admittedly haven't thought through all the details, but I think this will be very simple from an API perspective to get working. |
One other consideration of Justins example above is that the matcher emits an attribute/metadata by tagging match segments. This is useful for rewrites on forward and has uses in other policy contexts. Same would be true for regex groups. |
Yep, I think this is a pretty powerful approach. @louiscryan can we confirm that from the Istio-side that |
It would be ideal to have first-class support for URI templates in Envoy. For the time being, our product ESPv2 converts URI templates into regex route matchers when configuring the router.
#13320 added performance benchmarks for Envoy's router with regex routes that emulate common OpenAPI path templates. We decided the performance impact was acceptable and implemented our own URI template to regex converter in golang. This approach has been working fine so far.
Before we switched to converting URI templates to regex route matchers during config time, we has a custom filter and C++ library that supported URI templates in the data plane. It is similar to the library currently used by the gRPC transcoder filter. The trie-based implementation allows However, we moved away from this approach because the Envoy platform team had security concerns about introducing a handwritten parser into Envoy's dataplane.
It sounds like the current approach is to add first-class support for URI templates in the route configuration, but internally envoy will still use regex matchers to parse the URI template. Can we move to using the trie-based parser library and remove the intermediate regex? Otherwise I do not see much value added - users can convert the URI templates to regex at config time like we do currently, it's just moving that logic into Envoy. |
Per my understanding of mattklein123 idea of new engine_type inside RegexMatcher, it is NOT using regex. This new type can be implemented by grpc path_matcher which supports full URL template syntax. This change is very isolated, will not impact other code. I think its a good idea, it will achieve the same goal of not using regex to implement url_template. For example
|
Implementation wise I'm good with either internal safe_regex translation or the gRPC-JSON transcoder library if we can persuade ourselves it is robust here. @qiwzhang points is correct, Matt is suggesting we use the pluggable "regex" API message, not that this is actually a regex. @qiwzhang what is you take on fuzzing and code quality/review for the path matcher? Either way we go, we have code that requires tests, fuzzing and review. In particular, the tests/fuzzing are basically common if we treat this like a blackbox. |
Yes, the name path_matcher code quality should be ok. It already has fuzz tests for its http_template code, we can add fuzz tests for path_matcher code too. |
I think Url Template will be used only for RouteMatch. Instead adding a new engine_type for RegexMatch, we can add How about:
|
path_matcher trie-based implementation may have potential of achieve O(1) speed with n RouteMatch entries. If n of RouteMatch entries use the new url_template, they can build into one patch_match tree. There will be one tree lookup for all n route entries. For example, if we have 3 route entries with:
All of above 3 url_template route entries can be build into one path_match tree. Just one tree lookup, it can tell if there is any match. Instead of O(n) for route match search, it will be O(1). Such optimization could be transparent to users, it is just an internal implementation detail. |
I'm a big fan of tries for URL matching and there's no doubt it is faster than iterating a list of regex's, but FWIW, I'm pretty sure this is not O(1), it's O(height of tree) isn't it? I believe in the worst case it is basically O(n) e.g.
The trie is still a bit better than the worst case of iterating a list of regex's which would be O(n*m), where m is the avg complexity of evaluating a single regex. Of course the trie does even better in real life, but claiming the trie is always constant time is misleading. |
Yes, you are correct. @josheinhorn. It is O(h), h = height of the tree, which is the maximum number of path segments from all url_templates. |
FWIW, I would really prefer not to encode this in xDS as a type of regex, since it's really not -- it's a completely different template-based matching syntax, so calling it a regex is very confusing. Instead, I think we should make this another type inside of |
+1 to not to encode this in xDS as a type of regex. Also I think this is not as simple as adding new matcher type. The HTTP URI template path matcher need all matching rules to build the trie, so it doesn't work well with current first-match-wins rule. Perhaps we need the tree based route matcher first? cc @snowp |
Hi @lizan , here is my idea of grouping multiple url_template matches into a path_matcher _tree; not all route entries need to use url_template, the route order need to be preserved too. so only group the adjacent url_template route entries into a tree. For example
We will group match2 and match3 into a tree, and match5 into another tree. |
My suggestion to put it in the regex message was for API simplicity. I don't feel that strongly about it. If it's easy to put it elsewhere we can do that. My suggestion would be to not intertwine with FIFO vs. O(1). Can't we add this with implicit FIFO support first and then deal with the sub-linear support when we add full sub-linear tree matching for main route selection? |
Yes, sub-linear should be deferred to when we take the generalized matchers from @snowp and bring them to RouteConfiguration IMHO. |
@qiwzhang In that case you can't build the URI matcher in the same trie? It is not ideal works though. |
OK, for now, I will just implement one path_matcher tree per one url_template, not to combine multiple ones into one tree. |
#15493 updates the router benchmarks for URL templates. Surprisingly, the URL template path matcher is much slower than regex based matching. Most likely because route matches are FIFO (as noted above). So we search a separate tree for every single route instead of a single tree. |
@nareddyt I understand from #7763 (comment) the possible advantage of grouping related matches into a tree, but in #15493 we're racing RE2 and the path matcher head:head, surely they should, for a single match, be comparable? |
@qiwzhang - since #15299 tackles the path matching only - do you also intend to include expanding the RouteAction to support template-based rewrites as part of this overall effort? We may also need to clarify how |
Looking at #15299 taught me a bunch about the syntax we may want to allow. Two points came out of that:
With those lessons in mind, here's a more detailed syntax proposal that similar to the http.proto syntax, but fixes these problems. Match Patterns and TemplatesWe propose wildcard support based on match patterns and templates. A match pattern matches an incoming URL path. Template patterns are used to re-write URLs. Match patterns support glob operators to match URL text and variable definitions to bind matched text to names. Template patterns build new URLs and may reference variables bound by a match pattern. Syntax
An A The The text_glob operator matches 0 or more characters in the set of { ExamplesAllowed Rules:
Invalid Rules:
|
SGTM; @lizan @markdroth @mattklein123 @qiwzhang @josheinhorn @nareddyt are you folks happy to commit to this as PoR? |
SGTM |
Note that there are some extensions to this grammar that we could implement in the future by broadening the restricted syntax proposed above:
I left these extensions out of the proposal above because they do make things more complicated, they're incompatible with http.proto, and no one is asking for them (yet?). However, I believe they would be backwards-compatible changes. |
LGTM except one part:
This gives me pause since it's a divergence from the existing syntax, and I don't entirely see how the inclusion of periods in the variable names conflicts with periods in the value. For example, what is wrong with |
Would be very useful to be able to use dynamic Metadata or header values I the re-write rules |
URL Rewrite: Match and Rewrite Library Additional Description: This PR will add library to implement issue detailed here and described below: #7763 Full set of changes to implement feature are found here: #22207 Match Patterns and Templates Wildcard support based on match patterns and templates. A match pattern matches an incoming URL path. Match patterns support glob operators to match URL text and variable definitions to bind matched text to names. Template patterns are used to re-write URLs. Template patterns build new URLs and may reference variables bound by a match pattern. Match Examples /.m3u8 would match /foo.m3u8 and /foo/bar.m3u8. /{dir_name}/*.ts would match /example/file.ts and bind dir_name="example" for a later template match to use. /{dir_name}/.ts would match /example/path/file.ts and bind dir_name="example" for a later template match to use. This would also match /example/.ts, which may or may not be a desired behavior. /{path=v1/}/{file=.ts} would match /v1/example/movie.ts (binding path="v1/example" and file="movie"), but would not match /v0/example/movie.ts. See post for full details and example: #7763 (comment) Risk Level: Testing: Unit tests. Both both internal matching/rewrite library and config/data plane changes.
This PR will implement issue detailed here and described below: #7763 Match Patterns and Templates Wildcard support based on match patterns and templates. A match pattern matches an incoming URL path. Match patterns support glob operators to match URL text and variable definitions to bind matched text to names. Template patterns are used to re-write URLs. Template patterns build new URLs and may reference variables bound by a match pattern. Match Examples /**.m3u8 would match /foo.m3u8 and /foo/bar.m3u8. /{dir_name}/*.ts would match /example/file.ts and bind dir_name="example" for a later template match to use. /{dir_name}/**.ts would match /example/path/file.ts and bind dir_name="example" for a later template match to use. This would also match /example/.ts, which may or may not be a desired behavior. /{path=v1/*}/{file=*.ts} would match /v1/example/movie.ts (binding path="v1/example" and file="movie"), but would not match /v0/example/movie.ts. See post for full details and example: #7763 (comment) Risk Level: Testing: Unit tests. Both both internal matching/rewrite library and config/data plane changes. Signed-off-by: silverstar195 <[email protected]>
Would be awesome to have that supported. |
Description:
This was a part of discussion for #7728 but a separate issue, to add full or subset of URI Template (RFC6570) based route matching. The URI Template is already used in OpenAPI spec and gRPC JSON transcoding binding.
It will reduce the use case for regex as I see many use of regex is to support this.
The text was updated successfully, but these errors were encountered: