-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scalable route configuration #6602
Comments
I've thought about this for a while and IMO we need to do (3) which is to have a parallel configuration which supports a much more limited set of matching options designed for sub-linear execution. Optimally, the configuration would allow embedding a "legacy" route configuration inside of it so that once a sub-linear match is done, the user can further do a linear match if desired for extra features. |
I think a combination of (2) and (3) would work. Not sure re: full-on regex implementation in (2), but some form of simple pattern (a subset of bash pattern-matching perhaps?)/prefix and/or suffix matching might be sufficient for our purposes. |
I've been messing around with RE2 vs std::regex benchmarking on my Mac on the plane, and there's a pretty significant speedup. This includes dramatic improvements executing regular expressions that are thought of as pathologically slow. I will publish a bit later. However, I think sublinear is the only way to scale, one way or another we need to get to at trie. I was also thinking it might be possible in some cases to compile a restricted subset of existing Envoy config into a trie, but I haven't sorted out the details. There's some value in being able to simply make the existing configurations run faster. This is because there are layers (e.g. Istio) that already mirror the existing Envoy configs. Going with (3) would be easier though (less of a science project). |
Just for the record: Intel Hyperscan is specifically designed to match strings against a set of regular expressions; and (I believe) it uses vector instructions, so constants should be good too. |
@htuch @jmarantz Would you consider this a blocker before implementing nginx style rewrite rules via rds.RedirectAction ? The final comment on a PR addressing 2092 was closed with the author suggesting that a prefix trie should probably be preferred over std::regex . |
@cmluciano no, I think that feature can be independently added. |
FWIW, I have an initial attempt at a design for this at https://docs.google.com/document/d/1orxTIL9FXtgmyl5TtPRBqGjgp1ekL7oOKDd95wxeCRY/edit#. |
Re. implementation of regex matching... I think it's possible to speed up even a linear list of matchers if to scan an input string against all the regexes used in the list at once. Here we loose the ability to craft manually an optimized list by scanning the regexes lazily, but enable HW optimizations. RE2 has got the Hyperscan is designed to perform matches against many patterns: it builds a single NFA, decomposes to subcomponents, reduces redundancies, translates the subcomponents to various implementations of DFA, NFA and literal matchers depending on what suites better. The literal matchers are SIMD accelerated. DFA/NFA subcomponents are SIMD accelerated when possible. The problem is this compilation takes more time than in case of RE2. But memory consumption is less and guaranteed to be constant at scan time. Also for SIMD acceleration at least SSSE3 is required. |
I think with MatcherTrees, it's possible to have a map from regex to next nodes. In that situation, you could build a single NFA, subject to the limits you describe. CC @snowp |
Hi Envoy team, what is the current status of MatcherTree feature and any timeline for GA? A little background, we have 1k+ microservices and plan to build a global service mesh. the O(N) routing match algorithm isn't scalabe. |
Just another quick question, Does scoped routing configuration match key to route with O(1) or O(n) algorithm? Thanks! |
FWIW, in our Thrift mesh we worked around this via #8994. cc: @fishcakez |
@snowp @mattklein123 should we consider this work closed by the generic matchers? |
Does it allow making sub-linear route matching for action selection? (e.g. we are interested in an ability to select cluster & add some per-route metadata for stats/access logs) |
Actually, scratch my comment, I've just noticed that envoy has generic matchers support for routing, which covers my use case. |
Yeah we should just go ahead and mark this stable IMO. It's been out for long enough. cc @snowp @kyessenov |
Looking more into the implementation, it looks like current matcher api in route config has no support for runtime fractions (as well as other existing matcher flavors). It's somehow complicated by the fact that it does not support embedding existing routing config (or a list of routes) as a final leaf as well. Should a new issue (issues) be created to track it? |
I don't know where we ended up, but I know that the original intention was for the original route implementation to be usable as a leaf node. I thought this was already implemented but maybe not. cc @snowp @kyessenov |
This bug is still open. Does it mean there are plans to address it? Specifically for the following mentioned in the first comment.
This is something we would be really interested in (unless we can achieve this through the generic matcher API?) |
The existing Envoy
RouteConfiguration
must be evaluated linearly and contains match criteria with hard to predict execution time (regular expressions). For scale-up Envoy instances where very large route configurations for a host may be present (let's say O(1m)) or multi-tenant Envoy instances, this becomes problematic.There are many ways this could be solved:
RouteConfiguration
with trie-like semantics, where evaluation order is designed for efficiency.Some combination of (2) and (3) seems likely to yield a clean solution IMHO, but I'm keen to hear from others.
@jmarantz @mattklein123 @dmitri-d @yanavlasov
The text was updated successfully, but these errors were encountered: