Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalable route configuration #6602

Open
htuch opened this issue Apr 16, 2019 · 20 comments
Open

Scalable route configuration #6602

htuch opened this issue Apr 16, 2019 · 20 comments
Assignees
Labels
area/http design proposal Needs design doc/proposal before implementation help wanted Needs help!

Comments

@htuch
Copy link
Member

htuch commented Apr 16, 2019

The existing Envoy RouteConfiguration must be evaluated linearly and contains match criteria with hard to predict execution time (regular expressions). For scale-up Envoy instances where very large route configurations for a host may be present (let's say O(1m)) or multi-tenant Envoy instances, this becomes problematic.

There are many ways this could be solved:

  1. Envoy could optimize based on presented configuration, for example a trie could be recovered from repeated patterns or a specific match order. This is challenging if the existing linear semantics are to be preserved and involves Envoy complexity.
  2. We could have variants of regular expressions, e.g. prefix/suffix/glob or RE2 that are cheaper and more predictable to evaluate.
  3. We could have a parallel RouteConfiguration with trie-like semantics, where evaluation order is designed for efficiency.

Some combination of (2) and (3) seems likely to yield a clean solution IMHO, but I'm keen to hear from others.

@jmarantz @mattklein123 @dmitri-d @yanavlasov

@htuch htuch added the design proposal Needs design doc/proposal before implementation label Apr 16, 2019
@mattklein123
Copy link
Member

I've thought about this for a while and IMO we need to do (3) which is to have a parallel configuration which supports a much more limited set of matching options designed for sub-linear execution. Optimally, the configuration would allow embedding a "legacy" route configuration inside of it so that once a sub-linear match is done, the user can further do a linear match if desired for extra features.

@dmitri-d
Copy link
Contributor

dmitri-d commented Apr 18, 2019

I think a combination of (2) and (3) would work. Not sure re: full-on regex implementation in (2), but some form of simple pattern (a subset of bash pattern-matching perhaps?)/prefix and/or suffix matching might be sufficient for our purposes.

@jmarantz
Copy link
Contributor

I've been messing around with RE2 vs std::regex benchmarking on my Mac on the plane, and there's a pretty significant speedup. This includes dramatic improvements executing regular expressions that are thought of as pathologically slow. I will publish a bit later.

However, I think sublinear is the only way to scale, one way or another we need to get to at trie.

I was also thinking it might be possible in some cases to compile a restricted subset of existing Envoy config into a trie, but I haven't sorted out the details. There's some value in being able to simply make the existing configurations run faster. This is because there are layers (e.g. Istio) that already mirror the existing Envoy configs.

Going with (3) would be easier though (less of a science project).

@htuch
Copy link
Member Author

htuch commented Apr 18, 2019

@louiscryan

@rnd4222
Copy link

rnd4222 commented May 6, 2019

Just for the record: Intel Hyperscan is specifically designed to match strings against a set of regular expressions; and (I believe) it uses vector instructions, so constants should be good too.

@cmluciano
Copy link
Member

@htuch @jmarantz Would you consider this a blocker before implementing nginx style rewrite rules via rds.RedirectAction ?

The final comment on a PR addressing 2092 was closed with the author suggesting that a prefix trie should probably be preferred over std::regex .

@mattklein123
Copy link
Member

@cmluciano no, I think that feature can be independently added.

@htuch
Copy link
Member Author

htuch commented Aug 27, 2019

FWIW, I have an initial attempt at a design for this at https://docs.google.com/document/d/1orxTIL9FXtgmyl5TtPRBqGjgp1ekL7oOKDd95wxeCRY/edit#.

@rojkov
Copy link
Member

rojkov commented Jan 14, 2021

Re. implementation of regex matching... I think it's possible to speed up even a linear list of matchers if to scan an input string against all the regexes used in the list at once. Here we loose the ability to craft manually an optimized list by scanning the regexes lazily, but enable HW optimizations.

RE2 has got the RE2::Set interface that can be used to build a single NFA out of many regexes. Then one scan is enough to mark all the predicates in the list either true or false. There is a nuance though: RE2 builds DFA lazily for explored states at scan time and caches the result to reuse for future scans. This make memory consumption unpredictable and may trigger error thresholds.

Hyperscan is designed to perform matches against many patterns: it builds a single NFA, decomposes to subcomponents, reduces redundancies, translates the subcomponents to various implementations of DFA, NFA and literal matchers depending on what suites better. The literal matchers are SIMD accelerated. DFA/NFA subcomponents are SIMD accelerated when possible. The problem is this compilation takes more time than in case of RE2. But memory consumption is less and guaranteed to be constant at scan time. Also for SIMD acceleration at least SSSE3 is required.

@htuch
Copy link
Member Author

htuch commented Jan 15, 2021

I think with MatcherTrees, it's possible to have a map from regex to next nodes. In that situation, you could build a single NFA, subject to the limits you describe. CC @snowp

@manlinl
Copy link

manlinl commented Nov 24, 2021

Hi Envoy team, what is the current status of MatcherTree feature and any timeline for GA?

A little background, we have 1k+ microservices and plan to build a global service mesh. the O(N) routing match algorithm isn't scalabe.

@manlinl
Copy link

manlinl commented Nov 30, 2021

Just another quick question, Does scoped routing configuration match key to route with O(1) or O(n) algorithm? Thanks!

@rgs1
Copy link
Member

rgs1 commented Nov 30, 2021

Hi Envoy team, what is the current status of MatcherTree feature and any timeline for GA?

A little background, we have 1k+ microservices and plan to build a global service mesh. the O(N) routing match algorithm isn't scalabe.

FWIW, in our Thrift mesh we worked around this via #8994. cc: @fishcakez

@htuch
Copy link
Member Author

htuch commented Nov 24, 2022

@snowp @mattklein123 should we consider this work closed by the generic matchers?

@euroelessar
Copy link
Contributor

@snowp @mattklein123 should we consider this work closed by the generic matchers?

Does it allow making sub-linear route matching for action selection? (e.g. we are interested in an ability to select cluster & add some per-route metadata for stats/access logs)

@euroelessar
Copy link
Contributor

Actually, scratch my comment, I've just noticed that envoy has generic matchers support for routing, which covers my use case.
Should marking it as stable be considered as a pre-requisite for closing the issue though?

@mattklein123
Copy link
Member

Actually, scratch my comment, I've just noticed that envoy has generic matchers support for routing, which covers my use case.
Should marking it as stable be considered as a pre-requisite for closing the issue though?

Yeah we should just go ahead and mark this stable IMO. It's been out for long enough. cc @snowp @kyessenov

@euroelessar
Copy link
Contributor

Looking more into the implementation, it looks like current matcher api in route config has no support for runtime fractions (as well as other existing matcher flavors). It's somehow complicated by the fact that it does not support embedding existing routing config (or a list of routes) as a final leaf as well.

Should a new issue (issues) be created to track it?

@mattklein123
Copy link
Member

It's somehow complicated by the fact that it does not support embedding existing routing config (or a list of routes) as a final leaf as well.

I don't know where we ended up, but I know that the original intention was for the original route implementation to be usable as a leaf node. I thought this was already implemented but maybe not. cc @snowp @kyessenov

@sanjaypujare
Copy link
Contributor

This bug is still open. Does it mean there are plans to address it? Specifically for the following mentioned in the first comment.

  1. We could have a parallel RouteConfiguration with trie-like semantics, where evaluation order is designed for efficiency.

This is something we would be really interested in (unless we can achieve this through the generic matcher API?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/http design proposal Needs design doc/proposal before implementation help wanted Needs help!
Projects
None yet
Development

No branches or pull requests