Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(router) regex path matching not considered with exact path match #4153

Closed
wants to merge 1 commit into from

Conversation

bungle
Copy link
Member

@bungle bungle commented Jan 4, 2019

Summary

This is a sister-PR for #4152, and makes similar adjustments to path matching.

The added test case shows the problem.

@bungle bungle mentioned this pull request Jan 4, 2019
@bungle bungle changed the title fix(router) match path prefixes prior path regexes fix(router) regex matching not considered with exact host match Jan 4, 2019
@bungle bungle changed the title fix(router) regex matching not considered with exact host match fix(router) regex path matching not considered with exact path match Jan 4, 2019
assert.equal(routes[2].id, res.headers["kong-route-id"])
assert.equal(routes[2].service.id, res.headers["kong-service-id"])
assert.equal(routes[2].service.name, res.headers["kong-service-name"])
end)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is a duplicate of this unit test. Btw, it is preferred that router tests be added to the unit test suite pointed above, instead here. This test suite, while it may have some overlap with the above unit suite, relates more to integration of the router within Kong at runtime.

@@ -1007,31 +1009,33 @@ function _M.new(routes)

-- uri match

if plain_indexes.uris[req_uri] then
req_category = bor(req_category, MATCH_RULES.URI)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change could be highly detrimental in many cases.

@@ -709,7 +710,8 @@ do
end,

[MATCH_RULES.URI] = function(category, ctx)
return category.routes_by_uris[ctx.hits.uri or ctx.req_uri]
return category.routes_by_uris[ctx.req_uri] or
category.routes_by_uris[ctx.hits.uri]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change can be detrimental in many cases (that is, instead of taking the hint into consideration, and prioritizing the evaluation of Routes with the hinted path, we now overlook the hint and perform the lookup based on the request path. The hint serves no purpose anymore.

@@ -522,7 +522,8 @@ do

[MATCH_RULES.URI] = function(route_t, ctx)
do
local uri_t = route_t.uris[ctx.hits.uri or ctx.req_uri]
local uri_t = route_t.uris[ctx.req_uri] or
route_t.uris[ctx.hits.uri]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto: while the impact is lesser, we still disregard the hint (which serves no purpose anymore) in favor of the request path.

assert.equal(routes[3].id, res.headers["kong-route-id"])
assert.equal(routes[3].service.id, res.headers["kong-service-id"])
assert.equal(routes[3].service.name, res.headers["kong-service-name"])
end)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reference: after testing this on the master branch and seeing it pass, it has become evident to me that this test is to assert we preserve the behavior broken by the removal at line 1011 of the router. Maybe this test could be added to the unit test suite pointed above, regardless of the future of this PR?

remove_routes(routes)
end)

it("matches regex path even when there is exact and prefix match", function()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can see that the second Route in this test's fixture data set defines a host, so I don't think this is a valid test. In fact, the previous test case already establishes that an exact match is favored vs a regex match. This test should probably be about prefix match vs regex matches.

@thibaultcha
Copy link
Member

It seems to me like we are better off renaming the Route's priority attribute and let the user decide for themselves where regex routes have a priority. Not only is it more intuitive, but also avoids the performance degradations noted here, it has also made (almost) consensus in #4121.

@thibaultcha thibaultcha added pr/do not merge pr/discussion This PR is being debated. Probably just a few details. and removed pr/please review labels Jan 8, 2019
@bungle
Copy link
Member Author

bungle commented Jan 9, 2019

@thibaultcha this is not about priority, but about that router skips routes when there are multiple matching rules.

@bungle
Copy link
Member Author

bungle commented Jan 9, 2019

What I mean that the current router matches route with rule x, then it thinks this is a route to only after further checks finds out that not it doesn't match with rule y of the same route, and then it is 404. Even when there is another route that matches the request.

@thibaultcha
Copy link
Member

Replaced by #4775

@thibaultcha thibaultcha closed this Jul 3, 2019
@thibaultcha thibaultcha deleted the fix/router-uri-matching branch July 3, 2019 21:43
thibaultcha added a commit that referenced this pull request Jul 3, 2019
This is the third commit of a series of fixes to the router.

Background
----------

As established a couple of commits ago, the rules for routing priorities
are:

1. Routes with more matching attributes take precedence over Routes with
   fewer matching attributes
2. Plain hosts take precedence over wildcard hosts
3. Regex path always takes precedence over plain paths
4. Longer plain paths take precedence over shorter plain paths

**Note:** as of today, our documentation is not reflecting the router's
behavior. As of July 3rd, 2019, the proxy guide
(https://docs.konghq.com/1.2.x/proxy/#evaluation-order) states that
plain paths are evaluated before regex paths, and implies that the
former have priority over the latter, which is not correct as per the
router behavior, and asserted as per a series of established test cases.

As per the above rules, consider the following simplified scenario:

    r1:
        paths = { "/pat" }

    r2:
        paths = { "/(path)" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- matched_route == r2

Issue
-----

Now, consider the following scenario:

    r1:
        paths = { "/pat" }
        hosts = { "route.com" }

    r2:
        paths = { "/(path)" }
        hosts = { "route.com" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- /!\ matched_route should still be r2, but prior to this patch, was r1

The above case highlights a flaw in the router: that categorized Routes
are not sorted as per rules 2., 3., and 4. (aforementioned). This would
not be an issue if regex paths and wildcard hosts were matching
categories of their own, but for historical reasons, they are not, at
least at the moment.

Note: said reasons include the lack of dynamic categories lists support in the
initial router implementation, something that was fixed in
1ba8ab2 prior to the introduction of
SNI/src/dst routing. This is outside the scope of this patch.

Therefore, this patch ensures that all categorized Routes are thus
sorted appropriately relative to each other within their respective
categories.

Note
----

It is worth noting that this patch can be breaking in its nature, given
that the router behavior highlighted by the new test will differ before
and after the patch.

This patch replaces the one in #4153,
which is more expensive and relies on indexes to fix the issue, instead
of fixing the ordering. It was also lacking unit tests, which are more
appropriate (and mandatory) for router changes.

Replaces #4153
thibaultcha added a commit that referenced this pull request Jul 15, 2019
This is the third commit of a series of fixes to the router.

Background
----------

As established a couple of commits ago, the rules for routing priorities
are:

1. Routes with more matching attributes take precedence over Routes with
   fewer matching attributes
2. Plain hosts take precedence over wildcard hosts
3. Regex path always takes precedence over plain paths
4. Longer plain paths take precedence over shorter plain paths

**Note:** as of today, our documentation is not reflecting the router's
behavior. As of July 3rd, 2019, the proxy guide
(https://docs.konghq.com/1.2.x/proxy/#evaluation-order) states that
plain paths are evaluated before regex paths, and implies that the
former have priority over the latter, which is not correct as per the
router behavior, and asserted as per a series of established test cases.

As per the above rules, consider the following simplified scenario:

    r1:
        paths = { "/pat" }

    r2:
        paths = { "/(path)" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- matched_route == r2

Issue
-----

Now, consider the following scenario:

    r1:
        paths = { "/pat" }
        hosts = { "route.com" }

    r2:
        paths = { "/(path)" }
        hosts = { "route.com" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- /!\ matched_route should still be r2, but prior to this patch, was r1

The above case highlights a flaw in the router: that categorized Routes
are not sorted as per rules 2., 3., and 4. (aforementioned). This would
not be an issue if regex paths and wildcard hosts were matching
categories of their own, but for historical reasons, they are not, at
least at the moment.

Note: said reasons include the lack of dynamic categories lists support in the
initial router implementation, something that was fixed in
1ba8ab2 prior to the introduction of
SNI/src/dst routing. This is outside the scope of this patch.

Therefore, this patch ensures that all categorized Routes are thus
sorted appropriately relative to each other within their respective
categories.

Note
----

It is worth noting that this patch can be breaking in its nature, given
that the router behavior highlighted by the new test will differ before
and after the patch.

This patch replaces the one in #4153,
which is more expensive and relies on indexes to fix the issue, instead
of fixing the ordering. It was also lacking unit tests, which are more
appropriate (and mandatory) for router changes.

Replaces #4153
thibaultcha added a commit that referenced this pull request Jul 15, 2019
This is the third commit of a series of fixes to the router.

Background
----------

As established a couple of commits ago, the rules for routing priorities
are:

1. Routes with more matching attributes take precedence over Routes with
   fewer matching attributes
2. Plain hosts take precedence over wildcard hosts
3. Regex path always takes precedence over plain paths
4. Longer plain paths take precedence over shorter plain paths

**Note:** as of today, our documentation is not reflecting the router's
behavior. As of July 3rd, 2019, the proxy guide
(https://docs.konghq.com/1.2.x/proxy/#evaluation-order) states that
plain paths are evaluated before regex paths, and implies that the
former have priority over the latter, which is not correct as per the
router behavior, and asserted as per a series of established test cases.

As per the above rules, consider the following simplified scenario:

    r1:
        paths = { "/pat" }

    r2:
        paths = { "/(path)" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- matched_route == r2

Issue
-----

Now, consider the following scenario:

    r1:
        paths = { "/pat" }
        hosts = { "route.com" }

    r2:
        paths = { "/(path)" }
        hosts = { "route.com" }

    local matched_route = router.select("GET", "/path", "route.com")
    -- /!\ matched_route should still be r2, but prior to this patch, was r1

The above case highlights a flaw in the router: that categorized Routes
are not sorted as per rules 2., 3., and 4. (aforementioned). This would
not be an issue if regex paths and wildcard hosts were matching
categories of their own, but for historical reasons, they are not, at
least at the moment.

Note: said reasons include the lack of dynamic categories lists support in the
initial router implementation, something that was fixed in
1ba8ab2 prior to the introduction of
SNI/src/dst routing. This is outside the scope of this patch.

Therefore, this patch ensures that all categorized Routes are thus
sorted appropriately relative to each other within their respective
categories.

Note
----

It is worth noting that this patch can be breaking in its nature, given
that the router behavior highlighted by the new test will differ before
and after the patch.

This patch replaces the one in #4153,
which is more expensive and relies on indexes to fix the issue, instead
of fixing the ordering. It was also lacking unit tests, which are more
appropriate (and mandatory) for router changes.

Replaces #4153
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/discussion This PR is being debated. Probably just a few details.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants