
local rate limit: add new rate_limits support to the filter #36099

Merged
merged 25 commits into envoyproxy:main from dev-local-rate-limit-api on Oct 5, 2024

Conversation

@wbpcode (Member) commented Sep 12, 2024

Commit Message: local rate limit: add new rate_limits api to the filter's api
Additional Description:

In the previous local rate limit filter, the rate_limits field of the route is used to generate the descriptor entries. The generated entries are then used to match a token bucket configured in the filter configs (route level, vhost level, etc.).

However, this makes the configuration very complex and cannot cover some common scenarios easily. For example, consider a specific virtual host X and a particular route Y under that virtual host.

We want to provide a virtual-host-level rate limit for virtual host X and a route-level rate limit for route Y, and the virtual host configuration should apply to all routes except Y.

For most filters, this requirement can be achieved by getting the most specific filter config and applying it. But for the local rate limit filter, things become very complex, because the rate limit configuration is split between the rate_limits field of the route and the filter config. The local rate limit filter has to handle this relationship carefully.

This PR tries to simplify this.
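
To make the split concrete, here is a simplified, self-contained sketch of the mechanism described above; the types and the matchBucket helper are illustrative stand-ins, not Envoy's actual classes.

// Illustrative sketch only: entries generated from the route's rate_limits are
// matched against token buckets declared in the local rate limit filter config.
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// One generated descriptor, e.g. {{"generic_key", "premium"}}.
using Descriptor = std::vector<std::pair<std::string, std::string>>;

struct TokenBucket {
  uint32_t max_tokens = 0;
  uint32_t tokens_per_fill = 0;
  uint32_t fill_interval_seconds = 0;
};

// The filter config's per-descriptor buckets (the 'descriptors' list).
using DescriptorBuckets = std::map<Descriptor, TokenBucket>;

// The route's rate_limits produce 'generated' for a request; the filter then
// looks it up in its own bucket table. Splitting these two halves across the
// route config and the filter config is what makes per-vhost/per-route
// overrides awkward to express.
std::optional<TokenBucket> matchBucket(const DescriptorBuckets& buckets,
                                       const Descriptor& generated) {
  const auto it = buckets.find(generated);
  if (it == buckets.end()) {
    return std::nullopt;  // No per-descriptor limit; the default bucket applies.
  }
  return it->second;
}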

Risk Level: low.
Testing: n/a.
Docs Changes: n/a.
Release Notes: n/a.
Platform Specific Features: n/a.

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @adisuissa
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).


@wbpcode (Member, Author) commented Sep 12, 2024

cc @tyxia I think you did a similar thing for the remote rate limit filter in #18044. But you put that API under extensions/filters/http/ratelimit, so I cannot reuse it. And it still doesn't have an implementation.

@wbpcode (Member, Author) commented Sep 13, 2024

/retest

@kyessenov (Contributor) left a comment

Thanks for doing this. We had an issue somewhere about de-coupling rate limit from core config -- this is the right direction IMO.

@wbpcode (Member, Author) commented Sep 14, 2024

/retest

@tyxia (Member) left a comment

LGTM, Thanks @wbpcode

Should http/ratelimit also just use the one you added in common/ratelimit in the future?

@wbpcode (Member, Author) commented Sep 14, 2024

> LGTM, Thanks @wbpcode
>
> Should http/ratelimit also just use the one you added in common/ratelimit in the future?

I think it's okay. But note that the current API of http/ratelimit was added a long time ago; according to our API policy, we can't change it.

But I think we can treat it as a special case: it has never been implemented by Envoy, and I think no third-party xDS clients will use it.

@adisuissa (Contributor) left a comment

Thanks for tackling this!
I agree that making the API easier to use is something we should strive for.
I do wonder whether this should be a different proto. WDYT?

@wbpcode (Member, Author) commented Sep 16, 2024

> Thanks for tackling this! I agree that making the API easier to use is something we should strive for. I do wonder whether this should be a different proto. WDYT?

Do you mean whether we should reuse the RateLimit from route.v3? I initially thought about it, but there are some fields that will never be supported in the local rate limit. So I finally created a new proto message, and will gradually deprecate the previous RateLimit support in the local rate limit extension.

@adisuissa (Contributor)

> Do you mean whether we should reuse the RateLimit from route.v3? I initially thought about it, but there are some fields that will never be supported in the local rate limit. So I finally created a new proto message, and will gradually deprecate the previous RateLimit support in the local rate limit extension.

What I'm trying to understand is whether this should be added to the current local rate limit proto, as you are currently suggesting, or whether it should be a new local rate limit proto.
This depends on how well the internal fields of the new RateLimitConfig work with the other fields in the LocalRateLimit config. If they don't align well together, I think it should be configuration in a new local rate limiter (a new proto), because sharing the config doesn't make sense in that case.

@wbpcode (Member, Author) commented Sep 16, 2024

> Do you mean whether we should reuse the RateLimit from route.v3? I initially thought about it, but there are some fields that will never be supported in the local rate limit. So I finally created a new proto message, and will gradually deprecate the previous RateLimit support in the local rate limit extension.
>
> What I'm trying to understand is whether this should be added to the current local rate limit proto, as you are currently suggesting, or whether it should be a new local rate limit proto.
> This depends on how well the internal fields of the new RateLimitConfig work with the other fields in the LocalRateLimit config. If they don't align well together, I think it should be configuration in a new local rate limiter (a new proto), because sharing the config doesn't make sense in that case.

Oh, then I would prefer to place it in extensions/common as a new proto, because the config itself is not a local-rate-limit-specific thing. It could be shared by other rate limit modules in the future, as @tyxia's comment suggests.

@wbpcode (Member, Author) commented Sep 19, 2024

friendly ping @adisuissa

@wbpcode (Member, Author) commented Sep 21, 2024

I found I misunderstood @adisuissa again :( sorry. The new rate_limits is used to generate the descriptor list and will replace the rate_limits (https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-virtualhost-rate-limits) in the route and vhost.

This way, the local rate limit filter can work without a dependency on the core route configuration, and it resolves the problems described in the PR description.

I have completed the implementation, so you can take a quick look at it to make sure it works as expected.

@wbpcode changed the title from "local rate limit: add new rate_limits api to the filter's api" to "local rate limit: add new rate_limits support to the filter" on Sep 21, 2024
@wbpcode (Member, Author) commented Sep 25, 2024

Hi @mattklein123, @adisuissa and I have basically reached agreement on the API. Could you take another look at both the API and the implementation when you get some free time? Thanks.

@wbpcode (Member, Author) commented Sep 26, 2024

/retest

@wbpcode (Member, Author) commented Oct 1, 2024

Friendly ping @mattklein123 :)

@mattklein123 (Member) left a comment

LGTM at a high level. A few small comments.

/wait

Comment on lines 16 to 25
if (config.has_stage() || !config.disable_key().empty()) {
ENVOY_LOG(warn, "'stage' field and 'disable_key' field are not supported");
}

if (config.has_limit()) {
if (no_limit) {
ENVOY_LOG(warn, "'limit' field is only supported in filter that calls remote rate "
"limit service.");
}
}
Member

Should these return errors that lead to config rejection?

@wbpcode (Member, Author) commented Oct 2, 2024

Configuration rejection actually brings some risks in a production environment.

I agree we should reject the configuration if there is a more serious problem, like an unknown factory for an extension rate limit policy.

But for these specific fields, I'm inclined to emit a warning log, because these fields won't affect the feature.

Feel free to let me know if you think this is a blocking point.

Member

People generally don't notice warnings. For this case I would make it a failure. People should be checking for config failures.

Member Author

Got it.
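
For illustration, a minimal sketch of the config rejection agreed on here, assuming a simple exception-based validation path; the struct and function names below are hypothetical, not the filter's actual ones.

#include <stdexcept>
#include <string>

// Hypothetical stand-in for the parsed rate limit entry being validated.
struct RateLimitEntryConfig {
  bool has_stage = false;
  bool has_limit = false;
  std::string disable_key;
};

// Fail config load instead of logging a warning, so misconfiguration is
// surfaced when the config is pushed rather than buried in the logs.
void validateEntry(const RateLimitEntryConfig& config, bool local_only) {
  if (config.has_stage || !config.disable_key.empty()) {
    throw std::runtime_error(
        "'stage' and 'disable_key' are not supported by the local rate limit filter");
  }
  if (config.has_limit && local_only) {
    throw std::runtime_error(
        "'limit' is only supported by filters that call a remote rate limit service");
  }
}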

Comment on lines 79 to 87
if (rate_limit_config_->empty()) {
if (!config.descriptors().empty()) {
ENVOY_LOG_FIRST_N(
warn, 20,
"'descriptors' are set for local rate limit filter but no 'rate_limits' "
"are configured in the filter config. Please use the 'rate_limits' field "
"in filter config to instead of the 'rate_limits' field in the route config.");
}
}
Member

Shouldn't this cause a config load failure?

@wbpcode (Member, Author) commented Oct 2, 2024

Nope. This warning tells users to switch their rate_limits from the route to the embedded rate_limits of the filter config.

But considering backward compatibility, until we remove support for the route rate_limits from the filter completely, we shouldn't reject configurations that lack the embedded rate_limits.

Member

Then sorry, I don't understand this warning. Can you make it clearer?

Member Author

Let me try to make it clearer.

@wbpcode (Member, Author) commented Oct 3, 2024

I have updated the log message. Hope it makes sense to you. The core points are:

  1. descriptors only makes sense when rate_limits (in the filter config or the route) is configured.
  2. If the rate_limits in the filter config is not set, it may be set in the route. Please give the rate_limits in the filter config priority, because we may remove support for the route config's rate_limits from the local rate limit filter in the future.

@wbpcode (Member, Author) commented Oct 3, 2024

/retest

@wbpcode (Member, Author) commented Oct 3, 2024

/retest

Comment on lines 79 to 89
if (rate_limit_config_->empty()) {
if (!config.descriptors().empty()) {
ENVOY_LOG_FIRST_N(
warn, 20,
"'descriptors' is set but no valid 'rate_limits' in LocalRateLimit config. Note that "
"'descriptors' only makes sense when 'rate_limits' in LocalRateLimit config or route "
"config is specified. And please take the 'rate_limits' in the LocalRateLimit config as "
"priority because the 'rate_limits' in the route config may be ignored by the filter in "
"the future.");
}
}
Member

Sorry for being dense but I still don't understand why this has to be a runtime failure? This is a new feature, right? If the user configured it incorrectly can we fail to load the config?

/wait-any

@wbpcode (Member, Author) commented Oct 5, 2024

For backward compatibility.

Note that in the previous local rate limit filter, the filter's descriptors field declares the rate limit config for a specific descriptor, like 10 requests per second for descriptor (key1, value1), etc.

And the rate_limits in the route configuration is used to generate the specific descriptor (key, value) for a request, and that descriptor is then used to match a config in the descriptors field. (The naming is a little confusing for historical reasons.)

The previous implementation has some drawbacks in actual practice (see the PR description), so I added the new rate_limits field to the filter config directly as an alternative that resolves those drawbacks.

But we cannot reject a filter config that doesn't contain the new rate_limits, because users may have configured rate_limits in the route.
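
As a rough sketch of the backward-compatible selection described here (the names are illustrative, not the actual implementation): prefer the filter's own embedded rate_limits when present and fall back to the route's rate_limits otherwise, which is why a missing embedded rate_limits cannot be a load failure yet.

#include <vector>

// Hypothetical placeholder for one parsed rate_limits entry.
struct RateLimitPolicy {};

struct ConfiguredRateLimits {
  std::vector<RateLimitPolicy> filter_level;  // new embedded 'rate_limits'.
  std::vector<RateLimitPolicy> route_level;   // legacy route/vhost 'rate_limits'.
};

// Rejecting configs without the new field would break users who still rely on
// the route-level rate_limits, hence a warning instead of a load failure.
const std::vector<RateLimitPolicy>&
effectivePolicies(const ConfiguredRateLimits& limits) {
  return limits.filter_level.empty() ? limits.route_level : limits.filter_level;
}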

@wbpcode (Member, Author) commented Oct 5, 2024

/retest

@repokitteh-read-only (bot) removed the api label on Oct 5, 2024
@wbpcode merged commit e486663 into envoyproxy:main on Oct 5, 2024
22 checks passed
@wbpcode deleted the dev-local-rate-limit-api branch on October 5, 2024 at 23:27
Stevenjin8 pushed a commit to Stevenjin8/envoy that referenced this pull request Oct 10, 2024