Improve the performance and latency for the cluster limiter #3382

absolute8511 · 2024-04-23T12:21:46Z

Issue Description

Type: feature request

Describe what feature you want

By default, the cluster limiter has to call the token server for every request. This adds latency and sends
too many requests to the token server. To address this performance problem in cluster mode, I propose adding a local token cache.

Describe your initial design (if present)

To reduce the number of requests to the server, we add a background prefetch job that periodically checks the cached tokens and prefetches a batch when necessary. When a user request comes in, it first checks the local tokens.

Design summary:

Only small requests are served from the cache, which reduces the pressure on the token server.
We allow a request to exceed the cached tokens in some cases, and if prefetching keeps failing for a while, most requests should fall back to the local limiter.
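
The design above could be sketched roughly as follows. This is only an illustration of the idea, not the code in the linked pull request: the class `LocalTokenCache`, the callables `fetch_batch` and `local_limiter`, and every parameter name and default are hypothetical. In particular, "request more than cached" is interpreted here as a bounded overdraft that the next prefetch repays, which is an assumption.

```python
import threading


class LocalTokenCache:
    """Hypothetical client-side cache of tokens prefetched from a token server."""

    def __init__(self, fetch_batch, local_limiter, batch_size=100, low_watermark=20,
                 max_cached_acquire=5, max_overdraft=0, max_failures=3):
        self._fetch_batch = fetch_batch        # callable(n) -> tokens granted; raises on failure
        self._local_limiter = local_limiter    # callable(n) -> bool; fallback limiter
        self._batch_size = batch_size
        self._low_watermark = low_watermark
        self._max_cached_acquire = max_cached_acquire  # larger requests bypass the cache
        self._max_overdraft = max_overdraft    # assumed meaning of "more than cached"
        self._max_failures = max_failures      # consecutive failures before falling back
        self._tokens = 0
        self._failures = 0
        self._lock = threading.Lock()

    def prefetch_once(self):
        """One pass of the background job: refill the cache when it runs low."""
        with self._lock:
            if self._tokens >= self._low_watermark:
                return
            want = self._batch_size - self._tokens  # also repays any overdraft
        try:
            granted = self._fetch_batch(want)
        except Exception:
            with self._lock:
                self._failures += 1
            return
        with self._lock:
            self._tokens += granted
            self._failures = 0

    def try_acquire(self, n=1):
        """Return (allowed, source); source is 'cache', 'server', or 'local'."""
        with self._lock:
            degraded = self._failures >= self._max_failures
            if not degraded and n <= self._max_cached_acquire:
                # Small request: serve it locally, allowing a bounded overdraft.
                if self._tokens - n >= -self._max_overdraft:
                    self._tokens -= n
                    return True, "cache"
                return False, "cache"
        if degraded:
            # Prefetch has been failing for a while: use the local limiter.
            return self._local_limiter(n), "local"
        try:
            # Large request: ask the token server directly.
            return self._fetch_batch(n) >= n, "server"
        except Exception:
            return self._local_limiter(n), "local"


def start_prefetch_job(cache, interval_seconds, stop_event):
    """Run cache.prefetch_once() periodically until stop_event is set."""
    def loop():
        while not stop_event.wait(interval_seconds):
            cache.prefetch_once()
    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

With this shape, the request path only contacts the token server for large requests, while small requests cost one lock acquisition. The trade-off is that tokens held in the local cache may go unused, so the batch size and watermark would need tuning against the rule's threshold.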

Additional context

absolute8511 linked a pull request Apr 23, 2024 that will close this issue

feat: add local token cache for cluster #3381

Open
