Store-gateway: high memory allocations caused by per-tenant Prometheus registry #102

Open
grafanabot opened this issue Aug 10, 2021 · 2 comments

Comments

@grafanabot
Contributor

Describe the bug
To use the Thanos BucketStore while supporting Cortex multi-tenancy, we need to create one BucketStore per tenant, pass a dedicated Prometheus registry to each one, and then aggregate the metrics from all registries.
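
The per-tenant pattern described above can be sketched roughly as follows. This is a simplified, standard-library-only model, not the Prometheus client API or the actual Cortex code: `tenantRegistry`, `gather`, `aggregate`, and the metric names are all hypothetical stand-ins. It only illustrates why each collection pass allocates in proportion to the number of tenants: every registry produces a fresh snapshot that is summed and then discarded.

```go
package main

import (
	"fmt"
	"sort"
)

// tenantRegistry is a hypothetical stand-in for a per-tenant metrics
// registry. It is NOT the Prometheus client_golang Registry type.
type tenantRegistry struct {
	counters map[string]float64
}

// gather returns a fresh snapshot of the registry. A new map is allocated
// on every call, which models the short-lived garbage created per tenant
// on each collection pass.
func (r *tenantRegistry) gather() map[string]float64 {
	snap := make(map[string]float64, len(r.counters))
	for name, v := range r.counters {
		snap[name] = v
	}
	return snap
}

// aggregate sums every metric across all tenant registries. The
// per-tenant snapshots become garbage as soon as they have been summed.
func aggregate(regs map[string]*tenantRegistry) map[string]float64 {
	total := make(map[string]float64)
	for _, reg := range regs {
		for name, v := range reg.gather() {
			total[name] += v
		}
	}
	return total
}

func main() {
	// Two tenants with made-up metric names and values.
	regs := map[string]*tenantRegistry{
		"tenant-a": {counters: map[string]float64{"blocks_loaded": 3, "series_fetched": 10}},
		"tenant-b": {counters: map[string]float64{"blocks_loaded": 5, "series_fetched": 7}},
	}

	total := aggregate(regs)

	// Print in a stable order, since Go map iteration order is random.
	names := make([]string, 0, len(total))
	for name := range total {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s %v\n", name, total[name])
	}
}
```

With thousands of tenants, the loop in `aggregate` runs once per tenant on every collection, so the amount of temporary memory grows with the tenant count even though none of it is retained.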

As a result, Prometheus metrics collection causes high memory allocations (on the order of 50 MB/s in a store-gateway with 7.5K tenants). The allocated memory is not retained, but it still puts pressure on the GC.

Screenshot 2021-01-07 at 11 35 27

In a cluster with low QPS, 95% of store-gateway memory allocations are caused by metrics collection.

Submitted by: pracucci
Cortex Issue Number: 3697

@grafanabot
Contributor Author

Enabling shuffle-sharding on the store-gateway significantly improves this.

Submitted by: pracucci

@pracucci
Collaborator

More data points from a store-gateway loading blocks from 13k tenants.

CPU

Screenshot 2021-08-11 at 17 05 17

Memory allocations (bytes)

Screenshot 2021-08-11 at 17 06 08
