As more users embed TraceQL queries in dashboards, it has become apparent that we need to improve our caching to handle repeated queries with slightly adjusted time ranges (e.g. auto-refreshing dashboards). Currently we only cache parquet footers and bloom filters.
Let's add a cache in the query-frontend at the individual "job" level. After a query is broken into a stream of jobs, we will cache based on the individual job URL, which takes the query, block ID, row groups, etc. into account. For a given job the results are immutable because the blocks don't change, so if we have previously executed a job we can expect its results to be the same.
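A minimal sketch of the keying idea, in Go: hash the full job URL into a stable cache key and consult the cache before dispatching to a querier. The URL shape and function names here are hypothetical, not Tempo's actual internals.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a stable cache key from a search-job URL.
// Because the URL encodes the query, block ID, row groups, etc.,
// identical jobs hash to identical keys.
func cacheKey(jobURL string) string {
	sum := sha256.Sum256([]byte(jobURL))
	return "job:" + hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical job URL; the real shape is internal to Tempo.
	fmt.Println(cacheKey("/querier?blockID=abc123&q=%7B%7D&startPage=0&pagesToSearch=10"))
}
```

Since job results are immutable for a given block, a hit on this key can be returned without re-executing the job.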
Caveats:
- We can only rely on the cache if the query's start/end time range completely encapsulates the block; use the block metadata to determine this. If the start/end only partially overlaps the block, we have to issue the job to the queriers because the cached results can't be trusted.
- The start/end time range needs to be stripped from the URL before hashing for the cache key. This way, as a dashboard slowly moves across a time range, we will generally pull from cache for most blocks and only issue requests to the queriers for blocks on the edges of the time range and for new blocks created by compactors.
This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.