
Improve Polling Performance at high block counts #2521

Closed
joe-elliott opened this issue May 31, 2023 · 14 comments · Fixed by #2652
@joe-elliott (Member) commented May 31, 2023

As Tempo approaches 1M+ blocks, the TCO and performance impact of polling become significant. Polling operates using the following steps:

Queriers and query-frontends:

  1. List all tenants
  2. Get <tenant>/index.json.gz for each tenant

Compactors:

  1. List all tenants
  2. If I am not the tenant index builder for this tenant
    • Get <tenant>/index.json.gz for the tenant
  3. If I am the tenant index builder for this tenant
    • List all blocks beneath the tenant: <tenant>/*
    • For each block get <tenant>/<guid>/meta.json. If this 404s get <tenant>/<guid>/meta.compacted.json
    • Use the above information to build a new tenant index
    • Put <tenant>/index.json.gz

The cost in time and API calls of getting all of the meta.jsons and meta.compacted.jsons adds up considerably. Given that meta.json and meta.compacted.json are immutable, we should not need to GET them individually; we need to find a way to reduce these calls.
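The linear GET cost is easy to see in a sketch. The following is a minimal, hypothetical model of the index build above; none of these names are Tempo's actual code. Each live block costs one GET, and each compacted block costs two (the 404 plus the fallback).

```go
package main

import "fmt"

// object simulates a backend object store keyed by path.
type object map[string][]byte

// getMeta simulates GET <tenant>/<guid>/meta.json with a fallback to
// meta.compacted.json when the first GET 404s — up to two calls per block.
func getMeta(store object, tenant, guid string, calls *int) []byte {
	*calls++
	if b, ok := store[tenant+"/"+guid+"/meta.json"]; ok {
		return b
	}
	*calls++
	return store[tenant+"/"+guid+"/meta.compacted.json"]
}

// buildIndex fetches every block's meta individually; the call count grows
// linearly with the number of blocks, which is the cost this issue targets.
func buildIndex(store object, tenant string, guids []string) int {
	calls := 0
	for _, g := range guids {
		getMeta(store, tenant, g, &calls)
	}
	return calls
}

func main() {
	store := object{
		"t1/a/meta.json":           []byte("{}"),
		"t1/b/meta.compacted.json": []byte("{}"),
	}
	fmt.Println(buildIndex(store, "t1", []string{"a", "b"})) // 3
}
```

At 1M+ blocks this becomes millions of GETs per polling cycle per index builder, which is where the time and API cost comes from.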

@joe-elliott (Member, issue author) commented:

One possible solution is to find a way, with each backend, to list the metas and compacted metas directly. If this is possible, then step 3 above would look like:

  1. If I am the tenant index builder for this tenant
    • List all meta.jsons for this tenant: <tenant>/*/meta.json
    • List all compacted meta.jsons for this tenant: <tenant>/*/meta.compacted.json
    • Compare the above lists to the current list in <tenant>/index.json.gz
    • Update and PUT the new <tenant>/index.json.gz based on the above lists. This will require GETing only new metas and compacted metas.
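The comparison step above boils down to a set difference. A minimal sketch, with hypothetical names, of selecting which metas still need a GET:

```go
package main

import "fmt"

// newIDs returns the IDs present in the fresh wildcard listing but absent
// from the current index — the only blocks whose metas still need a GET.
func newIDs(listed, indexed []string) []string {
	known := make(map[string]struct{}, len(indexed))
	for _, id := range indexed {
		known[id] = struct{}{}
	}
	var out []string
	for _, id := range listed {
		if _, ok := known[id]; !ok {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	listed := []string{"a", "b", "c"} // from the <tenant>/*/meta.json listing
	indexed := []string{"a"}          // from the current index.json.gz
	fmt.Println(newIDs(listed, indexed)) // only "b" and "c" need a GET
}
```

With two list calls (metas and compacted metas) plus this diff, the number of GETs drops from O(all blocks) to O(blocks that changed since the last index).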

Backend support:

  • Local storage can fall back to the old behavior. It does not need to support scaling to 1M+ blocks.
  • GCS has direct support
  • I believe the s3 list call already returns these objects, and we are filtering down to blocks using CommonPrefixes. This may simply require a change to our own list operation to add "wildcard" support.
  • Something similar may be possible with Azure; we seem to have similar functionality in its list call. If we have to fall back to the old behavior for Azure, I would be OK with that. This change is more critical for s3/gcs for us.

@mdisibio (Contributor) commented Jun 6, 2023

Another possible solution: when a compactor rebuilds an index, it already has the blocklist in memory from five minutes ago (the default polling interval), so we can eliminate work by not re-reading the metas for blocks it already has. Get the new list of block IDs (one backend List call) and compare it with the previous list to detect new and deleted blocks. When a block is compacted (meta.json -> meta.compacted.json), this wouldn't show up as a delta in the list of block IDs, but we can solve that by recording the obsoleted block IDs in the meta.json for the new block. Then, when the new block is detected and we read its meta, we know which old blocks are now compacted and can update accordingly.

Review of steps:

  1. On first poll, resort to the original behavior (full scan)
  2. On subsequent polls: list block IDs for the tenant
  3. Compare with the existing list in memory. A new entry means a new block; a missing entry means a deleted block.
  4. Read metas for new blocks.
  5. The meta of a new block contains a new field, compactedBlocks, listing the block IDs that were used to create it. Update those entries from "live" to "compacted".
  6. If an error occurs, fall back to the original behavior (full scan)

This approach doesn't require any new behavior from the backend and maintains current compatibility. There may be some more edge cases to think about, but overall it is promising.
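Steps 3 and 5 above can be sketched as follows. The blockMeta shape and the compactedBlocks field are the hypothetical ones proposed in this comment, not an existing Tempo structure:

```go
package main

import "fmt"

// blockMeta carries the proposed compactedBlocks field: the IDs of the
// source blocks that were merged to create this block.
type blockMeta struct {
	ID              string
	CompactedBlocks []string
}

// applyDelta implements step 5: when a new block's meta is read, mark its
// source blocks as compacted in the in-memory blocklist. The map value is
// true for live blocks and false for compacted ones.
func applyDelta(blocklist map[string]bool, m blockMeta) {
	blocklist[m.ID] = true
	for _, src := range m.CompactedBlocks {
		if _, ok := blocklist[src]; ok {
			blocklist[src] = false
		}
	}
}

func main() {
	blocklist := map[string]bool{"a": true, "b": true}
	applyDelta(blocklist, blockMeta{ID: "c", CompactedBlocks: []string{"a", "b"}})
	fmt.Println(blocklist["a"], blocklist["b"], blocklist["c"]) // false false true
}
```

The key property is that reading one new meta ("c") reclassifies its sources ("a", "b") without any extra backend calls.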

@zalegrala zalegrala self-assigned this Jun 29, 2023
@zalegrala (Contributor) commented Jun 30, 2023

I like the idea of reducing load on the API calls by reusing the data from the last poll. Currently the poller doesn't have the details of the last poll, but passing in the last blocklist to the poller seems like it would work to perform the comparison of new blocks to old blocks.

I'm not yet clear on where to make adjustments to include a new field compactedBlocks on the blockMeta; I'd like to see where in the code we might slot that in. In the encoding interfaces I see where new blocks are created, but I don't see where multiple blocks are combined into one during compaction, which is where we would learn the source block list. If I understand the suggestion correctly, during compaction we should be able to take the list of block IDs we are combining into one block and record it on the newly compacted block's meta. Then, when we list all the new blocks, we'll need to read the meta (and the compacted meta) to identify which blocks in the previous polling list can be removed from the blocklist we are about to write. Does that sound workable?

I'll continue to dig on the compaction for the combining logic to see where we might make a change.
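The recording side of the suggestion might look something like this minimal sketch (all names are hypothetical, not Tempo's compaction code): compaction stamps the source block IDs onto the new block's meta at the moment it merges them.

```go
package main

import "fmt"

// blockMeta carries the proposed compactedBlocks field.
type blockMeta struct {
	ID              string
	CompactedBlocks []string
}

// compact models merging several source blocks into one new block: the new
// meta records its sources so the next poll can reconcile the blocklist
// without re-reading every meta.
func compact(newID string, sources []blockMeta) blockMeta {
	out := blockMeta{ID: newID}
	for _, s := range sources {
		out.CompactedBlocks = append(out.CompactedBlocks, s.ID)
	}
	return out
}

func main() {
	m := compact("c", []blockMeta{{ID: "a"}, {ID: "b"}})
	fmt.Println(m.CompactedBlocks) // [a b]
}
```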

@joe-elliott (Member, issue author) commented:

> but passing in the last blocklist to the poller seems like it would work to perform the comparison of new blocks to old blocks.

The previous blocklist can't be trusted because any given meta.json may have been moved to compacted. Also, any meta.compacted.json may have been deleted. This is why I was looking for a way to get this information from the list operation instead of individual GETs.

> I'm not yet clear on where to make adjustments to include a new field compactedBlocks on the blockMeta.

I'd prefer looking at using the list operation first. It requires less state management and code changes if it works.

@zalegrala (Contributor) commented:

I think the modifications to the List() operation make sense, or perhaps a ListWildcard, though I would prefer to amend the existing interface. I think I'm good to proceed with the suggestions and would like to get some thoughts down today so I can hit the ground running next week. I think Azure might also support wildcard listing.

I wanted to explore Marty's idea on the issue, but after reviewing the Compact() methods in the encoders, I realize there is a lot of complexity in there, and changes would need to be made to each encoder in order to know which source IDs were used to create the new block.
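A sketch of amending the existing list operation with a suffix filter rather than adding a separate ListWildcard. The signature and the in-memory backend here are assumptions for illustration, not Tempo's actual RawReader:

```go
package main

import (
	"fmt"
	"strings"
)

// lister is a stand-in for the backend list interface.
type lister interface {
	// List returns object names under prefix; when suffix is non-empty,
	// only names ending in suffix (e.g. "meta.json") are returned, letting
	// backends with server-side filtering push the work down.
	List(prefix, suffix string) ([]string, error)
}

// memBackend is an in-memory fake that filters client-side, mirroring the
// fallback behavior a backend without wildcard support would use.
type memBackend struct{ objects []string }

func (m *memBackend) List(prefix, suffix string) ([]string, error) {
	var out []string
	for _, o := range m.objects {
		if strings.HasPrefix(o, prefix) && strings.HasSuffix(o, suffix) {
			out = append(out, o)
		}
	}
	return out, nil
}

func main() {
	var b lister = &memBackend{objects: []string{
		"t1/a/meta.json",
		"t1/b/meta.compacted.json",
		"t1/b/data",
	}}
	metas, _ := b.List("t1/", "meta.json")
	fmt.Println(metas) // [t1/a/meta.json]
}
```

Note that "meta.compacted.json" does not match the "meta.json" suffix, so the two index inputs stay cleanly separated into two list calls, as in the proposed step 3.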

@zalegrala (Contributor) commented Jul 14, 2023

I've got #2652 for my work in progress.

I've got this deployed in an environment with about 125k blocks and a couple thousand tenants, with concurrency turned off and compaction disabled. It looks like the blocklist poll duration actually goes up, since the list call is now doing more work. We still save on the GET calls, but overall duration seems worse. I'll revert to latest in this environment and observe it for a few hours to confirm.

I suspect this may still be acceptable due to the reduction in GET calls for the meta files.

@zalegrala (Contributor) commented:

I've been making progress on #2652 over the last couple of weeks. Joe and I chatted last week about parallelizing the effort, so I've made some adjustments to include that, but in the process broke a bunch of tests that I've been chasing down since. I'm now back to the point where I have an image running for testing and am just working on the last couple of items to get the e2e tests to pass. I believe this is nearly ready for another review.

I'm expecting one point of contention in the review, which perhaps we can discuss as a team: the inclusion of a HasFeature() method on the RawReader. Koenraad mentioned this might also be useful for checking which backends support versioning, so maybe it won't be an issue, but I would like to get some thoughts on this.
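A minimal sketch of what such a capability probe could look like; the feature names and types are hypothetical, not the actual PR's interface. Callers pick the suffix-listing fast path only when the backend reports support, and fall back to a full scan otherwise:

```go
package main

import "fmt"

// feature enumerates optional backend capabilities.
type feature int

const (
	featureListSuffix feature = iota // server-side suffix/wildcard listing
	featureVersioning                // object versioning (Koenraad's case)
)

// reader is a stand-in for a raw backend reader with a capability probe.
type reader interface {
	HasFeature(f feature) bool
}

// gcsLike models a backend that can list by suffix; localLike cannot.
type gcsLike struct{}

func (gcsLike) HasFeature(f feature) bool { return f == featureListSuffix }

type localLike struct{}

func (localLike) HasFeature(feature) bool { return false }

// pollPath picks the polling strategy based on backend capabilities.
func pollPath(r reader) string {
	if r.HasFeature(featureListSuffix) {
		return "list-metas"
	}
	return "full-scan"
}

func main() {
	fmt.Println(pollPath(gcsLike{}), pollPath(localLike{})) // list-metas full-scan
}
```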

@zalegrala (Contributor) commented:

We've reached some disagreement about how to proceed. I'm working up a document to put in front of the team to gather feedback and ideally an agreement about how to move forward.

@amurray2306 commented:

Hey @zalegrala, any updates on this issue? It's related to a PIR action, so I just want to check whether anything has moved forward.

@amurray2306 commented:

Also, a quick question: I'm recording the priority of issues created from PIRs. Where would you put this one: high, medium, or low?

@mapno mapno removed the postmortem label Oct 6, 2023
@amurray2306 commented:

@zalegrala, did you reach a consensus on this since our last ping? Did you manage to put together the doc you mentioned?

@zalegrala (Contributor) commented Oct 26, 2023

@amurray2306 Apologies, I've not done a great job of keeping this up to date; I'm also perpetually behind on GitHub notifications. This is my quarterly deliverable and I expect it to ship soon. I'm not sure how it slots into your efforts, so let me know what I can help with.

At the mention of a doc, we opted for a meeting instead where we discussed the approaches and came to some agreement about how to proceed. Since then, the image has been tested and run in our ops environment with great success. I believe we are close to finalizing the PR here, with the last few iterations of review mostly about test coverage and style.

@amurray2306 commented:

Awesome news, congrats :D Thanks for the context; I'll update my notes.

@zalegrala (Contributor) commented:

The PR for this work has been merged and will roll out to our environments in the next couple weeks.
