Fix failing SearchTagValues endpoint after startup #1813

Merged
joe-elliott merged 6 commits into grafana:main from ingester-restart-error
Oct 20, 2022

Conversation

stoewer
Contributor

@stoewer stoewer commented Oct 19, 2022

What this PR does:
The bug described in #1792 happened because rediscoverLocalBlocks created BackendSearchBlock objects for WAL blocks, even when they didn't contain the files required for flatbuffer search (search-header etc.). Later calls of TagValues() on the search blocks resulted in the logged error message and caused the /tempopb.Querier/SearchTagValues endpoint to fail.

This PR skips the creation of BackendSearchBlock for WAL blocks that do not contain the flatbuffer search files.

Which issue(s) this PR fixes:
Fixes #1792

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@stoewer stoewer changed the title Fix failing endpoint SearchTagValues after startup Fix failing SearchTagValues endpoint after startup Oct 19, 2022
@stoewer stoewer marked this pull request as ready for review October 19, 2022 09:15
Member

@joe-elliott joe-elliott left a comment

I'm torn on this. This is a very correct way to fix this problem, but making changes to the backend interfaces for something I view as temporary is tough. In 2.0 we will drop support for search over the old block format ('v2') and all of this code goes away.

In OpenBackendSearchBlock() we could attempt to call ReadRange() for 1 byte on the header. If that returns "does not exist", we return ErrSearchNotSupported. I know this is far "hackier" than the proposed solution, but it will be nicely cleaned up when we delete the rest of this code.

objects = append(objects, f.Name())
objects := make([]string, 0, len(dirEntries))
for _, d := range dirEntries {
objects = append(objects, d.Name())
Member

This was changed to support the Has() method? I'm concerned about the impact on other uses of List()

Contributor Author

Yes. From the signature alone it is not clear whether List is supposed to return only directory names. The code where List is used (raw.go#L158 or raw.go#L139) seems to anticipate that filenames are listed too. However, the s3 backend also only returns directories / prefixes.

@@ -113,6 +113,19 @@ func NewReader(r RawReader) Reader {
}
}

func (r *reader) Has(ctx context.Context, name string, blockID uuid.UUID, tenantID string) (bool, error) {
objects, err := r.r.List(ctx, KeyPathForBlock(blockID, tenantID))
Member

No idea if this works for all backends :). We only really need it to work for local, but that's not ideal.

@mdisibio
Contributor

making changes to the backend interfaces

I also lean towards not changing the backend interface. Tempo is run on many backends: not only the popular clouds but also API-compatible vendor solutions that we have no way to test. Traditionally we have kept Tempo's usage as basic as possible to improve compatibility. I don't necessarily see an issue with adding Has here, since it is constrained to the ingester and local backend, but I would have concerns if it had wider usage throughout Tempo. Therefore I would be OK with @joe-elliott's alternate method (reading 1 byte).

The List method of the local backend now returns both directory and
regular file names instead of returning only directory names
This resolves a bug in SearchTagValues where the method TagValues
was called on blocks without a search-header and search-index file
Implement isSearchSupported using ReadRange instead of Has

Remove Has method from backend and revert changes in List method
@stoewer
Contributor Author

stoewer commented Oct 19, 2022

Thanks for the review. I removed the Has method, reverted the changes to the List method, and implemented the check using ReadRange as suggested by @joe-elliott.

@joe-elliott joe-elliott merged commit 8e37758 into grafana:main Oct 20, 2022
@stoewer stoewer deleted the ingester-restart-error branch October 20, 2022 20:57

Successfully merging this pull request may close these issues.

Ingesters not registering RPC endpoints on startup
3 participants