
Cache deletion mark files together with meta files. #2457

Closed · wants to merge 14 commits

Conversation

@pstibrany (Contributor)

Changes

This PR adds caching of deletion mark files, similar to how meta.json files are cached. Deletion mark files are cached in the same place as the cached meta.json files.

To make this work, deletion mark files are fetched at the same time as meta.json files. They are fetched for each block, stored in a map and on disk (if they exist), and then passed to filters.
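For illustration, the per-block flow looks roughly like this (a condensed sketch; loadMeta and loadDeletionMark follow the snippets quoted later in this thread, but the surrounding code and the markErrs collection are my simplification):

```go
// Sketch: each block's deletion mark is fetched right after its
// meta.json, and both are collected before any filter runs.
meta, metaErr := f.loadMeta(ctx, id)

var mark *metadata.DeletionMark
var markErr error
if metaErr == nil {
	// The mark is also cached on disk, next to the cached meta.json.
	mark, markErr = f.loadDeletionMark(ctx, id, now)
}

if mark != nil {
	resp.marks[id] = mark // Later handed to each filter alongside the metas.
}
if markErr != nil {
	markErrs = append(markErrs, markErr) // Hypothetical error collection.
}
```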

Open questions:

  • is this a viable approach?
  • do we want to avoid fetching deletion mark files if no filter uses them? We can do that, for example, by adding a RequiresDeletionMarks() bool method to the Filter interface to figure out whether we need them, and passing the OR-ed result from all filters to BaseFetcher (sketched below).
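A minimal sketch of that option (hypothetical; no such method exists in the PR, and MetadataFilter is abbreviated here):

```go
// Hypothetical capability query added to the filter interface
// (the existing Filter method is elided).
type MetadataFilter interface {
	RequiresDeletionMarks() bool
}

// BaseFetcher would OR the answers and skip fetching deletion marks
// entirely when no configured filter needs them.
func anyRequiresDeletionMarks(filters []MetadataFilter) bool {
	for _, f := range filters {
		if f.RequiresDeletionMarks() {
			return true
		}
	}
	return false
}
```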

I personally think this idea can be extended so that Fetcher returns mark files as well. The Compactor could use that information instead of relying on ignoreDeletionMarkFilter as it does today. But that would be a different PR.

Verification

Not tested yet, and unit tests are broken. I will continue working on tests once I know this is a valid approach.

  • I added a CHANGELOG entry for this change.
  • Change is not relevant to the end user.

pkg/block/fetcher.go (resolved review thread, outdated)
@bwplotka (Member)

As commented above, but also:

do we want to avoid fetching deletion mark files if no filter uses them? We can do that, for example, by adding a RequiresDeletionMarks() bool method to the Filter interface to figure out whether we need them, and passing the OR-ed result from all filters to BaseFetcher.

I would say we always have to fetch; that's why I think handling deletion marks should rather happen before filters even run.

@bwplotka (Member)

Otherwise makes sense. I guess once we see a deletion mark we cache it and no longer check for it, right?

@bwplotka (Member) commented Apr 17, 2020

The question is: do we ever need to delete a deletion mark to undo deletion?

@pstibrany (Contributor, Author)

I would say we always have to fetch; that's why I think handling deletion marks should rather happen before filters even run.

That's exactly what the code does now (https://github.com/pstibrany/thanos/blob/3a2276aa182a918e888413c0cac3b088788a87fc/pkg/block/fetcher.go#L377).

However, there are some places in the UI where no filters are used, and I was thinking that perhaps in those cases we don't need to fetch deletion mark files.

@pstibrany (Contributor, Author)

Otherwise makes sense. I guess once we see a deletion mark we cache it and no longer check for it, right?

Similar to meta.json, we only check whether the mark still exists in the storage, but we don't read it anymore. If it does, we use the local cache. If it doesn't, we report that the mark file no longer exists (even if we have it locally).

https://github.com/pstibrany/thanos/blob/3a2276aa182a918e888413c0cac3b088788a87fc/pkg/block/fetcher.go#L280
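A condensed sketch of that logic (markCache and readAndCacheMark are illustrative names, not necessarily those in the linked code; bkt.Exists and metadata.DeletionMarkFilename are existing Thanos identifiers):

```go
// Sketch: trust the locally cached mark only while the object still
// exists in remote storage; one Exists call replaces a full read.
func (f *BaseFetcher) loadDeletionMark(ctx context.Context, id ulid.ULID, now time.Time) (*metadata.DeletionMark, error) {
	markPath := path.Join(id.String(), metadata.DeletionMarkFilename)

	ok, err := f.bkt.Exists(ctx, markPath)
	if err != nil {
		return nil, err
	}
	if !ok {
		// Mark removed remotely (deletion undone): report it as gone,
		// even if a local copy still exists.
		return nil, nil
	}
	if cached, ok := f.markCache[id]; ok {
		return cached, nil // Still exists remotely: reuse the local copy.
	}
	// Not cached yet: read it once from the bucket and remember it.
	return f.readAndCacheMark(ctx, id, markPath, now)
}
```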

@pstibrany (Contributor, Author)

The question is: do we ever need to delete a deletion mark to undo deletion?

This would still work! loadDeletionMark verifies whether the deletion mark file exists in remote storage before reusing the cached copy.

@pstibrany (Contributor, Author)

This would still work! loadDeletionMark verifies whether the deletion mark file exists in remote storage before reusing the cached copy.

Which in retrospect doesn't save that much, since before this PR it was one operation to read the mark file, and after this PR it's one operation to check if it exists (+ another one to read it, if it's not yet cached). 🤔

@bwplotka (Member)

Exactly. I think with deletion marks we can go much further. We don't expect them to ever change, so we can cache a mark once and always ignore such a block. For readers, they can also ignore the meta.json after the ignore time (:

@bwplotka (Member)

As agreed offline, I will try to embed this code into the fix for #2459.

@bwplotka (Member)

Maybe in two steps

@bwplotka (Member)

Actually, let's maybe have a quick fix for v0.12.1.

Commit: Added caching for non-existent markers as well. (Signed-off-by: Peter Štibraný <[email protected]>)
@pstibrany (Contributor, Author)

BaseFetcher now caches both positive and negative (non-existent mark) results when loading deletion marks. Cached entries are returned without any rechecking for some time (existing marks for ~1h, non-existing marks for ~5m). The next check time is randomized to avoid rechecking all deletion marks at the same time.

Cache entry time-to-live durations are currently hardcoded, but can be made configurable if required.
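Judging from the snippets quoted below, the cachedDeletionMark entry presumably looks something like this (the exists field and the comments are my guess, not code from the PR):

```go
// Sketch: one cache entry per block, trusted until nextCheck.
type cachedDeletionMark struct {
	mark      metadata.DeletionMark // Zero value for a negative (non-existent) entry.
	exists    bool                  // False when caching the absence of a mark.
	nextCheck time.Time             // Randomized so rechecks are spread over time.
}
```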

pstibrany marked this pull request as ready for review April 20, 2020 14:39
Commit: …Aaaarrrrggggghhh. 😱 (Signed-off-by: Peter Štibraný <[email protected]>)
Commit: Fix unit test by flushing the cache. (Signed-off-by: Peter Štibraný <[email protected]>)
pstibrany marked this pull request as draft April 20, 2020 16:34
@pstibrany (Contributor, Author)

Back to draft... I've added caching of negative results too, but that has some consequences that I'd rather not tackle in this PR. I will remove that and then un-draft it.

@bwplotka (Member)

negative result?

@pstibrany (Contributor, Author)

negative result?

When the deletion marker doesn't exist in remote storage, my PR was caching that information for a short while.

pstibrany marked this pull request as ready for review April 21, 2020 07:34
@pstibrany (Contributor, Author)

BaseFetcher now caches both positive and negative (non-existent mark) results when loading deletion marks. Cached entries are returned without any rechecking for some time (existing marks for ~1h, non-existing marks for ~5m). The next check time is randomized to avoid rechecking all deletion marks at the same time.

Cache entry time-to-live durations are currently hardcoded, but can be made configurable if required.

BaseFetcher now only caches positive results, for ~1h. The next check time is still randomized, but the TTL is hard-coded.

@pracucci (Contributor) left a comment

Good job @pstibrany. The overall design LGTM, and passing the deletion marks to Filter() looks good for future extensibility from my perspective. I left a few comments here and there, all small things.

Resolved review threads (some outdated): pkg/block/metadata/deletionmark.go, pkg/block/fetcher.go.

```go
ttl := f.deletionMarkPositiveCacheEntryTTL

return &cachedDeletionMark{
	nextCheck: now.Add(ttl/2 + time.Duration(rand.Int63n(ttl.Nanoseconds()))),
```

@pracucci (Contributor)

So the effective TTL is between ttl/2 and ttl*1.5. Is that what we want? I would have expected deletionMarkPositiveCacheEntryTTL to be the upper bound, not the average TTL, but I don't have strong opinions as long as we document it.

@pstibrany (Contributor, Author)

I mostly wanted to avoid doing all rechecks at the same time, and to spread them a little. You're right that the TTL could be made an upper bound instead, and then we can compute the effective TTL, for example, as a random value in [TTL/2, TTL]. I also don't have strong opinions on this. I think the naming is a bit confusing here.

@pracucci (Contributor) Apr 22, 2020

I think it could be made a bit clearer if we had two hardcoded values: min TTL and max TTL. Then the actual TTL is a random value between min and max. Alternatively, TTL + jitter. But making it explicit would help clarify it in the code.
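A sketch of that min/max variant (constant names and values are illustrative only, not from the PR):

```go
// Illustrative alternative to "ttl/2 + rand(ttl)": explicit bounds.
const (
	deletionMarkCacheMinTTL = 30 * time.Minute
	deletionMarkCacheMaxTTL = time.Hour
)

// nextCheckTime picks a uniformly random deadline in [now+min, now+max].
func nextCheckTime(now time.Time) time.Time {
	spread := int64(deletionMarkCacheMaxTTL - deletionMarkCacheMinTTL)
	return now.Add(deletionMarkCacheMinTTL + time.Duration(rand.Int63n(spread)))
}
```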

pkg/block/fetcher.go (resolved review thread, outdated)
@pstibrany (Contributor, Author)

Thanks for the review, @pracucci; I've addressed your feedback.

@pracucci (Contributor) left a comment

Good job @pstibrany. Changes LGTM. I left a few last comments/questions.

```go
func (r response) deletionMarksCopyForFilter() map[ulid.ULID]*metadata.DeletionMark {
	marks := make(map[ulid.ULID]*metadata.DeletionMark, len(r.marks))
	for id, m := range r.marks {
		marks[id] = &m.mark
```

@pracucci (Contributor)

The reason why I commented about storing by copy/reference in cachedDeletionMark is that here we work with pointers, while in the cache we store by copy. It just looked weird to me, but I don't feel strongly about this.

Comment on lines +421 to 423:

```go
if metaErr == nil {
	mark, markErr = f.loadDeletionMark(ctx, id, now)
}
```

@pracucci (Contributor)

Thinking out loud: the contract of this fetchMetadata() function is that it returns the metas it found even when fetching some meta fails (so it's OK to return a partial view). I'm wondering whether it's correct to never fetch the deletion mark in case of any error: i.e. should we fetch it anyway in case of ErrorSyncMetaCorrupted? I honestly don't have an answer, but I would like to have a discussion on it.

```diff
 	}
-	f.cached = cached
+	f.cached = resp.metasCopy()
+	f.marks = resp.marks // no need to copy, as it's not going to be modified
```

@pracucci (Contributor)

Given deletion marks are relatively small (compared to metas), could it be more future-proof to always copy them?

```go
}

// deletionMarksCopyForFilter makes a copy of deletion marks map, suitable for passing to Filter method.
func (r response) deletionMarksCopyForFilter() map[ulid.ULID]*metadata.DeletionMark {
```

@pracucci (Contributor)

Calling it ...ForFilter() looks a bit weird to me. The use we're going to make of a copy of the deletion marks shouldn't be reflected in the function name, IMO.

```go
expectedMetas:         ULIDs(1, 3, 6),
expectedCorruptedMeta: ULIDs(5),
expectedNoMeta:        ULIDs(4, 2),
expectedMetaErr:       errors.New("incomplete view: unexpected meta file: 00000000070000000000000000/meta.json version: 20"),
```

@pracucci (Contributor)

Shouldn't we also expect an error for the corrupted deletion mark?

@pstibrany (Contributor, Author)

We deployed an updated Cortex with this PR today, and the results are somewhat disappointing...

[Screenshot 2020-04-22 at 10 00 55]

If we focus on get and exists operations only, we can see that we now issue even more operations to the storage:

[Screenshot 2020-04-22 at 10 02 20]

It seems that the reason for this is a combination of several factors:

  • The new store-gateway component scans all blocks, and uses a metadata filter that removes "unwanted" blocks.
  • We now fetch deletion marks before filters are run. At the same time, in this PR we don't cache non-existent marks.

Previously, deletion marks were only fetched for blocks that passed the "is this block interesting" filter, which resulted in fewer operations.

@pstibrany (Contributor, Author)

In light of these findings, let's not merge this yet. I'd like to write a design document to propose a more generic caching approach around storage.

pstibrany closed this Apr 23, 2020