
Cache deletion mark files together with meta files. #2457

Closed · wants to merge 14 commits

Conversation

@pstibrany (Contributor)

Changes

This PR adds caching of deletion mark files, similar to how meta.json files are cached. Deletion mark files are cached in the same place as the cached meta.json files.

To make this work, deletion mark files are fetched at the same time as meta.json files. They are fetched for each block, stored in a map and on disk (if they exist), and then passed to filters.
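For illustration, the per-block flow looks roughly like this (a condensed sketch; loadMeta and loadDeletionMark follow the snippets quoted later in this thread, but the surrounding code and the markErrs collection are my simplification):

```go
// Sketch: each block's deletion mark is fetched right after its
// meta.json, and both are collected before any filter runs.
meta, metaErr := f.loadMeta(ctx, id)

var mark *metadata.DeletionMark
var markErr error
if metaErr == nil {
	// The mark is also cached on disk, next to the cached meta.json.
	mark, markErr = f.loadDeletionMark(ctx, id, now)
}

if mark != nil {
	resp.marks[id] = mark // Later handed to each filter alongside the metas.
}
if markErr != nil {
	markErrs = append(markErrs, markErr) // Hypothetical error collection.
}
```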

Open questions:

  • is this a viable approach?
  • do we want to avoid fetching deletion mark files if no filter uses them? We can do that, for example, by adding a RequiresDeletionMarks() bool method to the Filter interface to figure out whether we need them, and passing the OR-ed result from all filters to BaseFetcher (sketched below).
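A minimal sketch of that option (hypothetical; no such method exists in the PR, and MetadataFilter is abbreviated here):

```go
// Hypothetical capability query added to the filter interface
// (the existing Filter method is elided).
type MetadataFilter interface {
	RequiresDeletionMarks() bool
}

// BaseFetcher would OR the answers and skip fetching deletion marks
// entirely when no configured filter needs them.
func anyRequiresDeletionMarks(filters []MetadataFilter) bool {
	for _, f := range filters {
		if f.RequiresDeletionMarks() {
			return true
		}
	}
	return false
}
```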

I personally think this idea can be extended so that Fetcher returns mark files as well. The Compactor could use that information instead of relying on ignoreDeletionMarkFilter as it does today. But that would be a different PR.

Verification

Not tested yet, and unit tests are broken. I will continue working on tests once I know this is a valid approach.

  • I added a CHANGELOG entry for this change.
  • Change is not relevant to the end user.

pkg/block/fetcher.go (resolved review thread, outdated)
@bwplotka (Member)

As commented above, but also:

do we want to avoid fetching deletion mark files if no filter uses them? We can do that, for example, by adding a RequiresDeletionMarks() bool method to the Filter interface to figure out whether we need them, and passing the OR-ed result from all filters to BaseFetcher.

I would say we always have to fetch; that's why I think handling deletion marks should rather happen before filters even run.

@bwplotka (Member)

Otherwise makes sense. I guess once we see a deletion mark we cache it and no longer check for it, right?

@bwplotka (Member) commented Apr 17, 2020

The question is: do we ever need to delete a deletion mark to undo deletion?

@pstibrany (Contributor, Author)

I would say we always have to fetch; that's why I think handling deletion marks should rather happen before filters even run.

That's exactly what the code does now (https://github.com/pstibrany/thanos/blob/3a2276aa182a918e888413c0cac3b088788a87fc/pkg/block/fetcher.go#L377).

However, there are some places in the UI where no filters are used, and I was thinking that perhaps in those cases we don't need to fetch deletion mark files.

@pstibrany (Contributor, Author)

Otherwise makes sense. I guess once we see a deletion mark we cache it and no longer check for it, right?

Similar to meta.json, we only check whether the mark still exists in the storage, but we don't read it anymore. If it does, we use the local cache. If it doesn't, we report that the mark file no longer exists (even if we have it locally).

https://github.com/pstibrany/thanos/blob/3a2276aa182a918e888413c0cac3b088788a87fc/pkg/block/fetcher.go#L280
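A condensed sketch of that logic (markCache and readAndCacheMark are illustrative names, not necessarily those in the linked code; bkt.Exists and metadata.DeletionMarkFilename are existing Thanos identifiers):

```go
// Sketch: trust the locally cached mark only while the object still
// exists in remote storage; one Exists call replaces a full read.
func (f *BaseFetcher) loadDeletionMark(ctx context.Context, id ulid.ULID, now time.Time) (*metadata.DeletionMark, error) {
	markPath := path.Join(id.String(), metadata.DeletionMarkFilename)

	ok, err := f.bkt.Exists(ctx, markPath)
	if err != nil {
		return nil, err
	}
	if !ok {
		// Mark removed remotely (deletion undone): report it as gone,
		// even if a local copy still exists.
		return nil, nil
	}
	if cached, ok := f.markCache[id]; ok {
		return cached, nil // Still exists remotely: reuse the local copy.
	}
	// Not cached yet: read it once from the bucket and remember it.
	return f.readAndCacheMark(ctx, id, markPath, now)
}
```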

@pstibrany (Contributor, Author)

The question is: do we ever need to delete a deletion mark to undo deletion?

This would still work! loadDeletionMark verifies whether the deletion mark file exists in remote storage before reusing the cached copy.

@pstibrany (Contributor, Author)

This would still work! loadDeletionMark verifies whether the deletion mark file exists in remote storage before reusing the cached copy.

Which in retrospect doesn't save that much, since before this PR it was one operation to read the mark file, and after this PR it's one operation to check if it exists (+ another one to read it, if it's not yet cached). 🤔

@bwplotka (Member)

Exactly. I think with deletion marks we can go much further. We don't expect them to ever change, so we can cache a mark once and always ignore such a block. For readers, they can also ignore the meta.json after the ignore time (:

@bwplotka (Member)

As agreed offline, I will try to embed this code into the fix for #2459.

@bwplotka (Member)

Maybe in two steps

@bwplotka (Member)

Actually, let's maybe have a quick fix for v0.12.1.

Commit: Added caching for non-existent markers as well. (Signed-off-by: Peter Štibraný <[email protected]>)
@pstibrany (Contributor, Author)

BaseFetcher now caches both positive and negative (non-existent mark) results when loading deletion marks. Cached entries are returned without any rechecking for some time (existing marks for ~1h, non-existing marks for ~5m). The next check time is randomized to avoid rechecking all deletion marks at the same time.

Cache entry time-to-live durations are currently hardcoded, but can be made configurable if required.
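Judging from the snippets quoted below, the cachedDeletionMark entry presumably looks something like this (the exists field and the comments are my guess, not code from the PR):

```go
// Sketch: one cache entry per block, trusted until nextCheck.
type cachedDeletionMark struct {
	mark      metadata.DeletionMark // Zero value for a negative (non-existent) entry.
	exists    bool                  // False when caching the absence of a mark.
	nextCheck time.Time             // Randomized so rechecks are spread over time.
}
```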

pstibrany marked this pull request as ready for review April 20, 2020 14:39
Commit: …Aaaarrrrggggghhh. 😱 (Signed-off-by: Peter Štibraný <[email protected]>)
Commit: Fix unit test by flushing the cache. (Signed-off-by: Peter Štibraný <[email protected]>)
pstibrany marked this pull request as draft April 20, 2020 16:34
@pstibrany (Contributor, Author)

Back to draft... I've added caching of negative results too, but that has some consequences that I'd rather not tackle in this PR. I will remove that and then un-draft it.

@bwplotka (Member)

negative result?

@pstibrany (Contributor, Author)

negative result?

When the deletion marker doesn't exist in remote storage, my PR was caching that information for a short while.

pstibrany marked this pull request as ready for review April 21, 2020 07:34
@pstibrany (Contributor, Author)

BaseFetcher now caches both positive and negative (non-existent mark) results when loading deletion marks. Cached entries are returned without any rechecking for some time (existing marks for ~1h, non-existing marks for ~5m). The next check time is randomized to avoid rechecking all deletion marks at the same time.

Cache entry time-to-live durations are currently hardcoded, but can be made configurable if required.

BaseFetcher now only caches positive results, for ~1h. The next check time is still randomized, but the TTL is hard-coded.

@pracucci (Contributor) left a comment

Good job @pstibrany. The overall design LGTM, and passing the deletion marks to Filter() looks good for future extensibility from my perspective. I left a few comments here and there, all small things.

Resolved review threads (some outdated): pkg/block/metadata/deletionmark.go, pkg/block/fetcher.go.

```go
ttl := f.deletionMarkPositiveCacheEntryTTL

return &cachedDeletionMark{
	nextCheck: now.Add(ttl/2 + time.Duration(rand.Int63n(ttl.Nanoseconds()))),
```

@pracucci (Contributor)

So the effective TTL is between ttl/2 and ttl*1.5. Is that what we want? I would have expected deletionMarkPositiveCacheEntryTTL to be the upper bound, not the average TTL, but I don't have strong opinions as long as we document it.

@pstibrany (Contributor, Author)

I mostly wanted to avoid doing all rechecks at the same time, and to spread them a little. You're right that the TTL could be made an upper bound instead, and then we can compute the effective TTL, for example, as a random value in [TTL/2, TTL]. I also don't have strong opinions on this. I think the naming is a bit confusing here.

@pracucci (Contributor) Apr 22, 2020

I think it could be made a bit clearer if we had two hardcoded values: min TTL and max TTL. Then the actual TTL is a random value between min and max. Alternatively, TTL + jitter. But making it explicit would help clarify it in the code.
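A sketch of that min/max variant (constant names and values are illustrative only, not from the PR):

```go
// Illustrative alternative to "ttl/2 + rand(ttl)": explicit bounds.
const (
	deletionMarkCacheMinTTL = 30 * time.Minute
	deletionMarkCacheMaxTTL = time.Hour
)

// nextCheckTime picks a uniformly random deadline in [now+min, now+max].
func nextCheckTime(now time.Time) time.Time {
	spread := int64(deletionMarkCacheMaxTTL - deletionMarkCacheMinTTL)
	return now.Add(deletionMarkCacheMinTTL + time.Duration(rand.Int63n(spread)))
}
```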

pkg/block/fetcher.go (resolved review thread, outdated)
@pstibrany (Contributor, Author)

Thanks for the review, @pracucci; I've addressed your feedback.

@pracucci (Contributor) left a comment

Good job @pstibrany. Changes LGTM. I left a few last comments/questions.

```go
func (r response) deletionMarksCopyForFilter() map[ulid.ULID]*metadata.DeletionMark {
	marks := make(map[ulid.ULID]*metadata.DeletionMark, len(r.marks))
	for id, m := range r.marks {
		marks[id] = &m.mark
```

@pracucci (Contributor)

The reason why I commented about storing by copy/reference in cachedDeletionMark is that here we work with pointers, while in the cache we store by copy. It just looked weird to me, but I don't feel strongly about this.

Comment on lines +421 to 423:

```go
if metaErr == nil {
	mark, markErr = f.loadDeletionMark(ctx, id, now)
}
```

@pracucci (Contributor)

Thinking out loud: the contract of this fetchMetadata() function is that it returns the metas it found even when fetching some meta fails (so it's OK to return a partial view). I'm wondering whether it's correct to never fetch the deletion mark in case of any error: i.e. should we fetch it anyway in case of ErrorSyncMetaCorrupted? I honestly don't have an answer, but I would like to have a discussion on it.

```diff
 	}
-	f.cached = cached
+	f.cached = resp.metasCopy()
+	f.marks = resp.marks // no need to copy, as it's not going to be modified
```

@pracucci (Contributor)

Given deletion marks are relatively small (compared to metas), could it be more future-proof to always copy them?

```go
}

// deletionMarksCopyForFilter makes a copy of deletion marks map, suitable for passing to Filter method.
func (r response) deletionMarksCopyForFilter() map[ulid.ULID]*metadata.DeletionMark {
```

@pracucci (Contributor)

Calling it ...ForFilter() looks a bit weird to me. The use we're going to make of a copy of the deletion marks shouldn't be reflected in the function name, IMO.

```go
expectedMetas:         ULIDs(1, 3, 6),
expectedCorruptedMeta: ULIDs(5),
expectedNoMeta:        ULIDs(4, 2),
expectedMetaErr:       errors.New("incomplete view: unexpected meta file: 00000000070000000000000000/meta.json version: 20"),
```

@pracucci (Contributor)

Shouldn't we also expect an error for the corrupted deletion mark?

@pstibrany (Contributor, Author)

We deployed an updated Cortex with this PR today, and the results are somewhat disappointing...

[Screenshot 2020-04-22 at 10 00 55]

If we focus on get and exists operations only, we can see that we now issue even more operations to the storage:

[Screenshot 2020-04-22 at 10 02 20]

It seems that the reason for this is a combination of several factors:

  • The new store-gateway component scans all blocks, and uses a metadata filter that removes "unwanted" blocks.
  • We now fetch deletion marks before filters are run. At the same time, in this PR we don't cache non-existent marks.

Previously, deletion marks were only fetched for blocks that passed the "is this block interesting" filter, which resulted in fewer operations.

@pstibrany (Contributor, Author)

In light of these findings, let's not merge this yet. I'd like to write a design document to propose a more generic caching approach around storage.

pstibrany closed this Apr 23, 2020