Add time based partitioning to store component #957

Closed
claytono wants to merge 5 commits

Conversation

@claytono claytono commented Mar 21, 2019

Changes

Add --min-time and --max-time command-line flags to the store component. With these set, the store component only loads blocks whose start time lies within the given range. The two flags can be given independently; the defaults are effectively "forever ago" and "forever in the future", respectively.
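
To illustrate the filtering rule described above, here is a minimal sketch. It is not the code from this PR: `blockMeta`, `timeRange`, `filterBlocks`, and `ms` are hypothetical names, and treating both bounds as inclusive is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// blockMeta is a hypothetical stand-in for a block's metadata; only the
// start time matters for this filter.
type blockMeta struct {
	ID      string
	MinTime int64 // block start, milliseconds since the Unix epoch
}

// timeRange holds the bounds taken from --min-time and --max-time.
// Inclusive bounds are an assumption of this sketch.
type timeRange struct {
	Min, Max int64
}

// unbounded models the defaults: "forever ago" and "forever in the future".
func unbounded() timeRange {
	return timeRange{Min: math.MinInt64, Max: math.MaxInt64}
}

// keep reports whether a block's start time lies within the range.
func (r timeRange) keep(b blockMeta) bool {
	return b.MinTime >= r.Min && b.MinTime <= r.Max
}

// filterBlocks returns the blocks a store with this range would load.
func filterBlocks(blocks []blockMeta, r timeRange) []blockMeta {
	var out []blockMeta
	for _, b := range blocks {
		if r.keep(b) {
			out = append(out, b)
		}
	}
	return out
}

// ms converts an RFC 3339 timestamp to Unix milliseconds, panicking on
// malformed input (acceptable for an example, not for flag parsing).
func ms(s string) int64 {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t.UnixNano() / int64(time.Millisecond)
}

func main() {
	blocks := []blockMeta{
		{ID: "a", MinTime: ms("2019-01-01T00:00:00Z")},
		{ID: "b", MinTime: ms("2019-02-01T00:00:00Z")},
		{ID: "c", MinTime: ms("2019-03-01T00:00:00Z")},
	}
	// Only the lower bound is set; the upper bound keeps its default.
	r := unbounded()
	r.Min = ms("2019-02-01T00:00:00Z")
	for _, b := range filterBlocks(blocks, r) {
		fmt.Println(b.ID)
	}
}
```

Because each bound defaults to an extreme value, setting only one flag leaves the other side of the range open.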

Verification

I've tested this against a copy of one of our production buckets and I've updated the e2e tests to test this functionality.

cc: #814

@claytono claytono marked this pull request as ready for review March 21, 2019 15:09
@bwplotka
Member

CC @povilasv PTAL

Member

@bwplotka bwplotka left a comment

Awesome, will do more detailed review soon!

@@ -145,6 +145,31 @@ func modelDuration(flags *kingpin.FlagClause) *model.Duration {
return value
}

type flagTime struct {
Member

So I believe this is quite fixed... We essentially need a duration, right? Something like now-3months to now-2h.

Member

Yeah, I agree. I need relative time as well; it would essentially replace the functionality I have in #930.

Member

Yes, that would be amazing!

Contributor Author

I think both would be useful. We intend to use this for horizontal scale-out, and we want the ability to specify exact time ranges so that the blocks each Thanos store host serves match that host's capacity.
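
A rough sketch of how one flag value could accept both forms discussed in this thread. This is not the implementation in this PR or in any follow-up: `timeOrDuration`, `parseTimeOrDuration`, and `resolveAt` are hypothetical names, and the input syntax (RFC 3339 for absolute times, Go duration strings such as `-2h` for relative ones) is an assumption. Go's `time.ParseDuration` has no day or month units, so "three months ago" has to be written in hours here.

```go
package main

import (
	"fmt"
	"time"
)

// timeOrDuration is a hypothetical flag value: either an absolute instant
// or an offset relative to "now".
type timeOrDuration struct {
	abs      time.Time
	rel      time.Duration
	relative bool
}

// parseTimeOrDuration tries RFC 3339 first and falls back to a Go duration.
func parseTimeOrDuration(s string) (timeOrDuration, error) {
	if t, err := time.Parse(time.RFC3339, s); err == nil {
		return timeOrDuration{abs: t}, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return timeOrDuration{}, fmt.Errorf("%q is neither an RFC 3339 time nor a duration", s)
	}
	return timeOrDuration{rel: d, relative: true}, nil
}

// resolve turns the value into a concrete instant, given the current time.
func (v timeOrDuration) resolve(now time.Time) time.Time {
	if v.relative {
		return now.Add(v.rel)
	}
	return v.abs
}

// resolveAt parses s, resolves it against the instant nowRFC3339, and
// returns the result as an RFC 3339 string in UTC.
func resolveAt(s, nowRFC3339 string) (string, error) {
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return "", err
	}
	v, err := parseTimeOrDuration(s)
	if err != nil {
		return "", err
	}
	return v.resolve(now).UTC().Format(time.RFC3339), nil
}

func main() {
	const now = "2019-03-27T12:00:00Z"
	// One absolute bound, then two relative ones (-2160h is 90 days).
	for _, s := range []string{"2019-01-01T00:00:00Z", "-2160h", "-2h"} {
		out, err := resolveAt(s, now)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "=>", out)
	}
}
```

A relative value has to be re-resolved on every sync for the window to slide, while an absolute value stays fixed; that difference is what the rest of this thread is about.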

Member

👍 I think both are useful

Member

👍

One question: I don't get why someone would need a specific time. It doesn't make sense to me. What's the use case?

> We're intending to use this for horizontal scale out, and we want the ability to specify exact time ranges in order to size the blocks served to the capacity of the thanos store host.

@claytono Hmm, are you sure you want fixed time ranges for that? Why?

Contributor Author

We intend to have a handful of Thanos store nodes, each serving a portion of a shared bucket, with one of them having an open-ended time range for all new metrics. For now, we intend to have an outside process periodically analyze the bucket and generate a time range for each Thanos store process based on the index size of each block. We want to provision these nodes so that they're all fairly full from a memory standpoint without over-provisioning. For us, the major expense of running Thanos is the memory on compute instances; the S3 storage is nearly free in comparison.

With metric ingest rates changing over time (new apps, seasonality, etc.) and the activity of the compactor, I think partitioning the bucket time ranges with relative times is going to be error-prone and/or lead to inefficient use of the hardware.
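
To make the outside process concrete, here is one way it could be sketched. This is purely illustrative and not our actual tooling: `block`, `shard`, and `partition` are hypothetical names, and using total index bytes as a stand-in for a node's memory budget is an assumption. It walks blocks oldest first and closes a time range whenever the next block would push the running index size past the capacity; the last range is left open-ended for new data.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// block carries the two facts the planner needs about each block.
type block struct {
	MinTime   int64 // block start, milliseconds since the Unix epoch
	IndexSize int64 // index size in bytes
}

// shard is the time range handed to one store process.
type shard struct {
	MinTime, MaxTime int64 // inclusive bounds on block start time
	IndexBytes       int64 // total index size of the blocks in range
}

// partition assigns blocks to shards greedily, oldest first. Blocks that
// share a start time cannot be separated by a time range, so they always
// stay together, and a single oversized block still gets a shard of its
// own; either case can exceed capacity.
func partition(blocks []block, capacity int64) []shard {
	sorted := append([]block(nil), blocks...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].MinTime < sorted[j].MinTime })

	var shards []shard
	cur := shard{MinTime: math.MinInt64}
	for i, b := range sorted {
		newStart := i == 0 || b.MinTime != sorted[i-1].MinTime
		if newStart && cur.IndexBytes > 0 && cur.IndexBytes+b.IndexSize > capacity {
			cur.MaxTime = b.MinTime - 1
			shards = append(shards, cur)
			cur = shard{MinTime: b.MinTime}
		}
		cur.IndexBytes += b.IndexSize
	}
	// The newest shard stays open-ended so it picks up all new blocks.
	cur.MaxTime = math.MaxInt64
	return append(shards, cur)
}

func main() {
	const gib = int64(1) << 30
	blocks := []block{
		{MinTime: 100, IndexSize: 4 * gib},
		{MinTime: 200, IndexSize: 3 * gib},
		{MinTime: 300, IndexSize: 5 * gib},
		{MinTime: 400, IndexSize: 2 * gib},
	}
	for i, s := range partition(blocks, 8*gib) {
		fmt.Printf("store-%d: min=%d max=%d index=%dGiB\n", i, s.MinTime, s.MaxTime, s.IndexBytes/gib)
	}
}
```

The boundaries come out as absolute timestamps because they depend on where the blocks actually fall, which is why a fixed relative offset would not track them as ingest rates and compaction change.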

Member

It's kind of odd from my perspective, but if you find this useful, sure (: happy to accept that.

Contributor Author

It's what we've come up with for horizontal scaling of the Thanos store nodes. I'd love to hear how other people are managing scaling out.

Contributor

@xjewer xjewer Mar 27, 2019

> For us, the major expense of running Thanos is the memory on compute instances

It seems you are trying to solve a separate problem with absolute time ranges.

I'd like to have relative time as well.

@povilasv
Member

FYI, I've continued this work in a different PR: #1077

@povilasv povilasv closed this Apr 25, 2019
5 participants