pkg/receive: basic hashring support #1217
Conversation
Very nice. I like that it is so simple. Should be a good start! /cc @bwplotka
Agreed, the simplicity of this positively contributes to my confidence in it. This goes in a really great direction; just a few comments, and once they're addressed we should be good to go, I think.
pkg/receive/hashring.go
// for a specified tenant.
// It returns the hostname and any error encountered.
type Hashring interface {
	GetHost(string, *prompb.TimeSeries) (string, error)
Could we name these parameters here already, just so it's clear what the inputs and outputs are for this interface?
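A minimal sketch of what the named-parameter version might look like. The parameter names and the `singleHost` helper are illustrative assumptions, not the PR's final API, and `TimeSeries` stands in for `prompb.TimeSeries` so the sketch compiles on its own.

```go
package main

import "fmt"

// TimeSeries is a stand-in for prompb.TimeSeries so this sketch
// is self-contained; the real interface uses the prompb type.
type TimeSeries struct{}

// Hashring with named parameters, as the review asks; the names
// here are illustrative only.
type Hashring interface {
	// GetHost returns the host that should ingest the given time
	// series for the given tenant, and any error encountered.
	GetHost(tenant string, ts *TimeSeries) (host string, err error)
}

// singleHost is a trivial implementation used only to exercise
// the interface below; it always returns the same host.
type singleHost string

func (s singleHost) GetHost(tenant string, ts *TimeSeries) (string, error) {
	return string(s), nil
}

func main() {
	var h Hashring = singleHost("node-1:19291")
	host, err := h.GetHost("tenant-a", &TimeSeries{})
	fmt.Println(host, err) // node-1:19291 <nil>
}
```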
}
// hash returns a hash for the given tenant and time series.
func hash(tenant string, ts *prompb.TimeSeries) uint64 {
Nothing to do here, just a side note for the future: we probably want to cache this at some point, as sorting + hashing seems rather expensive, and we can probably re-use the byte slices created for these hashes. But let's leave these optimizations for later, until we've actually measured that they are necessary.
Yes, we'll probably want to make a buffer pool to keep allocations low. Caching may be expensive, as there can easily be tens of millions of time series + tenant combinations.
Yeah, optimizing here now would be premature. Once it becomes a problem we can try out the different strategies and prove with numbers what's best. Nothing to do for now.
	sort.Slice(ts.Labels, func(i, j int) bool { return ts.Labels[i].Name < ts.Labels[j].Name })

	b := make([]byte, 0, 1024)
	b = append(b, []byte(tenant)...)
This needs a separator at the end; otherwise, with clever label names, collisions could be engineered.
good catch here
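To illustrate the collision the review points out, here is a sketch of the hash with a separator appended after the tenant and after every label name and value. This is not the PR's code: `Label`/`TimeSeries` stand in for the prompb types, and `hash/fnv` from the standard library stands in for xxhash purely to keep the sketch dependency-free.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Label and TimeSeries stand in for the prompb types.
type Label struct{ Name, Value string }
type TimeSeries struct{ Labels []Label }

// sep is a byte that cannot appear in a valid label name, so it
// safely delimits the hashed components.
const sep = '\xff'

// hash builds a delimited byte buffer so that tenant "ab" with
// label "c" cannot collide with tenant "a" with label "bc";
// without the separators both would serialize identically.
func hash(tenant string, ts *TimeSeries) uint64 {
	sort.Slice(ts.Labels, func(i, j int) bool { return ts.Labels[i].Name < ts.Labels[j].Name })

	b := make([]byte, 0, 1024)
	b = append(b, tenant...)
	b = append(b, sep)
	for _, l := range ts.Labels {
		b = append(b, l.Name...)
		b = append(b, sep)
		b = append(b, l.Value...)
		b = append(b, sep)
	}
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func main() {
	x := hash("ab", &TimeSeries{Labels: []Label{{Name: "c", Value: "v"}}})
	y := hash("a", &TimeSeries{Labels: []Label{{Name: "bc", Value: "v"}}})
	fmt.Println(x != y) // the separator keeps these inputs distinct
}
```

Without `sep`, both calls above would hash the byte string `abcv` and collide by construction.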
pkg/receive/hashring.go
// NewHashring creates a multi-tenant hashring for a given
// configuration that maps tenants to groups of hosts.
func NewHashring(tenants map[string][]string) Hashring {
Let's architect this right now so that, instead of a direct map, we have a list of filters; if all filters pass, that particular hashring is chosen. For now, just implement an exact-match filter.
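A rough sketch of that filter-based shape, under stated assumptions: the type and function names (`hashringConfig`, `hostsFor`) and the catch-all semantics for an empty tenant list are illustrative guesses, not the final Thanos design.

```go
package main

import (
	"errors"
	"fmt"
)

// hashringConfig pairs tenant filters with the hosts of one
// hashring; names here are illustrative only.
type hashringConfig struct {
	tenants []string // exact-match filters; empty matches every tenant
	hosts   []string
}

// matches implements the exact-match filter suggested in review.
func (c hashringConfig) matches(tenant string) bool {
	if len(c.tenants) == 0 {
		return true
	}
	for _, t := range c.tenants {
		if t == tenant {
			return true
		}
	}
	return false
}

// hostsFor walks the config list and returns the hosts of the
// first hashring whose filters all pass.
func hostsFor(cfgs []hashringConfig, tenant string) ([]string, error) {
	for _, c := range cfgs {
		if c.matches(tenant) {
			return c.hosts, nil
		}
	}
	return nil, errors.New("no matching hashring for tenant " + tenant)
}

func main() {
	cfgs := []hashringConfig{
		{tenants: []string{"team-a"}, hosts: []string{"a-0:19291", "a-1:19291"}},
		{hosts: []string{"default-0:19291"}}, // catch-all
	}
	hosts, _ := hostsFor(cfgs, "team-a")
	fmt.Println(hosts) // [a-0:19291 a-1:19291]
}
```

The list form keeps the door open for other filter kinds later (prefix match, label match) without changing callers, which a plain `map[string][]string` would not.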
@@ -5,6 +5,7 @@ require (
	github.com/Azure/azure-storage-blob-go v0.0.0-20181022225951-5152f14ace1c
	github.com/NYTimes/gziphandler v1.1.1
	github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da
	github.com/cespare/xxhash v1.1.0
perfect choice 👍
let's document that we're using this hash in the design doc/proposal
this is still to be addressed
I looked back, and it was actually already included in the proposal: https://github.com/improbable-eng/thanos/blame/64e7a2607392289bec5d39b9c40fb6c261eed9ba/docs/proposals/201812_thanos-remote-receive.md#L96
pkg/receive/hashring.go
// NewHashring creates a multi-tenant hashring for a given
// configuration that maps tenants to groups of hosts.
func NewHashring(groups []*targetgroup.Group) Hashring {
This can technically change at runtime, no? I'm OK with skipping this aspect for now, but then we need to make clear that what is discovered at startup is all that is discovered over the lifetime of the process.
I actually wrote another hashring implementation called discoveryHashring, which updates the hashring as new peers are discovered, but I decided to throw it out, as we need to shut down the receiver anyway when new peers are discovered, so there was little benefit in having a dynamic hashring right now.
The other choice is to make it updateable on the hashring side of things and still cleanly shut down the receiver when the hashring changes. WDYT?
Yeah, I was more saying that if we ever detect a change of these groups at runtime, we should log that it has happened.
pkg/receive/hashring.go
}
// multiTenantHashring represents a set of Hashrings for different tenants.
type multiTenantHashring map[string]Hashring
Instead of a direct map, let's use a list of objects that each contain a list of filters; if all filters pass, that hashring is chosen. If necessary we can cache this at runtime into a data structure like this one, as this data is perfectly cacheable.
LGTM, sorry for lag (:
* go.mod: add xxhash for hashing
* pkg/receive: add simple hashring
Changes
This PR adds a basic hashring type to the receive package. The hashring does not implement consistent hashing because, today, we do not care about shifting data around between hosts on scale-up or scale-down. The current plan is that at every scale event we flush all data, write it to block storage, and start again.
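The plan above can be sketched as a plain hash-mod-N mapping. The type name `simpleHashring`, its method shape, and the use of stdlib `hash/fnv` (the PR itself adds xxhash) are all illustrative assumptions, not the PR's actual code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// simpleHashring maps each series to a host via hash modulo the
// number of hosts; no consistent hashing, per the plan above.
// This sketch is illustrative, not the PR's implementation.
type simpleHashring []string

// GetHost hashes the tenant and a string form of the series and
// picks a host by taking the hash modulo the host count.
func (s simpleHashring) GetHost(tenant, series string) string {
	h := fnv.New64a()
	h.Write([]byte(tenant))
	h.Write([]byte{0xff}) // separator between tenant and series
	h.Write([]byte(series))
	return s[h.Sum64()%uint64(len(s))]
}

func main() {
	ring := simpleHashring{"host-0", "host-1", "host-2"}
	a := ring.GetHost("tenant-a", `up{instance="a"}`)
	b := ring.GetHost("tenant-a", `up{instance="a"}`)
	fmt.Println(a == b, a) // the mapping is deterministic
}
```

Because the host choice depends directly on the host count, adding or removing a host remaps most series; that is acceptable here precisely because all data is flushed to block storage at every scale event.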
Going forward, a PR will embed a Hashring into the Handler struct so that when handling requests, the Handler can decide if it needs to handle the request itself or forward it to another host.
Forwarding functionality will be added in a follow-up.
Verification
Unit testing of hashring.
cc @metalmatze @brancz