Thanos receive replication waits for unnecessary requests to finish #2567

brancz · 2020-05-05T15:58:35Z

Thanos, Prometheus and Golang version used:

master-2020-05-05-e5804d80

Object Storage Provider:

s3

What happened:

When rolling out a new version of Thanos receive, the latency of ingestion requests spikes massively. I investigated and traced it to the replication strategy, which always waits for all requests to other instances to finish, even if a quorum of requests has already succeeded.

What you expected to happen:

Only wait for quorum success of replication requests.

How to reproduce it (as minimally and precisely as possible):

A Thanos receive setup with 3x replication and one instance unavailable.

Full logs to relevant components:

n/a

Anything else we need to know:

My env is on Kubernetes, but this is irrelevant to the issue described.

@bwplotka @krasi-georgiev @metalmatze @squat @kakkoyun


brancz · 2020-05-05T15:59:40Z

This shouldn't actually be too difficult. It just needs a refactoring of the parallelizeRequests function so that it does not wait for all requests to return:

thanos/pkg/receive/handler.go, line 333 in e5804d8:

    func (h *Handler) parallelizeRequests(ctx context.Context, tenant string, replicas map[string]replica, wreqs map[string]*prompb.WriteRequest) error {
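
To illustrate the idea of returning once quorum is reached, here is a minimal, standalone sketch. It is not the Thanos implementation and not the change that eventually closed this issue. The names `replicateFunc`, `waitForQuorum`, `errQuorumUnreachable`, and `simulate` are hypothetical, and the sketch assumes each replication request can be modelled as a function that returns an error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errQuorumUnreachable is a hypothetical sentinel error for this sketch.
var errQuorumUnreachable = errors.New("quorum unreachable")

// replicateFunc is a hypothetical stand-in for forwarding one write
// request to one replica.
type replicateFunc func(ctx context.Context) error

// waitForQuorum starts all requests concurrently and returns as soon as
// `quorum` of them have succeeded, or as soon as so many have failed that
// quorum can no longer be reached. It does not wait for the rest.
func waitForQuorum(ctx context.Context, quorum int, reqs []replicateFunc) error {
	// Buffered to len(reqs) so late goroutines can still deliver their
	// result after we have returned, instead of blocking forever.
	results := make(chan error, len(reqs))
	for _, r := range reqs {
		go func(r replicateFunc) { results <- r(ctx) }(r)
	}

	successes, failures := 0, 0
	for successes < quorum {
		if len(reqs)-failures < quorum {
			return fmt.Errorf("%w: %d of %d requests failed",
				errQuorumUnreachable, failures, len(reqs))
		}
		select {
		case err := <-results:
			if err != nil {
				failures++
			} else {
				successes++
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// simulate is a hypothetical helper: replica i answers after delaysMs[i]
// milliseconds and fails if fail[i] is true. It returns how long
// waitForQuorum took and the error it returned.
func simulate(quorum int, delaysMs []int, fail []bool) (time.Duration, error) {
	reqs := make([]replicateFunc, len(delaysMs))
	for i := range delaysMs {
		d, f := time.Duration(delaysMs[i])*time.Millisecond, fail[i]
		reqs[i] = func(ctx context.Context) error {
			select {
			case <-time.After(d):
			case <-ctx.Done():
				return ctx.Err()
			}
			if f {
				return errors.New("replica unavailable")
			}
			return nil
		}
	}
	start := time.Now()
	err := waitForQuorum(context.Background(), quorum, reqs)
	return time.Since(start), err
}
```

Assuming a majority quorum of 2 out of 3 replicas, calling `simulate(2, []int{10, 20, 2000}, []bool{false, false, true})` returns after roughly the second-fastest response rather than after the two seconds the unavailable replica takes to fail. In this sketch the remaining request keeps running in the background on the caller's context; whether to cancel it or let it finish is a separate design decision.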

stale · 2020-06-04T17:56:11Z

Hello 👋 Looks like there was no activity on this issue for last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

brancz · 2020-06-09T11:28:47Z

Closed by #2621 and #2679

stale bot added the stale label Jun 4, 2020

kakkoyun added bug component: receive and removed stale labels Jun 5, 2020

brancz closed this as completed Jun 9, 2020
