
experiment: streaming compaction #3438

Merged 4 commits into experiment/new-architecture from experiment/block-merge on Jul 23, 2024

Conversation

kolesnikovae
Collaborator

No description provided.

@kolesnikovae kolesnikovae marked this pull request as ready for review July 23, 2024 10:36
@kolesnikovae kolesnikovae requested a review from a team as a code owner July 23, 2024 10:36
@kolesnikovae kolesnikovae merged commit aef3136 into experiment/new-architecture Jul 23, 2024
11 of 14 checks passed
@kolesnikovae kolesnikovae deleted the experiment/block-merge branch July 23, 2024 10:36
kolesnikovae added a commit that referenced this pull request Aug 13, 2024
* add metastore API definition scratch

* Add metastore API definition (#3391)

* WIP: Add dummy metastore (#3394)

* Add dummy metastore

* Add metastore client (#3397)

* experiment: new architecture deployment (#3401)

* remove query-worker service for now

* fix metastore args

* enable persistence for metastore

* WIP: Distribute profiles based on tenant id, service name and labels (#3400)

* Distribute profiles to shards based on tenant/service_name/labels with no replication
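For orientation, a hedged sketch of shard selection along these lines (all names are hypothetical; per the commits below, the actual code went through a ring with a configurable token count rather than a plain modulo):

```go
// Illustrative only: pick a shard for a profile series by hashing
// tenant ID, service name, and sorted labels.
package sharding

import (
	"hash/fnv"
	"sort"
)

func ShardFor(tenantID, serviceName string, labels map[string]string, shardCount uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(tenantID))
	h.Write([]byte(serviceName))
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order so equal label sets hash equally
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(labels[k]))
	}
	return h.Sum32() % shardCount
}
```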

* Add retries in case of delivery issues (WIP)

* Calculate shard for ingested profiles, send to ingesters in push request

* Set replication factor and token count via config

* Fix tests

* Run make helm/check check/unstaged-changes

* Run make reference-help

* Simplify shard calculation

* Add a metric for distributed bytes

* Register metric

* Revert undesired change

* metastore bootstrap include self

* fix ingester ring replication factor

* delete helm workflows

* wip: segments writer (#3405)

* start working on segments writer

add shards

awaiting segment flush in requests

upload blocks

add some tracing

* upload meta in case of metastore error

* do not upload metadata to dlq

* add some flags

* skip some tests. fmt

* skip e2e tests

* maybe fix microservices_test.go. I have no idea what I'm doing

* change partition selection

* rm e2e yaml

* fmt

* add compaction planner API definition

* rm unnecessary nested dirs

* debug log head stats

* more debug logs

* fix skipping empty head

* fix tests

* pass shards

* more debug logs

* fix nil deref in ingester

* more debugs logs in segmentsWriter

* start collecting some metrics for segments

* hack: purge stale segments

* hack: purge stale segments on the leader

* more segment metrics

* more segment metrics

* more segment metrics

* more segment metrics

* make fmt

* more segment metrics

* fix panic caused by the metric with the same name

* more segment metrics

* more segment metrics

* make fmt

* decrease page buffer size

* decrease page buffer size, tsdb buffer writer size

* separate parquet write buffer size for segments and compacted blocks

* separate index write buffer size for segments and compacted blocks
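For the parquet side of this, a hedged sketch of how per-path buffer sizes can be wired through parquet-go's writer options (the size constants are illustrative, not the values used here):

```go
// Sketch: give segment writers a smaller page buffer than compaction
// writers, since segments are small and short-lived while compacted
// blocks are large sequential writes. Values are assumptions.
package store

import (
	"io"

	"github.com/parquet-go/parquet-go"
)

const (
	segmentPageBufferSize    = 256 << 10 // illustrative
	compactionPageBufferSize = 4 << 20   // illustrative
)

func newSegmentWriter[T any](w io.Writer) *parquet.GenericWriter[T] {
	return parquet.NewGenericWriter[T](w, parquet.PageBufferSize(segmentPageBufferSize))
}

func newCompactionWriter[T any](w io.Writer) *parquet.GenericWriter[T] {
	return parquet.NewGenericWriter[T](w, parquet.PageBufferSize(compactionPageBufferSize))
}
```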

* improve segment metrics - decrease cardinality by removing service as label

* fix head metrics recreation via phlarectx usage ;(

* try to pool newParquetProfileWriter

* Revert "try to pool newParquetProfileWriter"

This reverts commit d91e3f1.

* decrease tsdb index buffers

* decrease tsdb index buffers

* experiment: add query backend (#3404)

* Add query-backend component

* add generated code for connect

* add mergers

* block reader scratches

* block reader updates

* query frontend integration

* better query planning

* tests and fixes

* improve API

* profilestore: use single row group

* profile_store: pool profile parquet writer

* profiles parquet encoding: fix profile column count

* segments: rewrite shards to flush independently

* make fmt

* segments: flush heads concurrently
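A minimal sketch of concurrent head flushing with errgroup (Head and Flush are stand-ins, not the actual types):

```go
package segment

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Head is a stand-in for a per-shard in-memory head block.
type Head interface {
	Flush(ctx context.Context) error
}

// flushHeads flushes all heads concurrently and returns the first error;
// the shared context is cancelled as soon as any flush fails.
func flushHeads(ctx context.Context, heads []Head) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, h := range heads {
		h := h // capture loop variable (needed before Go 1.22)
		g.Go(func() error { return h.Flush(ctx) })
	}
	return g.Wait()
}
```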

* segments: tmp debug log

* segments: change wait duration metric buckets

* add inmemory tsdb index writer

* rm debug log

* use inmemory index writer

* remove FileWriter from inmem index

* inmemory tsdb index writer: reuse buffers through pool

* inmemory tsdb index writer: preallocate initial buffers
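A minimal sketch of the buffer-reuse pattern these two commits describe, assuming sync.Pool and an illustrative initial capacity:

```go
package index

import (
	"bytes"
	"sync"
)

// bufPool hands out preallocated buffers so the writer does not regrow
// them on every use. The 1 MiB initial capacity is an assumption.
var bufPool = sync.Pool{
	New: func() any {
		return bytes.NewBuffer(make([]byte, 0, 1<<20))
	},
}

func withBuffer(fn func(*bytes.Buffer)) {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // keep capacity, drop contents
		bufPool.Put(buf)
	}()
	fn(buf)
}
```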

* segment: concat files with preallocated buffers

* experiment: query backend block reader (#3423)

* simplify report merge handling

* implement query context and tenant service section

* implement LabelNames and LabelValues

* bind tsdb query api

* bind time series query api

* better caching

* bind stack trace api

* implement tree query

* fix offset shift

* update helm chart

* tune buffers

* add block object size attribute

* serve profile type query from metastore

* tune grpc server config

* segment: try to use memfs

* Revert "segment: try to use memfs"

This reverts commit 798bb9d.

* tune s3 http client

Before getting too deep with a custom TLS VerifyConnection function, it makes sense to ensure that we reuse connections as much as possible
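For reference, a sketch of the kind of transport tuning implied here, using Go's net/http; the values are illustrative:

```go
package client

import (
	"net/http"
	"time"
)

// newS3Transport raises idle-connection limits so S3 requests reuse
// established TLS connections instead of re-dialing under load.
func newS3Transport() *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	t.MaxIdleConns = 200
	t.MaxIdleConnsPerHost = 200 // the default of 2 forces constant re-dialing
	t.IdleConnTimeout = 90 * time.Second
	return t
}
```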

* WIP: Compaction planning in metastore (#3427)

* First iteration of compaction planning in metastore

* Second iteration of compaction planning in metastore

* Add GetCompactionJobs back

* Create and persist jobs in the same transaction as blocks

* Add simple logging for compaction planning

* Fix bug

* Remove unused proto message

* Remove unused raft command type

* Add a simple config for compaction planning

* WIP (new architecture): Add compaction workers, Integrate with planner (#3430)

* Add compaction workers, Integrate with planner (wip)

* Fix test

* add compaction-worker service

* refactor compaction-worker out of metastore

* prevent bootstrapping a single node cluster on silent DNS failures

* Scope compaction planning to shard+tenant

* Improve state handling for compaction job status updates

* Remove import

* Reduce parquet buffer size for compaction workers

* Fix another case of compaction job state inconsistency

* refactor block reader out

* handle nil blocks more carefully

* add job priority queue with lease expiration and fencing token
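A hedged sketch of the idea, assuming container/heap and illustrative names: jobs are leased for a fixed duration, and each (re)assignment bumps a fencing token so a stale worker's updates can be told apart from the current owner's:

```go
package queue

import (
	"container/heap"
	"time"
)

type Job struct {
	Name         string
	LeaseExpires time.Time // zero if never assigned
	FencingToken int64     // bumped on every (re)assignment
}

// jobQueue is a min-heap ordered by lease expiry, so unassigned and
// expired jobs surface first. Callers must heap.Init it.
type jobQueue []*Job

func (q jobQueue) Len() int            { return len(q) }
func (q jobQueue) Less(i, j int) bool  { return q[i].LeaseExpires.Before(q[j].LeaseExpires) }
func (q jobQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *jobQueue) Push(x any)         { *q = append(*q, x.(*Job)) }
func (q *jobQueue) Pop() any           { old := *q; n := len(old); j := old[n-1]; *q = old[:n-1]; return j }

var tokens int64

// assign hands out the job whose lease expired earliest (or which was
// never assigned), renewing its lease and fencing token in place.
func assign(q *jobQueue, now time.Time, lease time.Duration) *Job {
	if q.Len() == 0 || (*q)[0].LeaseExpires.After(now) {
		return nil // every job is still validly leased
	}
	j := (*q)[0]
	tokens++
	j.FencingToken = tokens
	j.LeaseExpires = now.Add(lease)
	heap.Fix(q, 0)
	return j
}
```

A status update would then be accepted only if it carries the job's current fencing token, which fences off workers whose lease has expired and been reassigned.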

* disable boltdb sync

We only use it to make snapshots
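The bbolt knob for this is NoSync; a minimal sketch:

```go
package state

import bolt "go.etcd.io/bbolt"

func openDB(path string) (*bolt.DB, error) {
	db, err := bolt.Open(path, 0o644, nil)
	if err != nil {
		return nil, err
	}
	// Durability comes from raft snapshots, not per-transaction fsync.
	db.NoSync = true
	return db, nil
}
```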

* extend raft handlers with the raft log command

* Add basic compaction metrics

* Improve job assignments and status update logic

* Remove segment truncation command

* Make compaction worker job capacity configurable

* Fix concurrent map access

* Fix metric names

* segment: change segment duration from 1s to 500ms

* update request_duration_seconds buckets

* update request_duration_seconds buckets

* add an explicit parameter that controls how many raft peers to expect

* fix the explicit parameter that controls how many raft peers to expect

* temporarily revert temporary hack

I'm reverting it temporarily to protect the metastore from running out of memory

* add some more metrics

* add some pprof tags for easier visibility

* add some pprof tags for easier visibility

* add block merge draft

* add block merge draft

* update metrics buckets again

* update metrics buckets again

* Address minor consistency issues, improve naming, in-progress updates

* increase boltdb InitialMmapSize

* Improve metrics, logging

* Decrease buffer sizes further, remove completed jobs

* Scale up compaction workers and their concurrency

* experiment: implement shard-aware series distribution (#3436)

* tune boltdb snapshotting

- increase initial mmap size
- keep fewer entries in the raft log
- trigger more frequently
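
Assuming hashicorp/raft and bbolt, these knobs map roughly to the following (values are illustrative, not the ones used here):

```go
package metastore

import (
	"time"

	"github.com/hashicorp/raft"
	bolt "go.etcd.io/bbolt"
)

func tuned() (*raft.Config, *bolt.Options) {
	cfg := raft.DefaultConfig()
	cfg.TrailingLogs = 1024                 // keep fewer entries in the raft log
	cfg.SnapshotInterval = 30 * time.Second // check for snapshot work more often
	cfg.SnapshotThreshold = 512             // snapshot after fewer new entries

	opts := *bolt.DefaultOptions
	opts.InitialMmapSize = 1 << 30 // avoid remapping as the store grows
	return cfg, &opts
}
```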

* compact job regardless of the block size

* ingester & distributor scaling

* update manifests

* metastore ready check

* make fmt

* Revert "make fmt"

This reverts commit 8a55391.

* Revert "metastore ready check"

This reverts commit 98b05da.

* experiment: streaming compaction (#3438)

* experiment: stream compaction

* fix stream compaction

* fix parquet footer optimization

* Persist compaction job pre-queue

* tune compaction-worker capacity

* Fix bug where compaction jobs with level 1 and above are not created

* Remove blocks older than 12 hours (instead of 30 minutes)

* Fix deadlock when restoring compaction jobs

* Add basic metrics for compaction workers

* Load compacted blocks in metastore on restore

* experimenting with object prefetch size

* experimenting with object prefetch

* trace read path

* metastore readycheck

* metastore readycheck

* metastore readycheck. trigger rollout

* metastore readycheck. trigger rollout

* segments, include block id in errors

* metastore: log addBlock error

* segments: maybe fix retries

* segments: maybe fix retries

* segments: more debug logging

* refactor query result aggregation

* segments: more debug logging

* segments: more debug logging

* tune resource requests

* tune compaction worker job capacity

* fix time series step unit

* Update golang version to 1.22.4

* enable grpc tracing for ingesters

* expose /debug/requests

* more debug logs

* reset state when restoring from snapshot

* Add debug logging

* Persist job raft log index and lease expiry after assignment

* Scale up a few components in the dev environment

* more debug logs

* metastore client: resolve addresses from endpointslice instead of dns

* Update frontend_profile_types.go

* metastore: add extra delay for readiness check

* metastore: add extra delay for readiness check

* metastore client: more debug log

* fix compaction

* stream statistics tracking: add HeavyKeeper implementation
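HeavyKeeper is a top-k sketch. For orientation, a minimal single-row sketch of its decay rule (the full structure uses several hash rows plus a min-heap of the current top-k; all names here are illustrative):

```go
package topk

import (
	"hash/fnv"
	"math"
	"math/rand"
)

type bucket struct {
	fingerprint uint64
	count       uint32
}

type HeavyKeeper struct {
	buckets []bucket
	decay   float64 // > 1; larger means heavier penalty on collisions
}

func New(width int) *HeavyKeeper {
	return &HeavyKeeper{buckets: make([]bucket, width), decay: 1.08}
}

// Add counts one occurrence of key and returns its estimated count,
// or 0 if the key does not currently own its bucket.
func (hk *HeavyKeeper) Add(key string) uint32 {
	h := fnv.New64a()
	h.Write([]byte(key))
	fp := h.Sum64()
	b := &hk.buckets[fp%uint64(len(hk.buckets))]
	switch {
	case b.count == 0:
		b.fingerprint, b.count = fp, 1
	case b.fingerprint == fp:
		b.count++
	default:
		// Exponential decay: low counts are evicted quickly on
		// collision, while heavy hitters survive.
		if rand.Float64() < math.Pow(hk.decay, -float64(b.count)) {
			b.count--
			if b.count == 0 {
				b.fingerprint, b.count = fp, 1
			}
		}
	}
	if b.fingerprint == fp {
		return b.count
	}
	return 0
}
```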

* Bump compaction worker resources

* Bump compaction worker resources

* improve read path load distribution

* handle compaction synchronously

* revert CI/CD changes

* isolate experimental code

* rollback initialization changes

* rollback initialization changes

* isolate distributor changes

* isolate ingester changes

* cleanup experimental code

* remove large tsdb fixture copy

* update cmd tests

* revert Push API changes

* cleanup dependencies

* cleanup gitignore

* fix reviewdog

* go mod tidy

* make generate

* revert changes in tsdb/index.go

* Revert "revert changes in tsdb/index.go"

This reverts commit 2188cde.

---------

Co-authored-by: Aleksandar Petrov <[email protected]>
Co-authored-by: Tolya Korniltsev <[email protected]>