feat: initial twcs impl #1851

Merged
merged 12 commits into from
Jul 4, 2023

@v0y4g3r v0y4g3r commented Jun 29, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This PR adds support for the time-window compaction strategy (TWCS), inspired by Cassandra.

In the previous leveled time-window compaction strategy, SST files are strictly split and aligned to time windows. This may cause read amplification, especially when an SST file contains rows from multiple time windows, even if the out-of-order data is sparse.

TWCS assigns each SST file to a time window by the file's max timestamp. Out-of-order time-series data is mostly historical data, so it only affects a file's min timestamp and does not change the window the file falls into.
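The window assignment described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SstTimeRange`, `window_start`, and `file_window` are hypothetical names, and timestamps and the window size are assumed to be plain seconds.

```rust
/// Hypothetical, simplified view of an SST file: only its time range (seconds).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SstTimeRange {
    pub min_ts: i64,
    pub max_ts: i64,
}

/// Aligns a timestamp down to the start of the window that contains it.
/// `rem_euclid` keeps the alignment correct for negative timestamps as well.
pub fn window_start(ts: i64, window_secs: i64) -> i64 {
    assert!(window_secs > 0, "window size must be positive");
    ts - ts.rem_euclid(window_secs)
}

/// Only the max timestamp decides the window; `min_ts` is ignored on purpose,
/// so sparse out-of-order (historical) rows do not move the file.
pub fn file_window(file: &SstTimeRange, window_secs: i64) -> i64 {
    window_start(file.max_ts, window_secs)
}
```

With a one-hour window, a file covering `[100, 3700]` lands in the window starting at `3600`, even though its oldest row belongs to the previous hour.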

After assigning files into time windows, we then traverse every window to ensure that:

  • for the window that is actively being written, we allow multiple SST files to alleviate write amplification;
  • for all other windows, we allow at most one file per window for better read performance.
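The traversal above might look like the following sketch. It is written under stated assumptions rather than taken from the PR: `SstFile`, `assign_windows`, and `pick_compactions` are hypothetical names, the active window is assumed to be the newest window that contains any file, and the per-window limits are passed in as parameters.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified SST file handle.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SstFile {
    pub id: u32,
    pub max_ts: i64,
}

/// Groups files by the start of the window that contains their max timestamp.
pub fn assign_windows(files: &[SstFile], window_secs: i64) -> BTreeMap<i64, Vec<SstFile>> {
    assert!(window_secs > 0, "window size must be positive");
    let mut windows: BTreeMap<i64, Vec<SstFile>> = BTreeMap::new();
    for f in files {
        let start = f.max_ts - f.max_ts.rem_euclid(window_secs);
        windows.entry(start).or_default().push(f.clone());
    }
    windows
}

/// Returns one group of files to merge for every window that holds more
/// files than its limit allows.
pub fn pick_compactions(
    windows: &BTreeMap<i64, Vec<SstFile>>,
    max_active_files: usize,
    max_inactive_files: usize,
) -> Vec<Vec<SstFile>> {
    // Assumption: the active window is the newest window with at least one file.
    let active = windows.keys().next_back().copied();
    windows
        .iter()
        .filter(|(start, files)| {
            let limit = if Some(**start) == active {
                max_active_files
            } else {
                max_inactive_files
            };
            files.len() > limit
        })
        .map(|(_, files)| files.clone())
        .collect()
}
```

Keeping the windows in a `BTreeMap` orders them by start time, which makes the newest window cheap to find and keeps the traversal deterministic.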

TWCS can be enabled with table options:

CREATE TABLE IF NOT EXISTS cpu_metrics (
    hostname STRING,
    environment STRING,
    usage_user DOUBLE,
    usage_system DOUBLE,
    usage_idle DOUBLE,
    ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    TIME INDEX(ts),
    PRIMARY KEY(hostname, environment)

) WITH ('compaction'='twcs', 'compaction.twcs.max_active_window_files'=12, 'compaction.twcs.max_inactive_window_files'=3);
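As a rough illustration of how such table options could be turned into a typed configuration, here is a small sketch. It does not reflect the actual parsing code in this PR: `TwcsOptions` and `parse_twcs_options` are hypothetical, and the fallback values `4` and `1` are placeholders chosen for the example, not GreptimeDB's defaults. Only the option keys are taken from the statement above.

```rust
use std::collections::HashMap;

/// Hypothetical typed view of the TWCS table options.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TwcsOptions {
    pub max_active_window_files: usize,
    pub max_inactive_window_files: usize,
}

/// Returns `Ok(None)` when the table does not select TWCS, and an error
/// message when a TWCS option cannot be parsed as a non-negative integer.
pub fn parse_twcs_options(
    opts: &HashMap<String, String>,
) -> Result<Option<TwcsOptions>, String> {
    if opts.get("compaction").map(String::as_str) != Some("twcs") {
        return Ok(None);
    }
    // Reads an integer option, falling back to a placeholder default.
    let read = |key: &str, default: usize| -> Result<usize, String> {
        match opts.get(key) {
            Some(v) => v
                .parse::<usize>()
                .map_err(|e| format!("invalid value for {key}: {e}")),
            None => Ok(default),
        }
    };
    Ok(Some(TwcsOptions {
        // The defaults below are illustrative assumptions only.
        max_active_window_files: read("compaction.twcs.max_active_window_files", 4)?,
        max_inactive_window_files: read("compaction.twcs.max_inactive_window_files", 1)?,
    }))
}
```

Returning `Option` inside `Result` separates "TWCS not requested" from "TWCS requested with a malformed value", so a caller can fall back to another strategy in the first case and reject the `CREATE TABLE` in the second.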

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

Fixes #1068

@v0y4g3r force-pushed the feat/twcs branch 4 times, most recently from ccb6721 to b4ffa07 on June 30, 2023 04:51
@v0y4g3r v0y4g3r marked this pull request as ready for review June 30, 2023 04:51

codecov bot commented Jun 30, 2023

Codecov Report

Merging #1851 (3179778) into develop (b8e9229) will decrease coverage by 0.32%.
The diff coverage is 83.78%.

@@             Coverage Diff             @@
##           develop    #1851      +/-   ##
===========================================
- Coverage    86.48%   86.17%   -0.32%     
===========================================
  Files          595      596       +1     
  Lines        96640    97143     +503     
===========================================
+ Hits         83582    83710     +128     
- Misses       13058    13433     +375     

@v0y4g3r v0y4g3r added docs-required This change requires docs update. C-performance Category Performance Size: L labels Jul 4, 2023

@waynexia waynexia left a comment

🚀

@v0y4g3r v0y4g3r merged commit 3b6f70c into GreptimeTeam:develop Jul 4, 2023
paomian pushed a commit to paomian/greptimedb that referenced this pull request Oct 19, 2023
* feat: initial twcs impl

* chore: rename SimplePicker to LeveledPicker

* rename some structs

* Remove Compaction strategy

* make compaction picker a trait object

* make compaction picker configurable for every region

* chore: add some test for ttl

* add some tests

* fix: some style issues in cr

* feat: enable twcs when creating tables

* feat: allow config time window when creating tables

* fix: some cr comments
Development

Successfully merging this pull request may close these issues.

Time Window CompactionStrategy
4 participants