This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

lightning/backend: optimize local writer concurrency and memory usage #753

Merged
merged 46 commits into pingcap:master from opt-local-writer on Apr 29, 2021

Conversation

glorv
Collaborator

@glorv glorv commented Feb 25, 2021

What problem does this PR solve?

Optimize local writer performance and memory usage.

What is changed and how it works?

Optimize local writer by:

  • Open a separate index LocalWriter for each chunk to avoid the bottleneck of writing index KVs
  • Replace the async AppendRows with a synchronous operation to reduce memory usage and simplify the logic
  • Shrink the encode/deliverLoop channel buffers to reduce memory usage
  • Use an in-memory buffer to store the temporary KV pairs before writing them to an SST file (see the sketch after this list)
  • Manually compact small SST files into a single SST file and then ingest it into pebble
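A minimal sketch of the buffer-then-flush idea behind the last two points. All names here are hypothetical illustrations, not code from this PR; the real writer produces SST files, for which the write callback stands in:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

type kvPair struct {
	key, val []byte
}

// memBuffer accumulates KV pairs in memory until a size limit is hit.
type memBuffer struct {
	pairs []kvPair
	size  int
	limit int
}

func (b *memBuffer) add(key, val []byte) {
	b.pairs = append(b.pairs, kvPair{key, val})
	b.size += len(key) + len(val)
}

// flush sorts the buffered pairs by key (SST files require sorted
// input), hands them to the writer callback, and resets the buffer.
func (b *memBuffer) flush(write func(kvPair)) {
	sort.Slice(b.pairs, func(i, j int) bool {
		return bytes.Compare(b.pairs[i].key, b.pairs[j].key) < 0
	})
	for _, p := range b.pairs {
		write(p)
	}
	b.pairs = b.pairs[:0]
	b.size = 0
}

func main() {
	writeToSST := func(p kvPair) { fmt.Printf("%s=%s\n", p.key, p.val) }
	buf := &memBuffer{limit: 1 << 20} // e.g. flush every 1 MiB
	for i := 3; i > 0; i-- {
		buf.add([]byte(fmt.Sprintf("t_%03d", i)), []byte("v"))
		if buf.size >= buf.limit {
			buf.flush(writeToSST)
		}
	}
	buf.flush(writeToSST) // final flush for the remainder
}
```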

Bench result:
The benchmarks were run on a 40-core machine with an NVMe disk, using the following three data sets and table schemas:

  • DataSet1. 14k-warehouse TPC-C data
  • DataSet2. 1k-warehouse order_line table with 3 indexes (so each row generates 4 KVs):
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`, `ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`, `ol_w_id`, `ol_supply_w_id`)
    
  • DataSet3. 1k-warehouse order_line table with 7 indexes (so each row generates 8 KVs):
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`,`ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`,`ol_w_id`,`ol_supply_w_id`),
    KEY `idx_o_amount` (`ol_o_id`,`ol_amount`),
    KEY `idx_d_supply` (`ol_d_id`,`ol_supply_w_id`),
    KEY `idx_o_d_i` (`ol_o_id`,`ol_d_id`,`ol_i_id`),
    KEY `idx_i_id` (`ol_i_id`)
    

Bench Result:

| DataSet  | Branch           | Peak Memory | Cost Time |
|----------|------------------|-------------|-----------|
| DataSet1 | master           | 34GB        | 2h25m     |
| DataSet1 | opt-local-writer | 29GB        | 1h46m     |
| DataSet2 | master           | 30GB        | 24m57s    |
| DataSet2 | opt-local-writer | 30GB        | 9m32s     |
| DataSet3 | master           | 64GB        | 1h22m     |
| DataSet3 | opt-local-writer | 33GB        | 33m       |

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

  • Has interface methods change

Side effects

  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch

Release Note

  • Optimize lightning local backend memory efficiency and performance

@glorv glorv force-pushed the opt-local-writer branch 2 times, most recently from 9f34e01 to 0f2442c on March 1, 2021 03:06
@kennytm
Collaborator

kennytm commented Apr 29, 2021

/merge

@ti-chi-bot
Member

@kennytm: /merge in this pull request requires 3 /lgtm.

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@kennytm
Collaborator

kennytm commented Apr 29, 2021

Huh.

Collaborator

@kennytm kennytm left a comment


rest LGTM

@@ -434,7 +820,7 @@ func NewLocalBackend(
g glue.Glue,
maxOpenFiles int,
) (backend.Backend, error) {
-	localFile := cfg.SortedKVDir
+	File := cfg.SortedKVDir
Collaborator


Why rename the variable? At least use a lowercase name.

Collaborator Author


This must have been done by the IDE when I refactored the LocalFile struct.

Collaborator


[image]

Collaborator Author


😂😂😂

@@ -628,27 +1036,62 @@ func (local *local) openEngineDB(engineUUID uuid.UUID, readOnly bool) (*pebble.DB, error) {
// set threshold to half of the max open files to avoid trigger compaction
L0CompactionThreshold: math.MaxInt32,
L0StopWritesThreshold: math.MaxInt32,
LBaseMaxBytes: 16 << 40,
Collaborator


Suggested change
-	LBaseMaxBytes: 16 << 40,
+	LBaseMaxBytes: 16 * units.TiB,

Comment on lines 1050 to 1051
// 16GB
TargetFileSize: 16 << 30,
Collaborator


Suggested change
-	// 16GB
-	TargetFileSize: 16 << 30,
+	TargetFileSize: 16 * units.GiB,
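For context, a hedged sketch of the resulting options block, assuming the docker/go-units constants the suggestions refer to; whether the Levels slice is configured this way in the actual PR is an assumption, not confirmed by the diff:

```go
package main

import (
	"math"

	"github.com/cockroachdb/pebble"
	"github.com/docker/go-units"
)

// newImportOptions pushes pebble's compaction triggers out of reach so
// that Lightning stays in full control of when compaction happens.
func newImportOptions() *pebble.Options {
	return &pebble.Options{
		// So high that pebble never starts a background compaction
		// or stalls writes on its own.
		L0CompactionThreshold: math.MaxInt32,
		L0StopWritesThreshold: math.MaxInt32,
		// Effectively keep everything in L0.
		LBaseMaxBytes: 16 * units.TiB,
		Levels: []pebble.LevelOptions{
			{TargetFileSize: 16 * units.GiB},
		},
	}
}

func main() {
	_ = newImportOptions()
}
```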

Comment on lines 2748 to 2752
iter, err := reader.NewIter([]byte{'t'}, []byte{'u'})
if err != nil {
return nil, errors.Trace(err)
}
key, val := iter.SeekGE([]byte{'t'})
Collaborator


🤔 Seems unnecessary to restrict the range; we don't store non-'t'-prefixed keys in the SST.

Collaborator


@glorv (line 2748 is included)

Collaborator Author


I assumed NewIter must receive non-empty bounds, but it turns out we can just pass nil, nil instead. 😂
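For reference, a sketch of that follow-up, reusing the reader and errors identifiers from the diff above; process is a hypothetical placeholder, and the exact iterator method signatures depend on the pebble version in use:

```go
// Iterate the whole SST with unbounded nil, nil bounds instead of
// restricting to the ['t', 'u') range.
iter, err := reader.NewIter(nil, nil)
if err != nil {
	return nil, errors.Trace(err)
}
// First/Next return (key, value) pairs in the same style as the
// SeekGE call in the diff above.
for key, val := iter.First(); key != nil; key, val = iter.Next() {
	process(key, val) // hypothetical handler for each KV pair
}
```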

Comment on lines 84 to 85
compactionLowerThreshold = 512 << 20 // 512M
compactionUpperThreshold = 32 << 30 // 32GB
Collaborator


Suggested change
-	compactionLowerThreshold = 512 << 20 // 512M
-	compactionUpperThreshold = 32 << 30  // 32GB
+	compactionLowerThreshold = 512 * units.MiB
+	compactionUpperThreshold = 32 * units.GiB
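For illustration, a minimal runnable sketch (a hypothetical helper, not code from this PR) of how such thresholds could gate the manual compaction described in the PR body: files below the lower threshold are merge candidates, and a merge batch never exceeds the upper threshold:

```go
package main

import "fmt"

const (
	compactionLowerThreshold = 512 << 20 // 512 MiB
	compactionUpperThreshold = 32 << 30  // 32 GiB
)

// pickCompactionBatch gathers small SST files (below the lower
// threshold) into one batch to merge, stopping before the merged
// output would exceed the upper threshold.
func pickCompactionBatch(fileSizes []int64) (batch []int) {
	var total int64
	for i, size := range fileSizes {
		if size >= compactionLowerThreshold {
			continue // large enough to ingest directly
		}
		if total+size > compactionUpperThreshold {
			break
		}
		total += size
		batch = append(batch, i)
	}
	return batch
}

func main() {
	sizes := []int64{100 << 20, 600 << 20, 50 << 20}
	fmt.Println(pickCompactionBatch(sizes)) // [0 2]
}
```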

@glorv
Collaborator Author

glorv commented Apr 29, 2021

Huh.

I added the require-LGT3 label to this PR since the change is quite big. Please approve this PR so that we can merge it. @kennytm 😅

@kennytm
Collaborator

kennytm commented Apr 29, 2021

@glorv see above 😅

@glorv
Collaborator Author

glorv commented Apr 29, 2021

/run-all-tests

@kennytm
Collaborator

kennytm commented Apr 29, 2021

/lgtm

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Little-Wallace
  • kennytm
  • lance6716

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT3 LGTM3. This PR looks very good to our bot. and removed status/LGT2 LGTM2 labels Apr 29, 2021
@kennytm
Collaborator

kennytm commented Apr 29, 2021

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 1dff359

@ti-chi-bot ti-chi-bot merged commit 6fd7b9a into pingcap:master Apr 29, 2021
ti-srebot pushed a commit to ti-srebot/br that referenced this pull request Apr 29, 2021
@ti-srebot
Contributor

cherry pick to release-4.0 in PR #1074

ti-srebot pushed a commit to ti-srebot/br that referenced this pull request Apr 29, 2021
@ti-srebot
Contributor

cherry pick to release-5.0 in PR #1075

@glorv glorv deleted the opt-local-writer branch April 29, 2021 10:00
@glorv glorv restored the opt-local-writer branch April 29, 2021 11:45
@glorv glorv deleted the opt-local-writer branch April 29, 2021 11:46