This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

linghtning/backend: optimize local writer concurrency and memory usage (#753) #1075

Closed

@ti-srebot ti-srebot commented Apr 29, 2021

cherry-pick #753 to release-5.0
You can switch your code base to this Pull Request by using git-extras:

# In br repo:
git pr https://github.com/pingcap/br/pull/1075

After applying modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/br.git pr/1075:release-5.0-6fd7b9ab4612

What problem does this PR solve?

Optimize local writer performance and memory usage.

What is changed and how it works?

Optimize local writer by:

  • Open an index LocalWriter for each chunk to avoid the bottleneck when writing index KVs
  • Replace the async AppendRows with a synchronous operation to reduce memory usage and simplify the logic
  • Decrease the size of the encode/deliverLoop chan to reduce memory usage
  • Use a memory buffer to store the temporary KV pairs before writing them to an SST file
  • Manually compact small SST files into a single SST file and then ingest it into pebble
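
A minimal sketch of two of the ideas in the list above: a synchronous append into a bounded memory buffer, and merging several small sorted batches into one before a single ingest. This is not the code from the PR. All names (`kvPair`, `bufferedWriter`, `appendRows`, `compact`) are hypothetical, and in-memory slices stand in for SST files and pebble.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// kvPair is a hypothetical key/value pair produced by the encoder.
type kvPair struct {
	key, val []byte
}

// bufferedWriter is a hypothetical stand-in for a per-chunk local writer.
// It accumulates pairs in memory and flushes them as one sorted batch
// once the buffered size reaches limit.
type bufferedWriter struct {
	buf     []kvPair
	size    int
	limit   int
	flushed [][]kvPair // stands in for small SST files on disk
}

// appendRows is synchronous: it returns only after the pairs are buffered
// (and flushed, if the buffer is full), so no unbounded queue builds up.
func (w *bufferedWriter) appendRows(kvs []kvPair) {
	for _, kv := range kvs {
		w.buf = append(w.buf, kv)
		w.size += len(kv.key) + len(kv.val)
	}
	if w.size >= w.limit {
		w.flush()
	}
}

// flush sorts the buffered pairs by key and emits them as one batch.
func (w *bufferedWriter) flush() {
	if len(w.buf) == 0 {
		return
	}
	sort.Slice(w.buf, func(i, j int) bool {
		return bytes.Compare(w.buf[i].key, w.buf[j].key) < 0
	})
	w.flushed = append(w.flushed, w.buf)
	w.buf, w.size = nil, 0
}

// compact merges all flushed batches into one sorted batch, mimicking the
// manual compaction of small SST files before a single ingest.
func (w *bufferedWriter) compact() []kvPair {
	w.flush()
	var merged []kvPair
	for _, batch := range w.flushed {
		merged = append(merged, batch...)
	}
	sort.SliceStable(merged, func(i, j int) bool {
		return bytes.Compare(merged[i].key, merged[j].key) < 0
	})
	w.flushed = nil
	return merged
}

func main() {
	w := &bufferedWriter{limit: 8}
	w.appendRows([]kvPair{{[]byte("k3"), []byte("v3")}, {[]byte("k1"), []byte("v1")}})
	w.appendRows([]kvPair{{[]byte("k2"), []byte("v2")}})
	for _, kv := range w.compact() {
		fmt.Printf("%s=%s\n", kv.key, kv.val)
	}
}
```

Because `appendRows` blocks until the data is buffered or flushed, peak memory is bounded by `limit` per writer rather than by the depth of an async queue.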

Bench result:
The benchmarks were run on a 40-core machine with an NVMe disk, using the following three datasets and table schemas:

  • DataSet1. 14k warehouse tpcc data
  • DataSet2. 1k warehouse order_line table with 3 indexes. Thus each row generates 4 kvs.
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`, `ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`, `ol_w_id`, `ol_supply_w_id`)
    
  • DataSet3. 1k warehouse order_line table with 7 indexes. Thus each row generates 8 kvs.
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`,`ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`,`ol_w_id`,`ol_supply_w_id`),
    KEY `idx_o_amount` (`ol_o_id`,`ol_amount`),
    KEY `idx_d_supply` (`ol_d_id`,`ol_supply_w_id`),
    KEY `idx_o_d_i` (`ol_o_id`,`ol_d_id`,`ol_i_id`),
    KEY `idx_i_id` (`ol_i_id`)
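
The per-row KV counts follow from the index definitions listed in each schema. A small sketch of the arithmetic, assuming each row produces one record KV plus one KV per index (with the PRIMARY KEY counted as an index); `kvsPerRow` is a hypothetical helper, not part of the PR:

```go
package main

import "fmt"

// kvsPerRow returns the number of KV pairs one row produces under the
// assumption stated above: one record KV plus one KV per index.
func kvsPerRow(numIndexes int) int {
	return 1 + numIndexes
}

func main() {
	fmt.Println("DataSet2:", kvsPerRow(3)) // PRIMARY KEY + 2 secondary keys
	fmt.Println("DataSet3:", kvsPerRow(7)) // PRIMARY KEY + 6 secondary keys
}
```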
    

Bench Result:

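| DataSet  | Branch           | Peak Memory | Cost Time |
|----------|------------------|-------------|-----------|
| DataSet1 | master           | 34GB        | 2h25m     |
| DataSet1 | opt-local-writer | 29GB        | 1h46m     |
| DataSet2 | master           | 30GB        | 24m57s    |
| DataSet2 | opt-local-writer | 30GB        | 9m32s     |
| DataSet3 | master           | 64GB        | 1h22m     |
| DataSet3 | opt-local-writer | 33GB        | 33m       |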
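
To read the timings above as speedups, the durations can be divided directly. A small sketch using the values from the results (the `speedup` helper is hypothetical); it works out to roughly 1.4x, 2.6x, and 2.5x for DataSet1, DataSet2, and DataSet3 respectively:

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns how many times faster the optimized run is than the
// baseline, given two durations in Go's duration syntax (e.g. "1h22m").
func speedup(baseline, optimized string) (float64, error) {
	b, err := time.ParseDuration(baseline)
	if err != nil {
		return 0, err
	}
	o, err := time.ParseDuration(optimized)
	if err != nil {
		return 0, err
	}
	return b.Seconds() / o.Seconds(), nil
}

func main() {
	// Timings copied from the bench result for master vs opt-local-writer.
	runs := []struct{ name, master, opt string }{
		{"DataSet1", "2h25m", "1h46m"},
		{"DataSet2", "24m57s", "9m32s"},
		{"DataSet3", "1h22m", "33m"},
	}
	for _, r := range runs {
		s, err := speedup(r.master, r.opt)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %.2fx faster\n", r.name, s)
	}
}
```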

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

  • Has interface methods change

Side effects

  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch

Release Note

  • Optimize lightning local backend memory efficiency and performance

@ti-chi-bot

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by writing /cc @reviewer in a comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to a committer in the list by writing /assign @committer in a comment so that they can help merge it.

The full list of commands accepted by this bot can be found here.

Reviewers can indicate their review by writing /lgtm in a comment.
Reviewers can cancel approval by writing /lgtm cancel in a comment.

[ $(ls -1q "$TEST_DIR/$TEST_NAME.sorted" | grep -E "^\S{36}$" | wc -l) -eq 2 ]
=======
[ $(ls -1q "$TEST_DIR/$TEST_NAME.sorted" | grep -E "^\S{36}$" | wc -l) -eq 2 ]
>>>>>>> 6fd7b9ab... linghtning/backend: optimize local writer concurrency and memory usage (#753)

merge conflict

mutex sync.Mutex
=======

merge conflict


This needs another cherry-picked PR to be merged first. 😂

@glorv glorv closed this May 27, 2021
@Leavrth Leavrth modified the milestone: v5.0.2 May 27, 2021