feat: change thread scheduling method in ThreadPool class #2648

Open

wants to merge 24 commits into base: unstable

Conversation

QlQlqiqi
Contributor

@QlQlqiqi QlQlqiqi commented May 12, 2024

The logic is based on the function WriteThread::AwaitState in RocksDB (link).

Before:

  1. All workers and the main thread that pushes tasks into the queue wait on the same lock, which can cause very intense contention.
  2. When a worker finishes a task, it tries to acquire the lock again for a new task via the await function. Under intense contention the worker is likely to be put to sleep, and sleeping and waking up cost a lot of time.

After:

  1. This is a standard producer-consumer model, so we can use a lock-free list to relieve the intense contention.
  2. When a worker wakes up, it tries to fetch tasks. If it finds none, it loops for a while to wait for new ones. Under high throughput the wait for new tasks is very short, so this loop does NOT cause serious blocking. To further reduce the blocking time, the loop has 3 levels (see the sketch after this list):
    2.1. Level 1: spin-wait.
    2.2. Level 2: a longer yield loop. The worker may yield the CPU when certain conditions are met, and a credit value stores the probability of entering the level-2 loop.
    2.3. Level 3: wait for new tasks via the await function.
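As a rough illustration, a worker's fetch path with these three levels might look like the sketch below. It borrows the names newest_node_ and rsignal_ and the defaults above from this PR, but the simplified structure (no yield_credit adaptation, no timer tasks) is an assumption, not the actual implementation:

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

struct Node { Node* link_older = nullptr; /* task payload omitted */ };

std::atomic<Node*> newest_node_{nullptr};
std::mutex mu_;
std::condition_variable rsignal_;
std::atomic<bool> should_stop_{false};

Node* FetchBatch() {
  // Level 1: short spin loop (default 200 tries; the PR spins with
  // AsmVolatilePause() between probes).
  for (uint32_t tries = 0; tries < 200; ++tries) {
    if (newest_node_.load(std::memory_order_acquire) != nullptr) {
      if (Node* last = newest_node_.exchange(nullptr)) return last;
    }
  }
  // Level 2: bounded yield loop (max_yield_usec_, default 100 us).
  auto deadline = std::chrono::steady_clock::now() + std::chrono::microseconds(100);
  while (std::chrono::steady_clock::now() < deadline) {
    if (Node* last = newest_node_.exchange(nullptr)) return last;
    std::this_thread::yield();
  }
  // Level 3: block on the condition variable until a producer signals.
  std::unique_lock<std::mutex> lock(mu_);
  rsignal_.wait(lock, [] {
    return newest_node_.load(std::memory_order_acquire) != nullptr ||
           should_stop_.load(std::memory_order_relaxed);
  });
  return newest_node_.exchange(nullptr);
}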

params

  1. the count of the level-1 loop:
    default: 200. Too large a value may cause high CPU load; too small a value may make the spin futile.
  2. queue_slow_size_:
    default: std::min(worker_num, 100). When the number of tasks in the queue exceeds it, the main thread calling Schedule calls std::this_thread::yield().
  3. max_queue_size_:
    default: max_queue_size. When the number of tasks in the queue exceeds it, the main thread calling Schedule calls std::this_thread::yield() until the number of queued tasks drops below the threshold.
  4. max_yield_usec_:
    default: 100. The maximum duration (in microseconds) of the level-2 loop.
  5. slow_yield_usec_:
    default: 3. If a std::this_thread::yield() call takes longer than this threshold (in microseconds), the credit data (yield_credit) may be updated.
  6. kMaxSlowYieldsWhileSpinning:
    default: 3. Once condition (5) above has been hit this many times, the credit data is updated.
  7. sampling_base:
    default: 256. It ensures the probability of entering the level-2 loop is not lower than 1/sampling_base.
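For example, the last three parameters cooperate roughly as follows. This is a hypothetical sketch modeled on RocksDB's AwaitState (the exact decision logic in this PR may differ): the worker enters the level-2 loop whenever yield_credit is non-negative, and otherwise only with probability about 1/sampling_base, so the credit can recover when conditions change:

#include <atomic>
#include <cstdint>
#include <random>

// Hypothetical: decide whether to enter the level-2 yield loop.
bool ShouldEnterLevel2(std::atomic<int32_t>& yield_credit,
                       uint32_t sampling_base = 256) {
  if (yield_credit.load(std::memory_order_relaxed) >= 0) {
    return true;  // yielding has recently paid off
  }
  // Sample so level 2 is still entered with probability >= 1/sampling_base.
  thread_local std::mt19937 rng{std::random_device{}()};
  return rng() % sampling_base == 0;
}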

Summary by CodeRabbit

  • Refactor

    • Major restructuring of the thread pool's internal architecture for improved task handling, worker management, and synchronization.
    • Significant changes to the ThreadPool and Worker classes for better performance and scalability.
  • New Features

    • Introduced methods for task execution, delayed scheduling, and synchronization within the thread pool.
  • Improvements

    • Enhanced logic for managing multiple links and initializing workers.
    • Updated task scheduling mechanisms to support better performance and reliability.
    • Increased thread and pool sizes for improved resource utilization and performance.

@QlQlqiqi QlQlqiqi changed the title change thread scheduling method in ThreadPool class feat: change thread scheduling method in ThreadPool class May 12, 2024
@github-actions github-actions bot added the ✏️ Feature New feature or request label May 12, 2024
@QlQlqiqi QlQlqiqi requested a review from AlexStocks May 12, 2024 06:43
// 1. loop for short time
for (uint32_t tries = 0; tries < 200; ++tries) {
if (newest_node_.load(std::memory_order_acquire) != nullptr) {
last = newest_node_.exchange(nullptr);
Collaborator

Here the thread that arrives first takes the entire linked list for itself and then consumes it sequentially, which can cause large latency fluctuations. I suggest distributing tasks as evenly as possible among the workers in the thread pool. After all, Pika's read/write path runs entirely on its own threads, which differs a lot from RocksDB's threading model (in RocksDB it is the application threads that contend on each writer), so this part may need more consideration.

Contributor Author

I thought about this. There are roughly two ways to take only a limited number of tasks at a time (see the sketch after this list):
1. one lock-free list per worker, with new tasks added to these lists randomly or by round-robin traversal;
2. keep a single lock-free list, but give it a small capacity, say 10, so that one worker takes at most 10 tasks at a time.
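A minimal sketch of approach 1 (multiple lock-free list heads with producers distributing tasks across them). The names LinkOne and newest_node_ follow this PR, while the fixed list count and the round-robin index are illustrative assumptions:

#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

struct Node { Node* link_older = nullptr; /* task payload omitted */ };

constexpr size_t kNumLinks = 4;  // assumed: e.g. one list per worker (group)
std::array<std::atomic<Node*>, kNumLinks> newest_node_{};  // heads start null
std::atomic<uint64_t> task_idx_{0};

// Push one node onto a list head; returns true if the list was empty before.
bool LinkOne(Node* node, std::atomic<Node*>* newest_node) {
  Node* nodes = newest_node->load(std::memory_order_relaxed);
  while (true) {
    node->link_older = nodes;
    if (newest_node->compare_exchange_weak(nodes, node)) {
      return nodes == nullptr;
    }
  }
}

// Producers round-robin new tasks across the lists, so a consumer that
// exchanges one head away takes only that list's backlog, not everything.
void Schedule(Node* node) {
  auto idx = task_idx_.fetch_add(1, std::memory_order_relaxed);
  LinkOne(node, &newest_node_[idx % kNumLinks]);
}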

Contributor Author

The second approach can be tested directly; for the first approach, see my new branch: https://github.com/QlQlqiqi/pika/tree/change-thread-shedule-with-mutil-list

Contributor Author

In my tests the two approaches are roughly equal in speed; with proper parameter tuning there should be a larger gap.

@AlexStocks
Collaborator

@CodiumAI-Agent /review

@CodiumAI-Agent

PR Review 🔍

⏱️ Estimated effort to review [1-5]

4, because the PR involves significant changes to the threading and scheduling logic in a thread pool implementation, which is a complex area requiring careful review to ensure thread safety, performance, and correctness. The changes include the introduction of lock-free structures and a detailed scheduling mechanism which needs thorough testing and validation.

🧪 Relevant tests

No

⚡ Possible issues

Possible Bug: The use of std::this_thread::yield() in the Schedule method may not be sufficient to handle high contention gracefully. This could lead to performance degradation under heavy load.

Thread Safety Concern: The lock-free implementation using std::atomic and manual memory management (e.g., new and delete for Node objects) must be carefully reviewed to ensure there are no race conditions or memory leaks.

🔒 Security concerns

No

Code feedback:
relevant file: src/net/src/thread_pool.cc
suggestion:

Consider using a condition variable instead of busy-waiting with std::this_thread::yield() in the Schedule method. This change could improve CPU usage and responsiveness under high load. [important]

relevant line: std::this_thread::yield();

relevant file: src/net/src/thread_pool.cc
suggestion:

Implement proper memory management for Node objects to prevent memory leaks. Consider using smart pointers (e.g., std::unique_ptr) instead of raw pointers for automatic memory management. [important]

relevant line: auto node = new Node(func, arg);

relevant file: src/net/src/thread_pool.cc
suggestion:

To prevent potential race conditions with the lock-free list, ensure that all operations on std::atomic pointers in LinkOne and related functions are correctly ordered using memory order tags. [important]

relevant line: if (newest_node->compare_exchange_weak(nodes, node)) {

relevant file: src/net/src/thread_pool.cc
suggestion:

Add error handling for the scenario where new Node fails to allocate memory, which could lead to a null pointer dereference. [medium]

relevant line: auto node = new Node(func, arg);

@AlexStocks
Collaborator

@CodiumAI-Agent /improve

@CodiumAI-Agent

PR Code Suggestions ✨

Possible bug
Add exception handling for memory allocation to prevent leaks

Ensure proper exception handling for dynamic memory allocation when creating new Node
objects to prevent memory leaks and ensure program stability.

src/net/src/thread_pool.cc [112-113]

-auto node = new Node(func, arg);
-LinkOne(node, &newest_node_);
+Node* node = nullptr;
+try {
+  node = new Node(func, arg);
+  LinkOne(node, &newest_node_);
+} catch (const std::bad_alloc& e) {
+  // Handle memory allocation failure, e.g., log error, clean up, and throw if necessary
+  delete node;
+  throw;
+}
 
Suggestion importance[1-10]: 9

Why: Adding exception handling for dynamic memory allocation is crucial to prevent memory leaks and ensure program stability. This is a significant improvement addressing a potential bug.

Best practice
Use smart pointers for automatic memory management

Replace the manual memory management with smart pointers to avoid manual deletion and
improve code safety and readability.

src/net/src/thread_pool.cc [112-113]

-auto node = new Node(func, arg);
-LinkOne(node, &newest_node_);
+auto node = std::make_unique<Node>(func, arg);
+LinkOne(node.get(), &newest_node_);
 
Suggestion importance[1-10]: 8

Why: Using smart pointers enhances code safety and readability by automating memory management, reducing the risk of memory leaks and manual deletion errors. This is a best practice for modern C++.

Enhancement
Use a thread-safe queue to simplify task scheduling

Consider using a thread-safe queue or priority queue that encapsulates the synchronization
logic internally, rather than manually managing the synchronization and node linking in
ThreadPool::Schedule.

src/net/src/thread_pool.cc [112-113]

-auto node = new Node(func, arg);
-LinkOne(node, &newest_node_);
+task_queue.push(std::make_unique<Node>(func, arg));
 
Suggestion importance[1-10]: 8

Why: Encapsulating synchronization logic within a thread-safe queue can simplify the code and reduce the potential for synchronization errors. This enhancement improves code maintainability and readability.

Performance
Replace busy waiting with a sleep mechanism to improve CPU efficiency

Consider using a more efficient locking mechanism or lock-free data structures for the
ThreadPool::Schedule method to avoid potential performance bottlenecks due to the use of
std::this_thread::yield() in a busy loop. This can lead to thread starvation and
inefficient CPU usage.

src/net/src/thread_pool.cc [103-109]

 while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
-  std::this_thread::yield();
+  std::this_thread::sleep_for(std::chrono::milliseconds(1));
 }
 if (node_cnt_.load(std::memory_order_relaxed) >= queue_slow_size_) {
-  std::this_thread::yield();
+  std::this_thread::sleep_for(std::chrono::milliseconds(1));
 }
 
Suggestion importance[1-10]: 7

Why: The suggestion to replace busy waiting with a sleep mechanism can improve CPU efficiency and reduce thread starvation. However, it may introduce latency in task scheduling, so the impact on performance should be carefully evaluated.



coderabbitai bot commented Jun 7, 2024

Walkthrough

The recent updates to the ThreadPool class involve a substantial overhaul aimed at enhancing task management, worker dynamics, and synchronization. Key modifications include the removal of the Task struct, significant changes to the Worker class, and the introduction of new methods and member variables in the ThreadPool class. These enhancements are designed to optimize task handling, improve resource allocation, and ensure more effective synchronization among threads.

Changes

  • conf/pika.conf: Enhanced configuration with new parameters for caching, thread management, and compaction strategies; increased thread and pool sizes.
  • src/net/include/thread_pool.h: Major restructuring of ThreadPool and Worker classes; removal of Task struct; new structs and method signatures added.
  • src/net/src/thread_pool.cc: Refactoring of WorkerMain, runInThread, and task scheduling methods to improve efficiency and control flow.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ThreadPool
    participant Worker
    Client->>ThreadPool: Schedule(task)
    ThreadPool->>Worker: Notify worker
    Worker-->>ThreadPool: Fetch task
    Worker->>Worker: Execute task
    Worker-->>ThreadPool: Task complete

Poem

In the code where threads entwine,
A pool revamped, now so divine.
Workers buzz with tasks anew,
Synchronized in perfect queue.
Tasks handled with care and might,
ThreadPool’s future, oh so bright! 🐇✨



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Outside diff range and nitpick comments (2)
tests/integration/start_codis.sh (1)

Line range hint 38-38: Add error handling after the cd command to ensure script robustness.

- cd ../codis
+ cd ../codis || exit
src/net/src/thread_pool.cc (1)

Line range hint 102-283: Replace busy waiting with more efficient waiting mechanisms and ensure robust memory management to prevent leaks and ensure thread safety.

- while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
-   std::this_thread::yield();
- }
+ std::unique_lock<std::mutex> lock(mu_);
+ rsignal_.wait(lock, [this] { return node_cnt_.load() < max_queue_size_; });

- auto node = new Node(func, arg);
+ auto node = std::make_unique<Node>(func, arg);
- LinkOne(node, &newest_node_);
+ LinkOne(node.get(), &newest_node_);
- node.release();  // Transfer ownership to the linked list
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5240e45 and 0d1b00f.

Files selected for processing (6)
  • src/net/include/likely.h (1 hunks)
  • src/net/include/net_define.h (1 hunks)
  • src/net/include/random.h (1 hunks)
  • src/net/include/thread_pool.h (3 hunks)
  • src/net/src/thread_pool.cc (5 hunks)
  • tests/integration/start_codis.sh (2 hunks)
Files skipped from review due to trivial changes (2)
  • src/net/include/likely.h
  • src/net/include/net_define.h
Additional context used
Shellcheck
tests/integration/start_codis.sh

[warning] 38-38: Use 'cd ... || exit' or 'cd ... || return' in case cd fails. (SC2164)

#rm -rf /tmp/codis
#rm -rf codis_data_1
#rm -rf codis_data_2
# sugguest clear these before starting codis

Correct the typo in the comment.

- # sugguest clear these before starting codis
+ # suggest clear these before starting codis

Comment on lines 72 to 115
private:
struct Node {
Node* link_older = nullptr;
Node* link_newer = nullptr;

// true if task is TimeTask
bool is_time_task;
TimeTask task;

Node(TaskFunc func, void* arg) : is_time_task(false), task(0, func, arg) {}
Node(uint64_t exec_time, TaskFunc func, void* arg) : is_time_task(true), task(exec_time, func, arg) {}

inline void Exec() { task.func(task.arg); }
inline Node* Next() { return link_newer; }
};

static inline void AsmVolatilePause() {
#if defined(__i386__) || defined(__x86_64__)
asm volatile("pause");
#elif defined(__aarch64__)
asm volatile("wfe");
#elif defined(__powerpc64__)
asm volatile("or 27,27,27");
#endif
// it's okay for other platforms to be no-ops
}

Node* CreateMissingNewerLinks(Node* head);
bool LinkOne(Node* node, std::atomic<Node*>* newest_node);

std::atomic<Node*> newest_node_;
std::atomic<int> node_cnt_; // for task
std::atomic<Node*> time_newest_node_;
std::atomic<int> time_node_cnt_; // for time task

const int queue_slow_size_; // default value: min(worker_num_ * 10, max_queue_size_)
size_t max_queue_size_;

const uint64_t max_yield_usec_;
const uint64_t slow_yield_usec_;

AdaptationContext adp_ctx;

size_t worker_num_;

Ensure proper memory management for Node objects and correct ordering of atomic operations to prevent race conditions.

- Node* link_older = nullptr;
- Node* link_newer = nullptr;
+ std::atomic<Node*> link_older{nullptr};
+ std::atomic<Node*> link_newer{nullptr};

- std::atomic<Node*> newest_node_;
- std::atomic<int> node_cnt_;
+ std::atomic<Node*> newest_node_{nullptr};
+ std::atomic<int> node_cnt_{0};

- if (newest_node->compare_exchange_weak(nodes, node)) {
+ if (newest_node->compare_exchange_weak(nodes, node, std::memory_order_release, std::memory_order_relaxed)) {

@QlQlqiqi
Contributor Author

QlQlqiqi commented Jun 18, 2024

These are all tests with one lock-free list per worker (described in the discussion above); code: https://github.com/QlQlqiqi/pika/tree/change-thread-shedule-with-mutil-list

binlog and cache disabled

redis-benchmark -h localhost -p 9221 -t get,set -n 5000000 -r 10000000000000 -d 512 -c 300 --threads 150

Tuned parameters:

kMaxSlowYieldsWhileSpinning, slow_yield_usec_, max_yield_usec_, command type

6 6 500 get
Summary:
  throughput summary: 121294.45 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.452     0.016     1.911     5.975    11.775   137.087

6 6 500 set
Summary:
  throughput summary: 22597.34 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       13.178     0.080     6.711    30.463    54.175  2711.551

3 3 100 get
Summary:
  throughput summary: 122249.38 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.424     0.016     1.743     6.375    14.431   151.551

3 3 100 set
Summary:
  throughput summary: 20026.92 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       14.806     0.064     5.903    29.727    52.831  3000.319

By tuning these parameters, get's p99 and set's QPS can be reduced somewhat, but get's QPS and set's p99 rise. The parameters depend on machine performance; my local test machine is mediocre, and raising the level-2 wait time improves the results moderately.

Alternatively, set performance can be traded for get performance.

If yield_credit is kept permanently positive (i.e., the level-2 wait is always entered), get's QPS rises and its p99 drops; comparison below:

Summary:
  throughput summary: 122249.38 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.424     0.016     1.743     6.375    14.431   151.551

Summary:
  throughput summary: 131887.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.256     0.024     1.879     5.055     8.855   159.615

Because the level-2 wait is always entered, the get test almost never triggers cv.wait, so time is saved. However, this makes the already slower set workload spend its time waiting for tasks instead of processing data, and because set executes slowly, cv.wait may still be triggered after the level-2 wait, which makes it slow.

@QlQlqiqi
Contributor Author

QlQlqiqi commented Jun 18, 2024

Call the current branch v1 and the one-lock-free-list-per-worker branch v2.

v1's QPS is lower than v2's, but so is its p99. The reason: gperftools shows that a large share of v2's time is spent in AsmVolatilePause, which means that with multiple lock-free lists waiting becomes less efficient, raising p99; with a single list, contention is too high and QPS drops too. So a middle ground is needed, e.g. 5 lock-free lists for 30 workers, which should lower p99 without lowering QPS.

In v1 all workers share one list; in v2 each worker has its own list.
Here are get tests of v1 and v2 with a worker num of 2:

v1:
Summary:
  throughput summary: 82895.37 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        3.580     0.024     2.815     9.487    17.535   309.503

v2:
Summary:
  throughput summary: 91702.74 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        3.245     0.016     2.887     7.719     9.167   203.263

@QlQlqiqi
Contributor Author

QlQlqiqi commented Jun 19, 2024

As discussed above, multiple lock-free lists weaken the success rate of the level-3 wait, so we can try mapping a small group of workers to each lock-free list:

get: 30 workers in total, one list per 2 workers
Summary:
  throughput summary: 133697.00 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.213     0.016     1.615     6.279    10.711   173.951


get: 30 workers in total, one list per 3 workers
Summary:
  throughput summary: 86218.79 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        3.450     0.024     2.071    11.311    16.927   152.191

As you can see, on my machine contention is best balanced at one lock-free list per 2 workers.
https://github.com/QlQlqiqi/pika/tree/change-thread-shedule-with-mutil-list-per-worker

@QlQlqiqi
Contributor Author

Test summary:

binlog and cache disabled, 30 workers
Test command: redis-benchmark -h localhost -p 9221 -t get,set -n 5000000 -r 10000000000000 -d 512 -c 300 --threads 150

  1. All workers share one list: contention is too high, and the CPU spends more time in the std::this_thread::yield() calls inside Schedule, because too many tasks are pending in the queue and the producer thread must yield; the resulting context switching adds overhead. And while the CPU is busy switching threads, tasks already taken off the list do not get executed, so waiting efficiency drops and the level-1 wait (AsmVolatilePause()) time is wasted.
  2. One list per nworkers_per_link_ workers: the value should be chosen by machine performance and workload. Too large, and waiting efficiency drops because contention pressure is too high; too small, and the probability of fetching tasks from any one list drops, which also lowers waiting efficiency. On my machine (AMD 7900X, Ubuntu 22.04.3, Debug build of pika, DDR5-6400 memory) a value of 2 is best.

As for the program's other parameters, such as kMaxSlowYieldsWhileSpinning, slow_yield_usec_, and max_yield_usec_, changing them also adjusts how long the waiting logic runs, so the waiting efficiency can be tuned to the machine and the workload.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Outside diff range and nitpick comments (1)
tests/integration/start_codis.sh (1)

Line range hint 38-38: Add error handling for the cd command.

The script should handle the possibility of the cd command failing, which would prevent errors in subsequent commands if the directory does not exist or is not accessible.

- cd ../codis
+ cd ../codis || exit
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0d1b00f and 9dab5c4.

Files selected for processing (6)
  • src/net/include/likely.h (1 hunks)
  • src/net/include/net_define.h (1 hunks)
  • src/net/include/random.h (1 hunks)
  • src/net/include/thread_pool.h (4 hunks)
  • src/net/src/thread_pool.cc (5 hunks)
  • tests/integration/start_codis.sh (2 hunks)
Files skipped from review due to trivial changes (2)
  • src/net/include/likely.h
  • src/net/include/net_define.h
Additional context used
Shellcheck
tests/integration/start_codis.sh

[warning] 38-38: Use 'cd ... || exit' or 'cd ... || return' in case cd fails. (SC2164)

Additional comments not posted (1)
tests/integration/start_codis.sh (1)

3-3: Correct the typo in the comment.

The word "sugguest" should be corrected to "suggest".

- # sugguest clear these before starting codis
+ # suggest clear these before starting codis

Likely invalid or redundant comment.

Comment on lines +73 to +81
static __thread Random* tls_instance;
static __thread std::aligned_storage<sizeof(Random)>::type tls_instance_bytes;

auto rv = tls_instance;
if (UNLIKELY(rv == nullptr)) {
size_t seed = std::hash<std::thread::id>()(std::this_thread::get_id());
rv = new (&tls_instance_bytes) Random((uint32_t)seed);
tls_instance = rv;
}

Use smart pointers for thread-local storage.

To manage memory more safely and efficiently, consider using smart pointers instead of raw pointers for the thread-local instance of the Random class.

- static __thread Random* tls_instance;
- static __thread std::aligned_storage<sizeof(Random)>::type tls_instance_bytes;
+ static thread_local std::unique_ptr<Random> tls_instance;

- auto rv = tls_instance;
- if (UNLIKELY(rv == nullptr)) {
-   size_t seed = std::hash<std::thread::id>()(std::this_thread::get_id());
-   rv = new (&tls_instance_bytes) Random((uint32_t)seed);
-   tls_instance = rv;
- }
+ if (!tls_instance) {
+   size_t seed = std::hash<std::thread::id>()(std::this_thread::get_id());
+   tls_instance = std::make_unique<Random>((uint32_t)seed);
+ }
+ return tls_instance.get();

@@ -0,0 +1,88 @@
// Copyright (c) 2024-present, Qihoo, Inc. All rights reserved.

Add a license declaration at the start of the file.

As suggested in previous comments, a license declaration should be consistently present at the start of each file.

+ // Copyright (c) 2024-present, OpenAtom Foundation. All rights reserved.
+ // This source code is licensed under the BSD-style license found in the
+ // LICENSE file in the root directory of this source tree. An additional grant
+ // of patent rights can be found in the PATENTS file in the same directory.

Comment on lines 82 to 110
struct Node {
Node* link_older = nullptr;
Node* link_newer = nullptr;

size_t worker_num_;
// true if task is TimeTask
bool is_time_task;
TimeTask task;

Node(TaskFunc func, void* arg) : is_time_task(false), task(0, func, arg) {}
Node(uint64_t exec_time, TaskFunc func, void* arg) : is_time_task(true), task(exec_time, func, arg) {}

inline void Exec() { task.func(task.arg); }
inline Node* Next() { return link_newer; }
};

static inline void AsmVolatilePause() {
#if defined(__i386__) || defined(__x86_64__)
asm volatile("pause");
#elif defined(__aarch64__)
asm volatile("wfe");
#elif defined(__powerpc64__)
asm volatile("or 27,27,27");
#endif
// it's okay for other platforms to be no-ops
}

Node* CreateMissingNewerLinks(Node* head, int* cnt);
bool LinkOne(Node* node, std::atomic<Node*>* newest_node);


Ensure proper memory management for Node objects and correct ordering of atomic operations.

To prevent memory leaks and race conditions, consider using smart pointers for Node objects and ensure correct memory order tags for atomic operations.

- Node* link_older = nullptr;
- Node* link_newer = nullptr;
+ std::atomic<Node*> link_older{nullptr};
+ std::atomic<Node*> link_newer{nullptr};

- std::atomic<Node*> newest_node_;
- std::atomic<int> node_cnt_;
+ std::atomic<Node*> newest_node_{nullptr};
+ std::atomic<int> node_cnt_{0};

- if (newest_node->compare_exchange_weak(nodes, node)) {
+ if (newest_node->compare_exchange_weak(nodes, node, std::memory_order_release, std::memory_order_relaxed)) {

Committable suggestion was skipped due to low confidence.

@QlQlqiqi
Contributor Author

QlQlqiqi commented Jun 27, 2024

A timer task can block the worker thread, because the worker always waits for a fixed duration; if new tasks arrive during that period, the worker keeps waiting anyway. So while waiting we check whether a new task has arrived; if so, the timer tasks already taken off the list are pushed back onto it and the normal task logic runs, so the worker thread is not blocked. (There is currently no limit on how many timer tasks will be re-pushed onto the list.)

          lock.lock();
          // if task is coming now, do task immediately
          auto res = rsignal.wait_for(lock, std::chrono::microseconds(exec_time - unow), [this, &newest_node]() {
            return newest_node.load(std::memory_order_relaxed) != nullptr || UNLIKELY(should_stop());
          });
          lock.unlock();
          if (res) {
            // re-push the timer tasks
            ReDelaySchedule(time_first);
            goto retry;
          }
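For context, ReDelaySchedule is presumably the inverse step: walking the timer nodes that were already detached from the list and linking them back so they are not lost. A hypothetical sketch, assuming the Node fields and LinkOne helper shown elsewhere in this PR:

// Hypothetical sketch: re-push detached timer nodes onto the lock-free time
// list so the worker can bail out and run newly arrived normal tasks first.
void ReDelaySchedule(Node* time_first, std::atomic<Node*>* time_newest_node) {
  while (time_first != nullptr) {
    Node* next = time_first->Next();   // save before relinking
    time_first->link_newer = nullptr;  // rebuilt later by CreateMissingNewerLinks
    LinkOne(time_first, time_newest_node);
    time_first = next;
  }
}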

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9dab5c4 and 8694da0.

Files selected for processing (2)
  • src/net/include/thread_pool.h (4 hunks)
  • src/net/src/thread_pool.cc (5 hunks)
Additional comments not posted (10)
src/net/include/thread_pool.h (7)

12-12: Consider sorting imports for better readability.

Imports like <vector> are typically grouped and sorted with other standard library includes. This can improve readability and maintainability of the code.


20-20: Modernize function pointer declaration.

Using using for type aliasing improves readability and is more in line with modern C++ standards.


34-38: Introduce a constructor for Arg struct.

The explicit constructor initializes arg and idx, ensuring that objects of this struct are always in a valid state when created.


51-52: Initialization of Worker class members.

The members idx_ and arg_ are correctly initialized in the constructor, ensuring that each worker has a unique index and associated argument.


72-73: Method to run tasks in a thread is well encapsulated.

The runInThread method allows for tasks to be executed in a specific thread based on the index, which is a good design for load balancing across workers.


75-79: Introduction of AdaptationContext struct.

This struct is used to manage adaptive thresholds for task execution, which is crucial for optimizing performance under different load conditions.


100-109: Platform-specific pause operations are correctly implemented.

The use of assembly instructions for pausing the CPU is appropriate for reducing CPU usage during spin-wait loops.

src/net/src/thread_pool.cc (3)

17-20: Correct implementation of WorkerMain.

The method correctly casts the argument and calls runInThread with the appropriate worker index, which is essential for correct task execution.


93-95: Proper cleanup in stop_thread_pool.

Signaling all condition variables before joining and deleting worker threads is a good practice to ensure clean shutdown.


77-78: Worker initialization in the thread pool.

Workers are dynamically allocated and started correctly. However, ensure that there is a corresponding deallocation to prevent memory leaks.

Verification successful

Worker initialization in the thread pool.

The Worker objects are properly deallocated, as evidenced by the delete worker statements found in the codebase.

  • src/net/src/thread_pool.cc: delete worker;
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify memory management for workers.

# Test: Search for worker deallocation.
rg --type py 'delete worker'

Length of output: 30



Script:

#!/bin/bash
# Description: Verify memory management for workers.

# Test: Search for worker deallocation in C++ files.
rg --type cpp 'delete worker'

Length of output: 227

int idx;
};

explicit Worker(ThreadPool* tp, int idx = 0) : start_(false), thread_pool_(tp), idx_(idx), arg_(tp, idx){};

Ensure thread safety during worker initialization.

The initialization of Worker objects includes setting start_ to false, which is correct. However, consider using std::atomic for start_ to ensure thread safety during checks and updates.

};

explicit ThreadPool(size_t worker_num, size_t max_queue_size, std::string thread_pool_name = "ThreadPool");
explicit ThreadPool(size_t worker_num, size_t max_queue_size, std::string thread_pool_name = "ThreadPool");

ThreadPool constructor lacks explicit member initializations.

The constructor should explicitly initialize all its members to ensure that all objects are in a predictable state. This is particularly important for synchronization primitives like mutexes and condition variables.

- explicit ThreadPool(size_t worker_num, size_t max_queue_size, std::string thread_pool_name = "ThreadPool");
+ explicit ThreadPool(size_t worker_num, size_t max_queue_size, std::string thread_pool_name = "ThreadPool") : worker_num_(worker_num), max_queue_size_(max_queue_size), thread_pool_name_(std::move(thread_pool_name)), running_(false), should_stop_(false) {}

Comment on lines +82 to +95
struct Node {
Node* link_older = nullptr;
Node* link_newer = nullptr;

size_t worker_num_;
// true if task is TimeTask
bool is_time_task;
TimeTask task;

Node(TaskFunc func, void* arg) : is_time_task(false), task(0, func, arg) {}
Node(uint64_t exec_time, TaskFunc func, void* arg) : is_time_task(true), task(exec_time, func, arg) {}

inline void Exec() { task.func(task.arg); }
inline Node* Next() { return link_newer; }
};

Memory management for Node objects needs attention.

The Node struct handles tasks but does not use smart pointers for managing dynamic memory, which can lead to memory leaks.

- Node* link_older = nullptr;
- Node* link_newer = nullptr;
+ std::unique_ptr<Node> link_older;
+ std::unique_ptr<Node> link_newer;

Comment on lines +111 to +112
Node* CreateMissingNewerLinks(Node* head, int* cnt);
bool LinkOne(Node* node, std::atomic<Node*>* newest_node);

Tip

Codebase Verification

Link management methods need thorough testing.

The methods CreateMissingNewerLinks and LinkOne are critical for maintaining the integrity of the task list. Ensure these methods are covered by unit tests to prevent issues in task scheduling.

  • No unit tests found for CreateMissingNewerLinks.
  • No unit tests found for LinkOne.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify unit tests for link management methods.

# Test: Search for tests related to `CreateMissingNewerLinks` and `LinkOne`.
rg --type py 'test_CreateMissingNewerLinks|test_LinkOne'

Length of output: 58



Script:

#!/bin/bash
# Description: Verify unit tests for link management methods.

# Test: Search for tests related to `CreateMissingNewerLinks` and `LinkOne` in all test files.
rg 'CreateMissingNewerLinks|LinkOne' --type-add 'test:*.{cpp,h,py}' --type test

Length of output: 257

Comment on lines 160 to 313

if (newest_node.load(std::memory_order_acquire) != nullptr) {
last = newest_node.exchange(nullptr);
if (last != nullptr) {
would_spin_again = true;
// success
break;
}
}
if (UNLIKELY(time_newest_node.load(std::memory_order_acquire) != nullptr)) {
time_last = time_newest_node.exchange(nullptr);
if (time_last != nullptr) {
would_spin_again = true;
// success
break;
}
}

auto now = std::chrono::steady_clock::now();
if (now == iter_begin || now - iter_begin >= std::chrono::microseconds(slow_yield_usec_)) {
++slow_yield_count;
if (slow_yield_count >= kMaxSlowYieldsWhileSpinning) {
update_ctx = true;
break;
}
}
iter_begin = now;
}
}

// update percentage of next loop 2
if (update_ctx) {
auto v = yield_credit.load(std::memory_order_relaxed);
v = v - (v / 1024) + (would_spin_again ? 1 : -1) * 131072;
yield_credit.store(v, std::memory_order_relaxed);
}

if (!would_spin_again) {
// 3. wait for new task
continue;
}
}

if (!queue_.empty()) {
auto [func, arg] = queue_.front();
queue_.pop();
wsignal_.notify_one();
lock.unlock();
(*func)(arg);
exec:
// do all normal tasks older than this task pointed last
if (LIKELY(last != nullptr)) {
int cnt = 1;
auto first = CreateMissingNewerLinks(last, &cnt);
// node_cnt_ -= cnt;
assert(!first->is_time_task);
do {
first->Exec();
tmp = first;
first = first->Next();
node_cnt_--;
delete tmp;
} while (first != nullptr);
}

// do all time tasks older than this task pointed time_last
if (UNLIKELY(time_last != nullptr)) {
int cnt = 1;
auto time_first = CreateMissingNewerLinks(time_last, &cnt);
do {
// time task may block normal task
auto now = std::chrono::system_clock::now();
uint64_t unow = std::chrono::duration_cast<std::chrono::microseconds>(now.time_since_epoch()).count();

auto [exec_time, func, arg] = time_first->task;
assert(time_first->is_time_task);
if (unow >= exec_time) {
time_first->Exec();
} else {
lock.lock();
// if task is coming now, do task immediately
auto res = rsignal.wait_for(lock, std::chrono::microseconds(exec_time - unow), [this, &newest_node]() {
return newest_node.load(std::memory_order_relaxed) != nullptr || UNLIKELY(should_stop());
});
lock.unlock();
if (res) {
// re-push the timer tasks
ReDelaySchedule(time_first);
goto retry;
}
time_first->Exec();
}
tmp = time_first;
time_first = time_first->Next();
time_node_cnt_--;
delete tmp;
} while (time_first != nullptr);
}
goto retry;
}

Complex task handling logic in runInThread.

The method handles both normal and time-based tasks with appropriate checks and synchronization. However, the complexity of this method could benefit from further comments or refactoring to improve readability.

Comment on lines 115 to 129
// stop until the size of tasks queue is not greater than max_queue_size_
while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
std::this_thread::yield();
}
// slow like above
if (node_cnt_.load(std::memory_order_relaxed) >= queue_slow_size_) {
std::this_thread::yield();
}

if (!should_stop()) {
queue_.emplace(func, arg);
rsignal_.notify_one();
if (LIKELY(!should_stop())) {
auto node = new Node(func, arg);
auto idx = ++task_idx_;
LinkOne(node, &newest_node_[idx % nlinks_]);
node_cnt_++;
rsignal_[idx % nlinks_].notify_one();

Task scheduling needs optimization to reduce busy-waiting.

The use of std::this_thread::yield() in a loop can lead to high CPU usage. Consider using condition variables or other synchronization mechanisms to reduce busy-waiting.

- while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
-     std::this_thread::yield();
+ std::unique_lock lock(mu_);
+ rsignal_.wait(lock, [this]{ return node_cnt_.load(std::memory_order_relaxed) < max_queue_size_; });

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 8694da0 and 055a479.

Files selected for processing (2)
  • src/net/include/thread_pool.h (4 hunks)
  • src/net/src/thread_pool.cc (5 hunks)
Additional comments not posted (20)
src/net/include/thread_pool.h (10)

34-38: Ensure correct initialization of Arg struct.

The Arg struct is correctly initialized with the provided arguments.


40-40: Ensure thread safety during worker initialization.

The initialization of Worker objects includes setting start_ to false, which is correct. However, consider using std::atomic for start_ to ensure thread safety during checks and updates.


55-55: ThreadPool constructor lacks explicit member initializations.

The constructor should explicitly initialize all its members to ensure that all objects are in a predictable state. This is particularly important for synchronization primitives like mutexes and condition variables.


72-72: Ensure proper memory management for Node objects and correct ordering of atomic operations.

To prevent memory leaks and race conditions, consider using smart pointers for Node objects and ensure correct memory order tags for atomic operations.


111-112: Link management methods need thorough testing.

The methods CreateMissingNewerLinks and LinkOne are critical for maintaining the integrity of the task list. Ensure these methods are covered by unit tests to prevent issues in task scheduling.


72-73: Complex task handling logic in runInThread.

The method handles both normal and time-based tasks with appropriate checks and synchronization. However, the complexity of this method could benefit from further comments or refactoring to improve readability.


97-99: LGTM!

The ReDelaySchedule method correctly handles re-scheduling of timer tasks.


100-109: LGTM!

The AsmVolatilePause method correctly uses architecture-specific instructions for pausing execution.


111-112: LGTM!

The CreateMissingNewerLinks method correctly handles link creation in the task list.


112-113: LGTM!

The LinkOne method correctly handles node linking in the task list.

src/net/src/thread_pool.cc (10)

17-20: LGTM!

The WorkerMain method correctly handles the worker's main execution loop with the updated argument.


26-26: Error handling in thread creation needs improvement.

When pthread_create fails, consider logging the error or throwing an exception to help with debugging and fault tolerance.


Line range hint 36-46: LGTM!

The stop method correctly handles stopping the worker thread.


48-70: ThreadPool constructor lacks explicit member initializations.

The constructor should explicitly initialize all its members to ensure that all objects are in a predictable state. This is particularly important for synchronization primitives like mutexes and condition variables.


Line range hint 76-85: LGTM!

The start_thread_pool method correctly handles starting the worker threads.


Line range hint 92-108: LGTM!

The stop_thread_pool method correctly handles stopping the worker threads and clearing the worker list.


115-129: Task scheduling needs optimization to reduce busy-waiting.

The use of std::this_thread::yield() in a loop can lead to high CPU usage. Consider using condition variables or other synchronization mechanisms to reduce busy-waiting.


140-146: LGTM!

The DelaySchedule method correctly handles scheduling delayed tasks in the thread pool.


160-313: Complex task handling logic in runInThread.

The method handles both normal and time-based tasks with appropriate checks and synchronization. However, the complexity of this method could benefit from further comments or refactoring to improve readability.


316-350: LGTM!

The ReDelaySchedule, CreateMissingNewerLinks, and LinkOne methods correctly handle their respective tasks.

@QlQlqiqi
Contributor Author

My test results are as follows; command: redis-benchmark -h localhost -p 9221 -t get,set -n 10000000 -r 10000000000000 -d 512 -c 32 --threads 16

binlog and cache disabled, worker num 30
queue_slow_size_ is: (std::max(10UL, std::min(worker_num * max_queue_size / 100, max_queue_size)))

unstable branch, nworkers_per_link_ = 1:
====== SET ======
Summary:
  throughput summary: 31475.25 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.008     0.080     0.895     1.815     2.991    42.175
====== GET ======
Summary:
  throughput summary: 70437.91 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.443     0.040     0.359     1.039     1.703    21.743

original pika branch:
====== SET ======
Summary:
  throughput summary: 29344.70 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.082     0.072     0.999     1.887     2.583    38.271
====== GET ======
Summary:
  throughput summary: 34520.49 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.918     0.040     0.711     2.575     3.831    72.255

Conclusions

  1. For set, both QPS and p99 improve.
  2. For get, QPS improves slightly, but p99 regresses.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 055a479 and 30e2a96.

Files selected for processing (2)
  • conf/pika.conf (1 hunks)
  • src/net/src/thread_pool.cc (5 hunks)
Additional comments not posted (16)
src/net/src/thread_pool.cc (7)

19-22: LGTM! The change enhances the function's flexibility.

The function now accepts a structured argument, improving control over task execution.


50-72: LGTM! The constructor refactoring enhances adaptability.

The changes introduce a more complex internal state, improving the pool's adaptability to varying workloads.


81-81: LGTM! The dynamic calculation of the maximum queue size is a good enhancement.

This change improves the pool's adaptability to varying workloads.


96-98: LGTM! The notification mechanism for stopping worker threads is a good enhancement.

This change ensures that all worker threads are correctly stopped.


145-150: LGTM! The notification mechanism for delayed tasks is a good enhancement.

This change ensures that worker threads are correctly notified of new tasks.


164-202: LGTM! The retry mechanism ensures all tasks are executed.

This change improves the thread pool's responsiveness and throughput.


205-215: LGTM! The notification mechanism for re-queued tasks is a good enhancement.

This change ensures that worker threads are correctly notified of re-queued tasks.

conf/pika.conf (9)

24-24: LGTM! Increasing the number of threads enhances concurrency.

This change allows for greater concurrency in handling requests.


28-28: LGTM! Increasing the thread pool size enhances concurrency.

This change indicates a substantial increase in the number of threads available for user requests.


567-567: Verify the decrease in cache max memory.

Decreasing the cache max memory may affect performance. Ensure that this change is intentional and based on performance testing.


718-718: LGTM! Changing the compaction strategy may improve performance.

This change may enhance performance during data writes and reads.


180-180: LGTM! Increasing the number of log files enhances log retention.

This change may improve log retention and analysis.


375-375: LGTM! Increasing the number of background jobs enhances concurrency.

This change allows for more concurrent background operations.


381-381: LGTM! Changing the number of background flushes may improve performance.

This change may enhance performance during data writes.


390-390: LGTM! Changing the number of background compactions may improve performance.

This change may enhance performance during data writes.


676-676: Verify the increase in write buffer size.

Increasing the write buffer size may improve write performance but could generate heavier IO load when flushing from buffer to disk. Ensure that this change is intentional and based on performance testing.

return nullptr;
}

int ThreadPool::Worker::start() {
  if (!start_.load()) {
-   if (pthread_create(&thread_id_, nullptr, &WorkerMain, thread_pool_) != 0) {
+   if (pthread_create(&thread_id_, nullptr, &WorkerMain, &arg_) != 0) {

Consider improving error handling in thread creation.

When pthread_create fails, consider logging the error or throwing an exception to help with debugging and fault tolerance.

- if (pthread_create(&thread_id_, nullptr, &WorkerMain, &arg_) != 0) {
+ if (int err = pthread_create(&thread_id_, nullptr, &WorkerMain, &arg_); err != 0) {
+     throw std::runtime_error(std::string("Failed to create thread: ") + std::strerror(err));  // needs <cstring>, <stdexcept>, <string>
  }
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- if (pthread_create(&thread_id_, nullptr, &WorkerMain, &arg_) != 0) {
+ if (int err = pthread_create(&thread_id_, nullptr, &WorkerMain, &arg_); err != 0) {
+   throw std::runtime_error(std::string("Failed to create thread: ") + std::strerror(err));
  }

Comment on lines +118 to +127
  node_cnt_++;
  // stop until the size of tasks queue is not greater than max_queue_size_
  while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
    std::this_thread::yield();
    // pstd::SleepForMicroseconds(1);
  }
  // slow like above
  if (node_cnt_.load(std::memory_order_relaxed) >= queue_slow_size_) {
    std::this_thread::yield();
  }

Consider using condition variables to reduce busy-waiting.

The use of std::this_thread::yield() in a loop can lead to high CPU usage. Consider using condition variables or other synchronization mechanisms to reduce busy-waiting.

- while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
-   std::this_thread::yield();
-   // pstd::SleepForMicroseconds(1);
- }
+ std::unique_lock lock(mu_);
+ rsignal_.wait(lock, [this]{ return node_cnt_.load(std::memory_order_relaxed) < max_queue_size_; });
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  node_cnt_++;
  // stop until the size of tasks queue is not greater than max_queue_size_
- while (node_cnt_.load(std::memory_order_relaxed) >= max_queue_size_) {
-   std::this_thread::yield();
-   // pstd::SleepForMicroseconds(1);
- }
+ std::unique_lock lock(mu_);
+ rsignal_.wait(lock, [this]{ return node_cnt_.load(std::memory_order_relaxed) < max_queue_size_; });
  // slow like above
  if (node_cnt_.load(std::memory_order_relaxed) >= queue_slow_size_) {
    std::this_thread::yield();
  }

@luky116 luky116 removed the 3.5.5 label Aug 7, 2024
@QlQlqiqi
Contributor Author

QlQlqiqi commented Sep 6, 2024

OS: Ubuntu 22.04.3
CPU: 7900X
memory: 16G * 2
The following conf was used:

# 360 dba pika conf pika3.5.2
port : 9221
thread-num : 8
log-path : ./log/
loglevel : info
db-path : ./db/
write-buffer-size : 256M
timeout : 30
#requirepass : 06154eee364854d5
#masterauth : 06154eee364854d5
#userpass : 06154eee364854d5360
#userblacklist : bgsave,dumpoff,client
dump-prefix : pika-
dump-expire : 1
pidfile : .pika.pid
daemonize : yes
dump-path : ./dump/block_cache_size
maxclients : 20000
target-file-size-base : 20971520
expire-logs-days : 7
expire-logs-nums : 300
root-connection-num : 10
slowlog-log-slower-than : 100000
binlog-file-size : 104857600
compression : snappy
db-sync-path : ./dbsync
db-sync-speed : 60
slowlog-write-errorlog : yes
small-compaction-threshold : 5000
max-write-buffer-size : 20737418240
max-cache-files : 8000
replication-num : 0
consensus-level : 0
max-cache-statistic-keys : 0
thread-pool-size : 50
slowlog-write-errorlog : yes
default-slot-num : 1024
instance-mode : classic
databases : 1
sync-thread-num : 1
arena-block-size : 33554432
max-background-jobs : 12
max-background-flushes : 3
max-background-compactions : 9
rate-limiter-bandwidth : 1099511627776
db-instance-num : 1
block-size : 4096
#block-cache : 5368709120
block-cache : 4294967296
max-subcompactions : 8

#cache-maxmemory : 5368709120
cache-lfu-decay-time : 1
cache-maxmemory-samples : 5
cache-maxmemory-policy : 1
cache-num : 8
cache-model : 0
zset-cache-field-num-per-key : 512
zset-cache-start-direction : 0
cache-type :

share-block-cache : yes
throttle-bytes-per-second : 102400000
max-rsync-parallel-num : 4
write-binlog : no
slotmigrate : no

# Pika automatic compact compact strategy, a complement to rocksdb compact.
# Trigger the compact background task periodically according to `compact-interval`
# Can choose `full-compact` or `obd-compact`.
# obd-compact https://github.com/OpenAtomFoundation/pika/issues/2255
compaction-strategy : obd-compact

# For OBD_Compact 
# According to the number of sst files in rocksdb, 
# compact every `compact-every-num-of-files` file.
compact-every-num-of-files : 50

# For OBD_Compact 
# In another search, if the file creation time is 
# greater than `force-compact-file-age-seconds`, 
# a compaction of the upper and lower boundaries 
# of the file will be performed at the same time 
# `compact-every-num-of-files` -1
force-compact-file-age-seconds : 300

# For OBD_Compact 
# According to the number of sst files in rocksdb, 
# compact every `compact-every-num-of-files` file.
force-compact-min-delete-ratio : 10

# For OBD_Compact 
# According to the number of sst files in rocksdb, 
# compact every `compact-every-num-of-files` file.
dont-compact-sst-created-in-seconds : 20

# For OBD_Compact 
# According to the number of sst files in rocksdb, 
# compact every `compact-every-num-of-files` file.
best-delete-min-ratio : 40

The test command and results are as follows:

redis-benchmark -h localhost -p 9221 -t set,get -n 10000000 -r 10000000000000 -d 512 -c 64 --threads 32

Current branch
====== SET ======
Summary:
  throughput summary: 41356.49 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.536     0.088     1.423     2.639     3.655    47.583
====== GET ======
Summary:
  throughput summary: 77099.22 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.819     0.040     0.751     1.519     2.359    39.071

pika unstable branch
====== SET ======
Summary:
  throughput summary: 35311.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.799     0.064     1.671     3.191     4.223    41.215
====== GET ======
Summary:
  throughput summary: 46570.11 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.359     0.032     1.199     2.975     4.007    38.079

@baixin01
Collaborator

@cheniujh review

@QlQlqiqi
Contributor Author

The P99 issue of this PR is resolved in conjunction with #2816.

@Issues-translate-bot

Bot detected the issue body's language is not English, translate it automatically.


This pr's p99 issue is resolved in conjunction with #2816.

Labels
4.0.1 ✏️ Feature New feature or request
7 participants