coprocessor request concurrency for partitioned table #55190

Closed
tiancaiamao opened this issue Aug 5, 2024 · 3 comments
Labels
type/bug The issue is confirmed as a bug.

Comments

@tiancaiamao
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table pt (id int primary key auto_increment, val int) partition by range (id)
(PARTITION p1 VALUES LESS THAN (100),
PARTITION p2 VALUES LESS THAN (200),
PARTITION p3 VALUES LESS THAN (300),
PARTITION p4 VALUES LESS THAN (400),
PARTITION p5 VALUES LESS THAN (500),
PARTITION p6 VALUES LESS THAN (600),
PARTITION p7 VALUES LESS THAN (700));

insert into pt (val) values (123),(456),(789),(1112);
insert into pt (val) select (val) from pt;
insert into pt (val) select (val) from pt;
insert into pt (val) select (val) from pt;
insert into pt (val) select (val) from pt;
insert into pt (val) select (val) from pt;
insert into pt (val) select (val) from pt;
split table pt between (0) and (40960) regions 30;
analyze table pt;

2. What did you expect to see? (Required)

explain analyze select * from pt order by id limit 100;

mysql> explain analyze select * from pt order by id limit 100;
+----------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+------+
| id                         | estRows | actRows | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                                                                                 | operator info       | memory  | disk |
+----------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+------+
| Limit_10                   | 100.00  | 100     | root      |               | time:45.7ms, loops:2, RU:88.670214                                                                                                                                                                                                                                                                                                                                                                                             | offset:0, count:100 | N/A     | N/A  |
| └─TableReader_17           | 100.00  | 100     | root      | partition:all | time:45.6ms, loops:1, cop_task: {num: 181, max: 788.3µs, min: 181.4µs, avg: 281.9µs, p95: 586.8µs, max_proc_keys: 100, p95_proc_keys: 0, tot_proc: 6.17ms, tot_wait: 9.63ms, copr_cache_hit_ratio: 0.00, build_task_duration: 225.8µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:181, total_time:49.2ms}}                                                                                                            | data:Limit_16       | 11.4 KB | N/A  |
|   └─Limit_16               | 100.00  | 256     | cop[tikv] |               | tikv_task:{proc max:4ms, min:0s, avg: 22.1µs, p80:0s, p95:0s, iters:186, tasks:181}, scan_detail: {total_process_keys: 256, total_process_keys_size: 9664, total_keys: 437, get_snapshot_time: 7.08ms, rocksdb: {delete_skipped_count: 192, key_skipped_count: 448, block: {cache_hit_count: 362}}}, time_detail: {total_process_time: 6.17ms, total_wait_time: 9.63ms, total_kv_read_wall_time: 4ms, tikv_wall_time: 26.3ms}  | offset:0, count:100 | N/A     | N/A  |
|     └─TableFullScan_15     | 100.00  | 256     | cop[tikv] | table:pt      | tikv_task:{proc max:4ms, min:0s, avg: 22.1µs, p80:0s, p95:0s, iters:186, tasks:181}                                                                                                                                                                                                                                                                                                                                            | keep order:true     | N/A     | N/A  |
+----------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+------+
4 rows in set (0.05 sec)

I expect to see max_distsql_concurrency: 7

3. What did you see instead (Required)

max_distsql_concurrency: 1

4. What is your TiDB version? (Required)

master 52b4c8a

@tiancaiamao added the type/bug label Aug 5, 2024
@tiancaiamao
Contributor Author

Queries with and without 'order by' take different code branches.
explain analyze select * from pt limit 100; works as expected:
it uses max_distsql_concurrency: 7

When building the 'order by' query, we call buildKVReqSeparately to handle the key ranges one by one.
Each request therefore covers a small range that involves only one region, so the concurrency is mistakenly set to 1.

goroutine 1983 [running]:
runtime/debug.Stack()
        /home/genius/project/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /home/genius/project/go/src/runtime/debug/stack.go:16 +0x13
github.com/pingcap/tidb/pkg/distsql.(*RequestBuilder).SetDAGRequest(0xc00a9c21a0, 0xc00aa2e680?)
        /home/genius/project/src/github.com/pingcap/tidb/pkg/distsql/request_builder.go:171 +0x126
github.com/pingcap/tidb/pkg/executor.(*TableReaderExecutor).buildKVReqSeparately(0xc00a68b680, {0x6c9cdc8, 0xc00aa961e0}, {0xc00a520ff8?, 0xc00aa2aaa0?, 0x9?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:437 +0x392
github.com/pingcap/tidb/pkg/executor.(*TableReaderExecutor).buildResp(0xc00a68b680, {0x6c9cdc8, 0xc00aa961e0}, {0xc00a520ff8?, 0xa11c1e0?, 0x1e5cbfa?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:389 +0x15d
github.com/pingcap/tidb/pkg/executor.(*TableReaderExecutor).Open(0xc00a68b680, {0x6c9cdc8?, 0xc00aa961e0?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:297 +0x597
github.com/pingcap/tidb/pkg/executor/internal/exec.Open({0x6c9cdc8?, 0xc00aa961e0?}, {0x6ccc310?, 0xc00a68b680?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:428 +0x72
github.com/pingcap/tidb/pkg/executor/internal/exec.(*BaseExecutorV2).Open(0x0?, {0x6c9cdc8, 0xc00aa961e0})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:300 +0x6d
github.com/pingcap/tidb/pkg/executor.(*LimitExec).Open(0xc008e15a40, {0x6c9cdc8, 0xc00aa961e0})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/executor.go:1395 +0x2f
github.com/pingcap/tidb/pkg/executor/internal/exec.Open({0x6c9cdc8?, 0xc00aa961e0?}, {0x6cce950?, 0xc008e15a40?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:428 +0x72
github.com/pingcap/tidb/pkg/executor.(*ExplainExec).Open(0xc003ad37ea?, {0x6c9cdc8?, 0xc00aa961e0?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/explain.go:57 +0x46
github.com/pingcap/tidb/pkg/executor/internal/exec.Open({0x6c9cdc8?, 0xc00aa961e0?}, {0x6ccce50?, 0xc00aa74b40?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:428 +0x72
github.com/pingcap/tidb/pkg/executor.(*ExecStmt).openExecutor(0xc00a3a9ef0, {0x6c9cdc8, 0xc00aa961e0}, {0x6ccce50, 0xc00aa74b40})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/adapter.go:1243 +0xad
github.com/pingcap/tidb/pkg/executor.(*ExecStmt).Exec(0xc00a3a9ef0, {0x6c9cdc8, 0xc00aa961e0})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/executor/adapter.go:580 +0xa0f
github.com/pingcap/tidb/pkg/session.runStmt({0x6c9cdc8?, 0xc00a5eb6e0?}, 0xc0037f1180, {0x6cac3e0, 0xc00a3a9ef0?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/session/session.go:2288 +0x31e
github.com/pingcap/tidb/pkg/session.(*session).ExecuteStmt(0xc0037f1180, {0x6c9cdc8?, 0xc00a5eb6e0?}, {0x6cb4a00, 0xc00a97a0e0?})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/session/session.go:2149 +0x110a
github.com/pingcap/tidb/pkg/server.(*TiDBContext).ExecuteStmt(0xc001ea9dd0, {0x6c9cdc8, 0xc00a5eb6e0}, {0x6cb4a00?, 0xc00a97a0e0})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/driver_tidb.go:291 +0xa7
github.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt(0xc002bb7ba0, {0x6c9ce00, 0xc00aa2a280}, {0x6cb4a00?, 0xc00a97a0e0}, {0x0, 0x0, 0x0}, 0x1)
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/conn.go:2047 +0x2d2
github.com/pingcap/tidb/pkg/server.(*clientConn).handleQuery(0xc002bb7ba0, {0x6c9ce00, 0xc00aa2a280}, {0xc00a611181, 0x36})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/conn.go:1801 +0xb5e
github.com/pingcap/tidb/pkg/server.(*clientConn).dispatch(0xc002bb7ba0, {0x6c9cdc8, 0xc004029620}, {0xc00a611180, 0x37, 0x37})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/conn.go:1375 +0xf2b
github.com/pingcap/tidb/pkg/server.(*clientConn).Run(0xc002bb7ba0, {0x6c9cdc8, 0xc004029620})
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/conn.go:1141 +0x545
github.com/pingcap/tidb/pkg/server.(*Server).onConn(0xc003812800?, 0xc002bb7ba0)
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/server.go:741 +0x83d
created by github.com/pingcap/tidb/pkg/server.(*Server).startNetworkListener in goroutine 728
        /home/genius/project/src/github.com/pingcap/tidb/pkg/server/server.go:560 +0x659
set concurrency in builder SetDAGRequest === 1 partition Num == 1
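
To make the explanation above concrete, here is a minimal toy model in Go. It is not TiDB's code: the `request` type and the functions `concurrencyFor`, `buildTogether`, `buildSeparately` and `maxConcurrency` are hypothetical names, and the rule "one worker per partition covered by the request" is only an assumption suggested by the `partition Num == 1` debug line. The actual rule inside SetDAGRequest may differ.

```go
package main

import "fmt"

// request is a hypothetical stand-in for one coprocessor request.
// It only records which partitions its key ranges cover.
type request struct {
	partitions []string
}

// concurrencyFor is an ASSUMED heuristic, not TiDB's implementation:
// a small-limit request gets one worker per partition it covers.
func concurrencyFor(r request) int {
	return len(r.partitions)
}

// buildTogether puts every partition's key ranges into a single request.
func buildTogether(parts []string) []request {
	return []request{{partitions: parts}}
}

// buildSeparately builds one request per partition, mirroring what the
// issue describes for buildKVReqSeparately.
func buildSeparately(parts []string) []request {
	reqs := make([]request, 0, len(parts))
	for _, p := range parts {
		reqs = append(reqs, request{partitions: []string{p}})
	}
	return reqs
}

// maxConcurrency returns the largest per-request concurrency.
func maxConcurrency(reqs []request) int {
	m := 0
	for _, r := range reqs {
		if c := concurrencyFor(r); c > m {
			m = c
		}
	}
	return m
}

func main() {
	parts := []string{"p1", "p2", "p3", "p4", "p5", "p6", "p7"}
	fmt.Println("together:  ", maxConcurrency(buildTogether(parts)))   // 7
	fmt.Println("separately:", maxConcurrency(buildSeparately(parts))) // 1
}
```

Under this assumption, any per-request heuristic sees only one partition at a time on the separate path, so the reported per-request value can never exceed 1, no matter how many partitions the table has.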

@tiancaiamao
Contributor Author

PTAL @Defined2014

@Defined2014
Contributor

Defined2014 commented Aug 5, 2024

We use multiple selectResults when keepOrder=true. The max_distsql_concurrency within one selectResult will be 1, but the different selectResults run in parallel with each other. This carries a risk of OOM when a table has many partitions, but we will keep it this way for now.

The trace info shows that the regionRequest.SendReqCtx calls run in parallel.

mysql> trace select * from pt order by id;
+----------------------------------------------+-----------------+------------+
| operation                                    | startTS         | duration   |
+----------------------------------------------+-----------------+------------+
| trace                                        | 18:42:22.506963 | 1.9785ms   |
|   ├─session.ExecuteStmt                      | 18:42:22.506971 | 1.144875ms |
|   │ ├─executor.Compile                       | 18:42:22.506991 | 624.958µs  |
|   │ │ ├─planner.Preprocess                   | 18:42:22.506994 | 8.166µs    |
|   │ │ └─planner.Optimize                     | 18:42:22.507251 | 345.417µs  |
|   │ └─session.runStmt                        | 18:42:22.507627 | 450.583µs  |
|   │   └─TableReaderExecutor.Open             | 18:42:22.507810 | 259.5µs    |
|   │     ├─distsql.Select                     | 18:42:22.507864 | 47.291µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.507870 | 11.791µs   |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508013 | 650.875µs  |
|   │     ├─distsql.Select                     | 18:42:22.507919 | 28.458µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.507924 | 3.625µs    |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508052 | 544.167µs  |
|   │     ├─distsql.Select                     | 18:42:22.507955 | 16.417µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.507958 | 5.292µs    |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508103 | 501.334µs  |
|   │     ├─distsql.Select                     | 18:42:22.507980 | 13.334µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.507982 | 3.042µs    |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508546 | 168.25µs   |
|   │     ├─distsql.Select                     | 18:42:22.507998 | 13.667µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.508001 | 2.916µs    |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508239 | 521.584µs  |
|   │     ├─distsql.Select                     | 18:42:22.508018 | 14.834µs   |
|   │     │ ├─copr.buildCopTasks               | 18:42:22.508021 | 3.125µs    |
|   │     │ └─regionRequest.SendReqCtx         | 18:42:22.508180 | 417.333µs  |
|   │     └─distsql.Select                     | 18:42:22.508039 | 20.75µs    |
|   │       ├─copr.buildCopTasks               | 18:42:22.508048 | 3.125µs    |
|   │       └─regionRequest.SendReqCtx         | 18:42:22.508188 | 450.625µs  |
|   ├─*executor.TableReaderExecutor.Next       | 18:42:22.508127 | 735.5µs    |
|   └─*executor.TableReaderExecutor.Next       | 18:42:22.508882 | 10.5µs     |
+----------------------------------------------+-----------------+------------+
30 rows in set (0.01 sec)
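
The behaviour described in this comment can be sketched with a small Go program. It is only an illustration under stated assumptions, not the executor's real code: `readPartitions` is a hypothetical function, each goroutine stands in for one selectResult with concurrency 1, and the `release` channel is a modelling device standing in for requests that are still outstanding.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// readPartitions starts one reader goroutine per partition. Each reader
// handles its own tasks one at a time (per-reader concurrency 1), yet all
// readers are in flight together. It returns how many readers were in
// flight at once and how many tasks were processed in total.
func readPartitions(partitions, tasksPerPartition int) (inFlight int32, processed int64) {
	var active int32
	var total int64
	var started, finished sync.WaitGroup
	release := make(chan struct{})

	started.Add(partitions)
	finished.Add(partitions)
	for p := 0; p < partitions; p++ {
		go func() {
			defer finished.Done()
			atomic.AddInt32(&active, 1)
			started.Done()
			// Stand-in for an outstanding request: block until every
			// reader has started.
			<-release
			for t := 0; t < tasksPerPartition; t++ {
				atomic.AddInt64(&total, 1) // sequential within this reader
			}
		}()
	}

	started.Wait() // every reader has been counted by now
	inFlight = atomic.LoadInt32(&active)
	close(release)
	finished.Wait()
	return inFlight, atomic.LoadInt64(&total)
}

func main() {
	inFlight, processed := readPartitions(7, 3)
	fmt.Println("readers in flight at once:", inFlight) // 7
	fmt.Println("tasks processed:", processed)          // 21
}
```

In this model the number of readers in flight grows with the partition count even though each reader reports a concurrency of 1, which is also why a table with many partitions raises the memory concern mentioned above.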

@Defined2014 closed this as not planned Aug 5, 2024