-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize the Performance of Bulk Deletion #18028
Comments
The original scenario should be
|
At first, we can test how about the performance when enabling coprocessor cache. |
#18308 may be a common solution for this issue. We can also enable cop cache |
@SunRunAway PTAL. If we rely on coprocessor cache to speed up this scenario, we should make coprocessor cache generally available at least in 5.0.0. BTW, we can also brainstorm whether there are other methods to speed up this scenario. |
We need to construct a bulk delete scenario to reproduce the performance issue. |
Another solution is to separate all the data into "pages", calculate the page boundaries, delete the data "page" by "page": create a table with create table t(a bigint, b bigint) shard_row_id_bits = 4;
insert into t values(1, 1);
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t; separate all the data into "pages", calculate the page bounds, in the following example, each "page" contains set @batch_size = 5;
select
min(tmp._tidb_rowid) as start_rowid,
max(tmp._tidb_rowid) as end_rowid,
count(*) as page_size
from (
select
*,
_tidb_rowid as _tidb_rowid,
row_number () over (order by t._tidb_rowid) as row_num
from t
) tmp
group by floor((tmp.row_num - 1) / @batch_size)
order by start_rowid; the query result of the above SQL is:
delete them page by pages: delete from t where _tidb_rowid >= 5 and _tidb_rowid <= 576460752303423505;
delete from t where _tidb_rowid >= 576460752303423506 and _tidb_rowid <= 576460752303423510;
delete from t where _tidb_rowid >= 576460752303423511 and _tidb_rowid <= 576460752303423515;
delete from t where _tidb_rowid >= 576460752303423516 and _tidb_rowid <= 576460752303423520;
delete from t where _tidb_rowid >= 1152921504606846979 and _tidb_rowid <= 2882303761517117475;
delete from t where _tidb_rowid >= 2882303761517117476 and _tidb_rowid <= 2882303761517117480;
delete from t where _tidb_rowid >= 2882303761517117481 and _tidb_rowid <= 2882303761517117485;
delete from t where _tidb_rowid >= 2882303761517117486 and _tidb_rowid <= 2882303761517117490;
delete from t where _tidb_rowid >= 2882303761517117491 and _tidb_rowid <= 2882303761517117495;
delete from t where _tidb_rowid >= 2882303761517117496 and _tidb_rowid <= 2882303761517117500;
delete from t where _tidb_rowid >= 2882303761517117501 and _tidb_rowid <= 2882303761517117505;
delete from t where _tidb_rowid >= 2882303761517117506 and _tidb_rowid <= 2882303761517117510;
delete from t where _tidb_rowid >= 2882303761517117511 and _tidb_rowid <= 2882303761517117515;
delete from t where _tidb_rowid >= 2882303761517117516 and _tidb_rowid <= 2882303761517117520;
delete from t where _tidb_rowid >= 2882303761517117521 and _tidb_rowid <= 2882303761517117525;
delete from t where _tidb_rowid >= 2882303761517117526 and _tidb_rowid <= 2882303761517117530;
delete from t where _tidb_rowid >= 2882303761517117531 and _tidb_rowid <= 2882303761517117535;
delete from t where _tidb_rowid >= 2882303761517117536 and _tidb_rowid <= 2882303761517117540;
delete from t where _tidb_rowid >= 2882303761517117541 and _tidb_rowid <= 2882303761517117545;
delete from t where _tidb_rowid >= 2882303761517117546 and _tidb_rowid <= 2882303761517117550;
delete from t where _tidb_rowid >= 2882303761517117551 and _tidb_rowid <= 2882303761517117555;
delete from t where _tidb_rowid >= 2882303761517117556 and _tidb_rowid <= 2882303761517117560;
delete from t where _tidb_rowid >= 2882303761517117561 and _tidb_rowid <= 2882303761517117565;
delete from t where _tidb_rowid >= 2882303761517117566 and _tidb_rowid <= 4611686018427387913;
delete from t where _tidb_rowid >= 4611686018427387914 and _tidb_rowid <= 4611686018427387918;
delete from t where _tidb_rowid >= 4611686018427387919 and _tidb_rowid <= 5764607523034235011;
delete from t where _tidb_rowid >= 5764607523034235012 and _tidb_rowid <= 5764607523034235016;
delete from t where _tidb_rowid >= 5764607523034235017 and _tidb_rowid <= 5764607523034235021;
delete from t where _tidb_rowid >= 5764607523034235022 and _tidb_rowid <= 5764607523034235026;
delete from t where _tidb_rowid >= 5764607523034235027 and _tidb_rowid <= 5764607523034235031;
delete from t where _tidb_rowid >= 5764607523034235032 and _tidb_rowid <= 5764607523034235036;
delete from t where _tidb_rowid >= 5764607523034235037 and _tidb_rowid <= 5764607523034235041;
delete from t where _tidb_rowid >= 5764607523034235042 and _tidb_rowid <= 5764607523034235046;
delete from t where _tidb_rowid >= 5764607523034235047 and _tidb_rowid <= 5764607523034235051;
delete from t where _tidb_rowid >= 5764607523034235052 and _tidb_rowid <= 5764607523034235056;
delete from t where _tidb_rowid >= 5764607523034235057 and _tidb_rowid <= 5764607523034235061;
delete from t where _tidb_rowid >= 5764607523034235062 and _tidb_rowid <= 5764607523034235066;
delete from t where _tidb_rowid >= 5764607523034235067 and _tidb_rowid <= 5764607523034235071;
delete from t where _tidb_rowid >= 5764607523034235072 and _tidb_rowid <= 5764607523034235076;
delete from t where _tidb_rowid >= 5764607523034235077 and _tidb_rowid <= 5764607523034235081;
delete from t where _tidb_rowid >= 5764607523034235082 and _tidb_rowid <= 5764607523034235086;
delete from t where _tidb_rowid >= 5764607523034235087 and _tidb_rowid <= 5764607523034235091;
delete from t where _tidb_rowid >= 5764607523034235092 and _tidb_rowid <= 5764607523034235096;
delete from t where _tidb_rowid >= 5764607523034235097 and _tidb_rowid <= 5764607523034235101;
delete from t where _tidb_rowid >= 5764607523034235102 and _tidb_rowid <= 5764607523034235106;
delete from t where _tidb_rowid >= 5764607523034235107 and _tidb_rowid <= 5764607523034235111;
delete from t where _tidb_rowid >= 5764607523034235112 and _tidb_rowid <= 5764607523034235116;
delete from t where _tidb_rowid >= 5764607523034235117 and _tidb_rowid <= 5764607523034235121;
delete from t where _tidb_rowid >= 5764607523034235122 and _tidb_rowid <= 5764607523034235126;
delete from t where _tidb_rowid >= 5764607523034235127 and _tidb_rowid <= 5764607523034235131;
delete from t where _tidb_rowid >= 5764607523034235132 and _tidb_rowid <= 5764607523034235136;
delete from t where _tidb_rowid >= 7493989779944505346 and _tidb_rowid <= 7493989779944505346; with the filters on TiDB(root@127.0.0.1:test) > explain delete from t where _tidb_rowid >= 5 and _tidb_rowid <= 576460752303423505;
+--------------------------+---------+-----------+---------------+--------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------+---------+-----------+---------------+--------------------------------------------------------------+
| Delete_4 | N/A | root | | N/A |
| └─TableReader_7 | 256.00 | root | | data:TableRangeScan_6 |
| └─TableRangeScan_6 | 256.00 | cop[tikv] | table:t | range:[5,576460752303423505], keep order:false, stats:pseudo |
+--------------------------+---------+-----------+---------------+--------------------------------------------------------------+
3 rows in set (0.00 sec) We can also delete the "pages" parallel if it's required to execute faster. |
This comment has been minimized.
This comment has been minimized.
I have test continuous delete with coprocessor cache or without coprocessor cache. The environment is:
The script is: #!/usr/bin/env python
#encoding=utf-8
import mysql.connector
conn = mysql.connector.connect(host='172.16.5.85', port=8092, user='root', password='', database='tpch_10')
for i in range(2000):
cursor = conn.cursor()
cursor.execute("delete from lineitem limit 10000")
print(cursor.rowcount, i)
conn.commit()
cursor.close()
print("finished") By monitor of Grafanaz, the result is: When I turn on the coprocessor cache. P99 is about 250ms, p999 is about 500ms and the jitter is very stable. When turn off the coprocessor cache. P99 is about 1s, p999 is about 2s. By the way, the script got an error |
For now, we could turn on the copr-cache to solve the problem by temp. If we need some improvement in the future. I will open the issue again. |
@lzmhhh123 could you sumarize the time to delete all the data in the table with or without coprocessor cache? |
Could you also summarize the following questions:
|
@lzmhhh123 could you sumarize the time to delete all the data in the table with or without coprocessor cache?
|
Could you add more tests to help users know more about the deletion speed on different data sizes and the batch size for each delete transaction?
Finally, we can add a new best practice document about the bulk deletion.
It's better to add some metrics to observe the exact number. BTW, Do we have metrics about the memory consumption of coprocessor cache?
It's important for the users to know more about what scenario can benefit from coprocessor cache and what scenarios can not. Could you summarize these scenarios with some examples and experiments? |
@zz-jason Metrics is added, PR is under review. For best practice, I will do more experiments and publish an essay for it. |
Description
Optimize the performance and efficiency of bulk deletion, a typical query is:
Category
Feature, Stability
Value
Implement a special SQL using offset in session to improve performance
delete * from table where t < xxx (backgroud & no-automic)
.Time
GanttStart: 2020-07-31
GanttDue: 2020-08-15
The text was updated successfully, but these errors were encountered: