-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize comparision for collation UTF8_BIN
and UTF8MB4_BIN
#5299
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/run-all-tests |
/run-all-tests |
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
Can you also add some benchmark result for TiDB cluster with new collation disabled? |
// Remove tail space | ||
__attribute__((flatten, always_inline, pure)) inline std::string_view RightTrim(const std::string_view & v) | ||
{ | ||
size_t end = v.find_last_not_of(' '); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if other space characters need to be handled here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to MySQL::Doc::string-comparison-functions.
- In particular, trailing spaces are always significant. This differs from comparisons performed with the = operator, for which the significance of trailing spaces in nonbinary strings (CHAR, VARCHAR, and TEXT values) depends on the pad attribute of the collation used for the comparison. For more information, see Trailing Space Handling in Comparisons.
mysql> SET NAMES utf8mb4 COLLATE utf8mb4_bin;
mysql> SELECT 'a ' = 'a';
+------------+
| 'a ' = 'a' |
+------------+
| 1 |
+------------+
mysql> SET NAMES utf8mb4 COLLATE utf8mb4_0900_bin;
mysql> SELECT 'a ' = 'a';
+------------+
| 'a ' = 'a' |
+------------+
| 0 |
+------------+
/run-all-tests |
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: ruoxi <[email protected]>
Co-authored-by: ruoxi <[email protected]>
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
/merge |
@solotzg: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests You only need to trigger If you have any questions about the PR merge process, please refer to pr process. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
This pull request has been accepted and is ready to merge. Commit hash: 6a1f4dd
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/unhold |
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
@solotzg: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you: /run-all-tests If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
Coverage for changed files
Coverage summary
full coverage report (for internal network access only) |
…s in README (pingcap#5182) close pingcap#5172, ref pingcap#5178 Enhancement: add a integrated test on DDL module (pingcap#5130) ref pingcap#5129 Revert "Revise default background threads size" (pingcap#5176) close pingcap#5177 chore: remove extra dyn cast (pingcap#5186) close pingcap#5185 Add MPPReceiverSet, which includes ExchangeReceiver and CoprocessorReader (pingcap#5175) ref pingcap#5095 DDL: Use Column Name Instead of Offset to Find the common handle cluster index (pingcap#5166) close pingcap#5154 Add random failpoint in critical paths (pingcap#4876) close pingcap#4807 Segment test framework (pingcap#5150) close pingcap#5151 optimize ps v3 restore (pingcap#5163) ref pingcap#4914 Fix build failed (pingcap#5196) close pingcap#5195 feat: delta tree dispatching (pingcap#5199) close pingcap#5200 feat: introduce specialized API to write fixed length data rapidly (pingcap#5181) close pingcap#5183 Add gtest for Limit, TopN, Projection (pingcap#5187) (pingcap#5188) close pingcap#5187 add `MPPTask::handleError()` (pingcap#5202) ref pingcap#5095 Check result of starting grpc server (pingcap#5257) close pingcap#5255 feat: add optimized routines for aarch64 (pingcap#5231) close pingcap#5240 fix: aarch64-quick-fix (pingcap#5259) close pingcap#5260 Update client-c to support ipv6 (pingcap#5270) close pingcap#5247 upgrade prometheus-cpp to v1.0.1 (pingcap#5279) ref pingcap#2103, close pingcap#5278 Fix README type error (pingcap#5273) ref pingcap#5178 fix(cmake): make sure libc++ is utilized by tiflash-proxy (pingcap#5281) close pingcap#5282 fix the wrong order of execution summary for list based executors (pingcap#5242) close pingcap#5241 Schema: allow loading empty schema diff when the version grows up. (pingcap#5245) close pingcap#5244 Optimize apply speed under heavy write pressure (pingcap#4883) ref pingcap#4728 update proxy to raftstore-proxy-6.2 (pingcap#5287) ref pingcap#4982 Flush segment cache when doing the compaction (pingcap#5284) close pingcap#5179 metrics: Fix incorrect metrics for delta_merge tasks (pingcap#5061) close pingcap#5055 dep: upgrade jemalloc (pingcap#5197) close pingcap#5258 *: TiFlash pagectl/dttool use only-decryption mode (pingcap#5271) close pingcap#5122 suppresion false positive report from tsan (pingcap#5303) close pingcap#5088 Refine test framework code and tests (pingcap#5261) close pingcap#5262 feat: add logical cpu cores and memory into grafana (pingcap#5124) close pingcap#3821 Implement TimeToSec function push down (pingcap#5235) close pingcap#5116 feat: implement shiftRight function push down (pingcap#5156) close pingcap#5100 schema : make update to partition tables when 'set tiflash replica' (pingcap#5267) close pingcap#5266 Replace initializer_list with vector for planner test framework (pingcap#5307) close pingcap#5295 KVStore: decouple flush region and CompactLog with a new FFI fn_try_flush_data (pingcap#5283) ref pingcap#5170 refine error message in mpptask (pingcap#5304) ref pingcap#5095 Implement ReverseUTF8/Reverse function push down (pingcap#5233) close pingcap#5111 Optimize comparision for collation `UTF8_BIN` and `UTF8MB4_BIN` (pingcap#5299) ref pingcap#5294 feat : support set tiflash mode ddl action (pingcap#5256) ref pingcap#5252 Add non-blocking functions for MPMCQueue (pingcap#5311) close pingcap#5310 add random segment test for CI weekly (pingcap#5300) close pingcap#5301 *: tidy FunctionString.cpp (pingcap#5312) close pingcap#5313 ci: fix check-license github action (pingcap#5318) close pingcap#5317 update proxy to raftstore-proxy-6.2 (pingcap#5316) ref pingcap#4982 Change one `additional_input_at_end` to many streams in `ParallelInputsProcessor` (pingcap#5274) close pingcap#4856, close pingcap#5263 support fine grained shuffle for window function (pingcap#5048) close pingcap#5142 feat: pushdown get_format into TiFlash (pingcap#5269) close pingcap#5115 fix: format throw data truncated error (pingcap#5272) close pingcap#4891 Print content of columns for gtest (pingcap#5243) close pingcap#5203 *: also enable O3 for aarch64 (pingcap#5338) close pingcap#5342 Add debug image build target for CentOS7 (pingcap#5344) close pingcap#5343 *: mini refactor (pingcap#5326) close pingcap#4739 Refactor initialize of background pool (pingcap#5190) close pingcap#5189 delete copy/move ctor of MPMCQueue explicitly (pingcap#5328) close pingcap#5329 Introduce proxy_server and new-mock-engine-store (pingcap#5319) ref pingcap#5170 fix: incorrect uptime in grafana panel Signed-off-by: Lloyd-Pottiger <[email protected]>
…N` (pingcap#5299)" This reverts commit 97342db.
What problem does this PR solve?
Issue Number: ref #5294
Problem Summary:
What is changed and how it works?
UTF8_BIN
andUTF8MB4_BIN
Benchmark
Env:
SQL
SQL
SQL
Analyze
When only checking equality,
tiflash/libs/libcommon/include/common/mem_utils.h
Line 138 in eaf1a4c
std::memcmp
if most parts of source and target string are equal.Check List
Tests
Side effects
Documentation
Release note