charset: support collation utf8mb4_unicode_ci and utf8_unicode_ci #17596

Closed
bb7133 opened this issue Jun 2, 2020 · 0 comments
Labels: component/charset, feature/accepted, priority/P0, type/feature-request

Comments

bb7133 commented Jun 2, 2020

Feature Request

Is your feature request related to a problem? Please describe:
Currently, TiDB doesn't support utf8mb4_unicode_ci and utf8_unicode_ci when new collation is enabled.

tidb> set names utf8 collate utf8_unicode_ci;
ERROR 1273 (HY000): Unsupported collation when new collation is enabled: 'utf8_unicode_ci'.

unicode_ci is a widely used collation in MySQL, so it would be better if TiDB supported it.

Describe the feature you'd like:
Support collation utf8mb4_unicode_ci and utf8_unicode_ci when new collation is enabled.
Besides implementing the algorithm for unicode_ci, we need to think through how to incorporate it into the current new collation framework.
For example, what should happen for concat(general_ci_str, unicode_ci_str)? How does constant propagation work with unicode_ci?
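
To illustrate why mixing the two collations needs care, here is a minimal Go sketch. It is not TiDB's implementation: `generalCIKey`, `unicodeCIKey`, and `equalKeys` are hypothetical helpers, and the "weights" are just upper-cased runes rather than real collation weight tables. It only models one documented MySQL difference: general_ci compares 'ß' as equal to 's', while unicode_ci expands 'ß' so that it compares equal to 'ss'.

```go
package main

import (
	"fmt"
	"unicode"
)

// generalCIKey builds a toy sort key in the style of general_ci:
// exactly one weight per character, with case folded away.
// Assumption: the weight is simply the upper-cased rune.
func generalCIKey(s string) []rune {
	key := make([]rune, 0, len(s))
	for _, r := range s {
		if r == 'ß' {
			r = 's' // general_ci treats 'ß' like a single 's'
		}
		key = append(key, unicode.ToUpper(r))
	}
	return key
}

// unicodeCIKey builds a toy sort key in the style of unicode_ci:
// one character may expand to several weights.
func unicodeCIKey(s string) []rune {
	key := make([]rune, 0, len(s))
	for _, r := range s {
		if r == 'ß' {
			key = append(key, 'S', 'S') // expansion: 'ß' compares like "ss"
			continue
		}
		key = append(key, unicode.ToUpper(r))
	}
	return key
}

// equalKeys reports whether two sort keys are identical.
func equalKeys(a, b []rune) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// The same pair of strings is equal under one collation and not the other.
	fmt.Println(equalKeys(generalCIKey("straße"), generalCIKey("strase")))  // true
	fmt.Println(equalKeys(generalCIKey("straße"), generalCIKey("strasse"))) // false
	fmt.Println(equalKeys(unicodeCIKey("straße"), unicodeCIKey("strase")))  // false
	fmt.Println(equalKeys(unicodeCIKey("straße"), unicodeCIKey("strasse"))) // true
}
```

Because equality itself depends on the collation, an expression that combines a general_ci string with a unicode_ci string has to settle on one collation for its result, and an optimization such as constant propagation can only substitute a constant for a column when the comparison collation guarantees the two are interchangeable.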

Mentor(s)

Contact the mentors: #ddl-sig channel in TiDB Community Slack Workspace

Recommended Skills

  • Golang
  • Rust

Learning Materials

Schedule

  • GanttStart: 2020-07-01
  • GanttDue: 2020-10-15
  • GanttProgress: 100%
bb7133 added the type/feature-request and component/charset labels Jun 2, 2020
bb7133 added this to the 5.0-alpha milestone Jun 2, 2020
scsldb added the priority/P0 label Jun 12, 2020
scsldb modified the milestones: v5.0.0-alpha, v5.0.0-beta Jul 15, 2020
scsldb added the feature/accepted label Jul 16, 2020
wjhuang2016 removed their assignment Dec 10, 2020
scsldb modified the milestones: v5.0.0-beta, v5.0.0-rc Dec 14, 2020
jebter modified the milestones: v5.0.0-rc, v5.0.0 Jan 18, 2021
ti-chi-bot pushed a commit to tikv/tikv that referenced this issue Jan 28, 2021
cherry-pick #8420 to release-4.0
You can switch your code base to this Pull Request by using [git-extras](https://github.com/tj/git-extras):
```bash
# In tikv repo:
git pr #9577
```

After applying the modifications, you can push your changes to this PR via:
```bash
git push [email protected]:ti-srebot/tikv.git pr/9577:release-4.0-f456abae9e5c
```

---

Signed-off-by: jwxiong <[email protected]>

### What problem does this PR solve?

Problem Summary:

Support the utf8mb4_unicode_ci collation (pingcap/tidb#17596).

### What is changed and how it works?

Add utf8mb4_unicode_ci support in TiKV.
For details, please see pingcap/tidb#18776.

### Check List

Tests

- Unit test
- Integration test
- Manual test (add detailed scripts or steps below)


### Release note
- Add the utf8mb4_unicode_ci implementation
gengliqi pushed a commit to gengliqi/tikv that referenced this issue Feb 20, 2021
cherry-pick tikv#8420 to release-4.0