Improved Cross-Region deployment features #18273

Closed
6 tasks done
shenli opened this issue Jun 29, 2020 · 2 comments
Labels
  • feature/accepted: This feature request is accepted by product managers
  • type/feature-request: Categorizes issue or PR as related to a new feature.

Comments

@shenli
Member

shenli commented Jun 29, 2020

Description

Deploy a cluster across multiple geographic regions, and schedule data to the AZ/region nearest the workload that uses it, as Google Spanner does.

We need to reduce query latency by reducing cross-region data access. Applications deployed in a given region may read and write only the data related to that region, so the basic idea for better performance is to place that data near the application (data locality). Placement Rules is a powerful tool provided by PD. We can use it to place all (or at least a majority) of the replicas of a table or partition in a specific region. This can reduce the cross-region network round trips for both read and write operations.
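As a rough illustration of the data-locality idea, the sketch below builds rule dictionaries shaped like PD placement rules that pin the voters of one key range to a single region, and checks that a majority of voters is local. The `region` label key, the `geo-demo` group name, the key values, and the helpers `region_rules` and `majority_is_local` are assumptions made for this example; a real cluster would use whatever labels its TiKV stores carry.

```python
def region_rules(rule_id, start_key_hex, end_key_hex, home_region, voters=3):
    """Return PD-style placement rules that keep all voters of a key range in one region.

    Hypothetical helper: the "region" label key and the group id are assumptions.
    """
    return [
        {
            "group_id": "geo-demo",      # assumed rule group name
            "id": rule_id,
            "start_key": start_key_hex,  # hex-encoded start of the key range
            "end_key": end_key_hex,      # hex-encoded end of the key range
            "role": "voter",
            "count": voters,
            "label_constraints": [
                {"key": "region", "op": "in", "values": [home_region]}
            ],
        }
    ]


def majority_is_local(rules, home_region):
    """Check that more than half of the voters are constrained to home_region."""
    total = sum(r["count"] for r in rules if r["role"] == "voter")
    local = sum(
        r["count"]
        for r in rules
        if r["role"] == "voter"
        and any(
            c["key"] == "region" and c["op"] == "in" and c["values"] == [home_region]
            for c in r["label_constraints"]
        )
    )
    return local * 2 > total
```

With three voters in the home region, `majority_is_local(region_rules("t1_p0", "7480", "7481", "us-east"), "us-east")` holds, so both the Raft leader and a write quorum can stay inside that region.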

Besides data access, we should also consider transaction timestamp allocation (TSO). Currently, timestamps are allocated only by the PD leader, so a transaction that runs in a different region from the PD leader still pays one cross-region round trip to get its timestamp. We need to find a way to allocate transaction timestamps from PD followers, which would remove that cross-region round trip for TSO allocation.
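To make the TSO point concrete, here is a minimal sketch of a per-region timestamp allocator. It assumes the usual TSO layout of physical milliseconds shifted left by 18 bits plus a logical counter; the class `LocalTSOAllocator`, the injected clock, and the round-trip table are hypothetical and only illustrate the idea, not how PD implements it.

```python
LOGICAL_BITS = 18  # assumed layout: (physical ms << 18) | logical counter


class LocalTSOAllocator:
    """Toy per-region timestamp allocator (hypothetical, for illustration only)."""

    def __init__(self, region, clock_ms):
        self.region = region
        self._clock_ms = clock_ms  # callable returning the current time in ms
        self._last_physical = 0
        self._logical = 0

    def next_ts(self):
        # Never let the physical part go backwards, even if the clock does.
        physical = max(self._clock_ms(), self._last_physical)
        if physical == self._last_physical:
            self._logical += 1  # logical overflow is ignored in this sketch
        else:
            self._last_physical = physical
            self._logical = 0
        return (physical << LOGICAL_BITS) | self._logical


def tso_round_trip_ms(client_region, allocator_region, rtt_ms):
    """Look up the cost of one timestamp request in an assumed RTT table."""
    return rtt_ms[(client_region, allocator_region)]
```

Timestamps from one such allocator are strictly increasing within its region only. Ordering transactions that span regions would still need some coordination between allocators, which this sketch does not attempt.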

Task List

Note: Each of the following big tasks is composed of many small tasks. Please click on the task URL for more detail on each big task.

Technical Design Docs

Progress Tracking

  • Define the placement of data by SQL statements ("Geo Partition"): https://github.com/pingcap/tidb/projects/49
  • Weekly Report: Cross-Region Deployment & GEO Partition, which contains the progress of the whole project:
    • Config Placement Rules by SQL
    • Support Regional and Global TSO Allocator in Cross-Region Deployment
    • Support Regional and Global Transaction in Cross-Region Deployment
    • Read Local Replica to Reduce Latency in Cross-Region Deployment

Join Us

@shenli shenli added the type/feature-request Categorizes issue or PR as related to a new feature. label Jun 29, 2020
@shenli shenli added this to the v5.0-alpha.1 milestone Jun 29, 2020
@zhangjinpeng87 zhangjinpeng87 added the priority/P0 The issue has P0 priority. label Jun 30, 2020
@scsldb scsldb modified the milestones: v5.0-alpha.1, v5.0-alpha Jun 30, 2020
@zz-jason zz-jason changed the title from "Support cross region deployment and geo-partition" to "Support Cross-Region Deployment & Geo-Partition" Jul 11, 2020
@zz-jason
Member

Some questions about Geo-Partition:

  • We planned to support geo-partitioning on partitioned tables, but at present TiDB supports only hash and range partitioning. Is that enough for applications to map data into different partitions, and then map specific partitions to different Regions/Availability Zones (AZs)?

  • Applications may need to join a geo-partitioned table with a normal table, or may just need to query a non-partitioned table. In these scenarios, we may need to find a way to replicate the normal table and place one replica in each Region/AZ to reduce query latency.
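One way to picture both questions is the small sketch below. It assumes a table range-partitioned on an integer `region_code` column, with each partition assigned a home region, plus a normal table that has one replica in every region. The bounds, partition names, region names, and the helpers `partition_for` and `read_location` are all made up for illustration.

```python
from bisect import bisect_right

# Assumed range partitioning: partition i holds values below UPPER_BOUNDS[i]
# and at or above the previous bound.
UPPER_BOUNDS = [100, 200, 300]
PARTITION_NAMES = ["p_us", "p_eu", "p_asia"]
PARTITION_HOME = {"p_us": "us-east", "p_eu": "eu-west", "p_asia": "ap-south"}


def partition_for(region_code):
    """Map a partition-column value to its range partition."""
    idx = bisect_right(UPPER_BOUNDS, region_code)
    if idx >= len(PARTITION_NAMES):
        raise ValueError(f"no partition holds region_code={region_code}")
    return PARTITION_NAMES[idx]


def read_location(table_kind, client_region, region_code=None):
    """Pick the region that would serve a read under the assumed layout."""
    if table_kind == "geo_partitioned":
        return PARTITION_HOME[partition_for(region_code)]
    if table_kind == "replicated":
        # A normal table with one replica per region can be read locally.
        return client_region
    raise ValueError(f"unknown table kind: {table_kind}")
```

Under these assumptions, a join between the two tables stays inside one region as long as the geo-partitioned rows it touches live in the client's own region.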

@zz-jason
Member

To improve query performance within a Region or AZ, we should improve the partition pruning algorithm so that it prunes as many unrelated partitions as possible. IMO #17474 and #18016 are also important for cross-region workloads.
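For readers unfamiliar with pruning, a minimal sketch for range partitions follows. It is a generic illustration, not TiDB's pruning code and not a description of the linked issues; the function name and the example bounds are assumptions.

```python
from bisect import bisect_right


def prune_range_partitions(upper_bounds, lo, hi):
    """Return indexes of range partitions that can hold values in [lo, hi].

    upper_bounds[i] is the exclusive upper bound of partition i, sorted ascending.
    """
    if lo > hi or not upper_bounds:
        return []
    first = bisect_right(upper_bounds, lo)  # partition that would hold lo
    last = bisect_right(upper_bounds, hi)   # partition that would hold hi
    last = min(last, len(upper_bounds) - 1)  # clamp when hi is past the last bound
    return list(range(first, last + 1))
```

With bounds `[100, 200, 300]`, a predicate such as `region_code BETWEEN 150 AND 160` touches only partition 1, so a query issued in that partition's home region never has to contact the other regions.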

@scsldb scsldb modified the milestones: v5.0.0-alpha, v5.0.0-beta.1 Jul 15, 2020
@scsldb scsldb added the feature/accepted This feature request is accepted by product managers label Jul 16, 2020
@zz-jason zz-jason added priority/critical-urgent and removed priority/P0 The issue has P0 priority. labels Oct 31, 2020
@scsldb scsldb modified the milestones: v5.0.0-beta.1, v5.0.0 Dec 20, 2020
@scsldb scsldb changed the title from "Support Cross-Region Deployment & Geo-Partition" to "Improved Cross-Region deployment features" Jan 11, 2021
@nolouch nolouch closed this as completed Dec 20, 2022