
Improve Schema Engine's TablesWithSize80 query #17066

Open · wants to merge 7 commits into base: main

Conversation

@mattlord (Contributor) commented Oct 25, 2024

Description

With only the query change, the manual test in #17059 passes. This impacts the OnlineDDL cutover because:

  1. We have thousands of tables.
  2. This made the information_schema query that the tabletserver's Schema Engine uses to maintain its cache of table schemas very slow.
  3. The DDL is executed via the ApplySchema RPC, which tells the Schema Engine to reload its cache at the end. That reload runs the slow query from step 2 while holding the cache mutex.
  4. OnlineDDL creates a sentry table (possibly triggering another reload) just before waiting for VReplication to catch up.
  5. VReplication tries to process the next GTID and, while handling the TABLE_MAP event, asks the Schema Engine cache for the new sentry table's columns in order to build and cache the table plan; it blocks on the mutex held in step 3.
  6. VReplication therefore cannot get the table's columns to map the row event to them before the 10-second (by default) timeout that OnlineDDL uses when waiting for VReplication to catch up.

Because the modified query is roughly 260x faster than the original, with the original taking about 13,000ms and the new one about 50ms (see below), we are able to update the schema cache and read the current table schema for VReplication within the short default timeout that OnlineDDL uses when waiting for VReplication to reach the position where the primary was.

Using the manual test from the issue again, we can compare the query times:

command mysql -u root --socket ${VTDATAROOT}/vt_0000000$(vtctldclient GetTablets --keyspace commerce --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)/mysql.sock vt_commerce 

mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment, SUM(i.file_size), SUM(i.allocated_size) FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON i.name LIKE CONCAT(t.table_schema, '/', t.table_name, IF(t.create_options <=> 'partitioned', '#p#%', '')) COLLATE utf8mb3_general_ci WHERE t.table_schema = database() GROUP BY t.table_schema, t.table_name, t.table_type, t.create_time, t.table_comment;
...
3004 rows in set (13.43 sec)

mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment, SUM(i.file_size), SUM(i.allocated_size) FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON SUBSTRING_INDEX(i.name, '#p#p', 1) = CONCAT(t.table_schema, '/', t.table_name) COLLATE utf8mb3_general_ci WHERE t.table_schema = database() GROUP BY t.table_schema, t.table_name, t.table_type, t.create_time, t.table_comment;
...
3004 rows in set (0.05 sec)

And it still works properly with partitioned tables:

mysql> CREATE TABLE trb3 (id INT, name VARCHAR(50), purchased DATE)
    ->     PARTITION BY RANGE( YEAR(purchased) ) (
    ->         PARTITION p0 VALUES LESS THAN (1990),
    ->         PARTITION p1 VALUES LESS THAN (1995),
    ->         PARTITION p2 VALUES LESS THAN (2000),
    ->         PARTITION p3 VALUES LESS THAN (2005)
    ->     );
Query OK, 0 rows affected (0.02 sec)

mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment, SUM(i.file_size), SUM(i.allocated_size) FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON SUBSTRING_INDEX(i.name, '#p#p', 1) = CONCAT(t.table_schema, '/', t.table_name) COLLATE utf8mb3_general_ci WHERE t.table_schema
= database() GROUP BY t.table_schema, t.table_name, t.table_type, t.create_time, t.table_comment having t.table_name = "trb3" or t.table_name = "test1";
+------------+------------+-------------------------------+---------------+------------------+-----------------------+
| TABLE_NAME | TABLE_TYPE | UNIX_TIMESTAMP(t.create_time) | TABLE_COMMENT | SUM(i.file_size) | SUM(i.allocated_size) |
+------------+------------+-------------------------------+---------------+------------------+-----------------------+
| test1      | BASE TABLE |                    1729832253 |               |           114688 |                114688 |
| trb3       | BASE TABLE |                    1729832315 |               |           458752 |                458752 |
+------------+------------+-------------------------------+---------------+------------------+-----------------------+
2 rows in set (0.05 sec)
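The rewritten join still matches partitioned tables because `SUBSTRING_INDEX(i.name, '#p#p', 1)` strips everything from the first `#p#p` marker onward, collapsing each partition's tablespace name back to the `schema/table` key; for an unpartitioned table the marker is absent and the full name is returned unchanged. A small Go sketch of that string logic (the function name and sample tablespace names are illustrative, mirroring the `trb3` example above):

```go
package main

import (
	"fmt"
	"strings"
)

// tablespaceKey mirrors SUBSTRING_INDEX(i.name, '#p#p', 1): everything
// before the first '#p#p' marker, or the whole name when no marker exists.
func tablespaceKey(name string) string {
	if i := strings.Index(name, "#p#p"); i >= 0 {
		return name[:i]
	}
	return name
}

func main() {
	fmt.Println(tablespaceKey("vt_commerce/test1"))     // plain table: unchanged
	fmt.Println(tablespaceKey("vt_commerce/trb3#p#p0")) // partition p0
	fmt.Println(tablespaceKey("vt_commerce/trb3#p#p3")) // partition p3
}
```

All of `trb3`'s partitions normalize to the same key, so the LEFT JOIN matches each partition row and the `SUM(i.file_size)`/`SUM(i.allocated_size)` aggregates cover the whole table, as the 458752-byte totals above show.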

With only the timeout-related changes, the manual test in #17059 ALSO passes: we're able to get past the schema reload work and reach the desired position within the larger timeout (the query blocking us takes ~13 seconds).

So we have two fixes for the underlying issue that not only solve the OnlineDDL-specific symptoms but also make Vitess much more efficient overall when there are thousands of tables.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

This avoids the table scan on the <temporary> stage in the middle of the query
execution shown here:
EXPLAIN: -> Filter: (`TABLE_NAME` like 'trb3%')  (actual time=1030..1030 rows=1 loops=1)
    -> Table scan on <temporary>  (actual time=1030..1030 rows=3005 loops=1)
        -> Aggregate using temporary table  (actual time=1030..1030 rows=3005 loops=1)
...

Signed-off-by: Matt Lord <[email protected]>

vitess-bot bot commented Oct 25, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes); new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsBackportReason If backport labels have been applied to a PR, a justification is required NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Oct 25, 2024
@github-actions github-actions bot added this to the v22.0.0 milestone Oct 25, 2024

codecov bot commented Oct 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 8 lines in your changes missing coverage. Please review.

Project coverage is 67.16%. Comparing base (be0bca3) to head (bed84fa).

Files with missing lines Patch % Lines
go/vt/vttablet/onlineddl/executor.go 0.00% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17066      +/-   ##
==========================================
- Coverage   67.16%   67.16%   -0.01%     
==========================================
  Files        1571     1571              
  Lines      252093   252097       +4     
==========================================
+ Hits       169314   169315       +1     
- Misses      82779    82782       +3     


@mattlord mattlord added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: VReplication Component: Query Serving Component: Schema Tracker Component: Online DDL Online DDL (vitess/native/gh-ost/pt-osc) and removed NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request NeedsBackportReason If backport labels have been applied to a PR, a justification is required labels Oct 25, 2024
@mattlord mattlord changed the title Improve Schema engine's TablesWithSize80 query Improve Schema Engine's TablesWithSize80 query Oct 25, 2024
@mattlord mattlord marked this pull request as ready for review October 25, 2024 03:16
@mattlord mattlord removed the request for review from systay October 25, 2024 05:16
Development

Successfully merging this pull request may close these issues.

Bug Report: OnlineDDL migrations fail when keyspace has thousands of tables