Invalidate query compilation cache entries with outdated VIEWs #1960

Merged
merged 3 commits into from
Feb 18, 2024

jepett0
Collaborator

@jepett0 jepett0 commented Feb 15, 2024

KIKIMR-21002

YDB caches query compilation results on the server side for efficiency. (For small queries, compilation can take up to 100 times longer than execution.) We look up the cache entry by the text of the query, so one could miss the cache just by adding a comment to the query. (Not anymore: we recently added a cache keyed by AST. Compilation takes much longer than building the AST, so this makes sense.) Compilation result caching could theoretically cause problems even for a basic query such as:

SELECT * FROM some_table_whose_name_stays_the_same_while_the_content_changes;

We could change the definition of this table in a separate session like this:

DROP TABLE some_table_...;
CREATE TABLE some_table_... ( /* different content */ );

and expect the SELECT from this table to produce wrong results: the query cache would not notice the change in the table definition, because the change is not apparent in the text of the query. However, special mitigation mechanisms are implemented for tables; they were missing for views until this PR.
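
To illustrate the failure mode described above, here is a minimal toy sketch of a compilation cache keyed only by query text. It is not YDB code: `catalog`, `compile_cache`, `compile_query`, and `get_plan` are hypothetical names, and "compilation" is reduced to expanding `*` into a column list.

```python
# Hypothetical stand-ins; none of these names come from YDB.
catalog = {"t": ["id", "name"]}  # table name -> current column list
compile_cache = {}               # query text -> compiled "plan"


def compile_query(text: str) -> tuple:
    # Toy "compilation": resolve "SELECT * FROM <table>;" into a fixed column list.
    table = text.rstrip(";").split()[-1]
    return tuple(catalog[table])


def get_plan(text: str) -> tuple:
    # The cache key is the query text only, so schema changes are invisible here.
    if text not in compile_cache:
        compile_cache[text] = compile_query(text)
    return compile_cache[text]


query = "SELECT * FROM t;"
plan_before = get_plan(query)

# Another session drops and recreates the table with different columns.
catalog["t"] = ["id", "email", "age"]

plan_after = get_plan(query)         # cache hit: still the old column list
fresh_plan = compile_query(query)    # what a recompilation would produce
```

In this sketch `plan_after` still describes the old columns, while `fresh_plan` reflects the new definition, which is the kind of mismatch the mitigation mechanisms are meant to catch.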

In this PR we add the following algorithm for invalidating cache entries for outdated VIEWs:

  1. Store path ids and schema versions of the views that were used in the query in the cache entries, so they can be accessed later.
  2. Whenever we retrieve a compilation result from the cache, send a request to SchemeCache to check whether the schema versions of the views used in this query (if any) have changed since we compiled it.
  3. Send a recompilation request if any view is outdated.

There are two important things to note about this solution:

  • We make a SchemeCache request for each repeated query, and there are a lot of these in an OLTP-focused database like YDB. However, we have already been sending these requests for preliminary cache invalidation for tables (preliminary because this is not the last check for a schema version mismatch, at least for tables), so views should not incur an additional performance impact here.
  • This solution does not guarantee strong consistency for queries using views, because query cache invalidation does not happen instantly after the view definition is updated: the node must first receive an update from SchemeCache, which takes some time.
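
The three steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual implementation: `ViewInfo`, `CacheEntry`, `SchemeCacheStub`, and `CompileService` are hypothetical names, the SchemeCache is modelled as a plain dictionary holding the versions currently known to the node, and the list of views used by a query is passed in explicitly rather than discovered during compilation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewInfo:
    path_id: tuple       # hypothetical (owner_id, local_id) pair
    schema_version: int  # version observed when the query was compiled


@dataclass
class CacheEntry:
    plan: str
    views: list[ViewInfo]  # step 1: stored alongside the compilation result


class SchemeCacheStub:
    """Toy stand-in for SchemeCache: path id -> schema version known to this node."""

    def __init__(self):
        self.versions = {}

    def schema_version(self, path_id):
        return self.versions.get(path_id)


class CompileService:
    def __init__(self, scheme_cache):
        self.scheme_cache = scheme_cache
        self.cache = {}        # query text -> CacheEntry
        self.compilations = 0  # counts how many times we actually compiled

    def _compile(self, text, view_path_ids):
        self.compilations += 1
        views = [ViewInfo(p, self.scheme_cache.schema_version(p)) for p in view_path_ids]
        return CacheEntry(plan=f"plan#{self.compilations}", views=views)

    def get(self, text, view_path_ids):
        entry = self.cache.get(text)
        if entry is not None:
            # Step 2: compare stored versions with what the scheme cache reports now.
            outdated = any(
                self.scheme_cache.schema_version(v.path_id) != v.schema_version
                for v in entry.views
            )
            if not outdated:
                return entry
        # Step 3 (or a plain cache miss): recompile and replace the entry.
        entry = self._compile(text, view_path_ids)
        self.cache[text] = entry
        return entry


scheme = SchemeCacheStub()
view_id = (1, 42)
scheme.versions[view_id] = 1

service = CompileService(scheme)
query = "SELECT * FROM my_view;"

first = service.get(query, [view_id])   # miss: compiled
second = service.get(query, [view_id])  # hit: versions match
scheme.versions[view_id] = 2            # the node learns that the view was redefined
third = service.get(query, [view_id])   # hit, but outdated: recompiled
```

Note that in this model the recompilation only happens once `scheme.versions` has been updated, which mirrors the consistency caveat in the second bullet: until the node receives the new version, the stale entry keeps being served.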

github-actions bot commented Feb 15, 2024

2024-02-15 06:49:49 UTC Pre-commit check for f57a49b has started.
2024-02-15 06:49:51 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-15 07:28:55 UTC Build successful.
2024-02-15 07:29:07 UTC Tests are running...
🔴 2024-02-15 08:57:42 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67527 56512 0 4 10974 37

github-actions bot commented Feb 15, 2024

2024-02-15 06:52:06 UTC Pre-commit check for f57a49b has started.
2024-02-15 06:52:09 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-15 07:33:02 UTC Build successful.
2024-02-15 07:33:16 UTC Tests are running...
🔴 2024-02-15 09:13:34 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14774 14563 0 21 130 60

- optional uint32 OwnerId = 1;
- optional uint32 TableId = 2;
+ optional uint64 OwnerId = 1;
+ optional uint64 TableId = 2;
Collaborator Author

The change from uint32 to uint64 produces binary-compatible protobufs. See the proto2 and proto3 docs; both contain the same line on this topic:

int32, uint32, int64, uint64, and bool are all compatible – this means you can change a field from one of these types to another without breaking forwards- or backwards-compatibility
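
A small illustration of why this holds: both uint32 and uint64 fields are serialized as base-128 varints with wire type 0, so the bytes on the wire do not depend on which of the two types the field is declared with. The sketch below hand-rolls the varint encoding instead of using the protobuf library; the helper names (`encode_varint`, `decode_varint`, `encode_field`, `read_as_uint32`, `read_as_uint64`) are made up for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result


def encode_field(field_number: int, value: int) -> bytes:
    # Wire type 0 (varint) is used for both uint32 and uint64 fields.
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


def read_as_uint32(payload: bytes) -> int:
    # A reader that declares the field as uint32 keeps the low 32 bits.
    return decode_varint(payload) & 0xFFFFFFFF


def read_as_uint64(payload: bytes) -> int:
    return decode_varint(payload) & 0xFFFFFFFFFFFFFFFF


# The same bytes are produced whether field 1 is declared uint32 or uint64.
small = encode_field(1, 300)
small_payload = small[1:]  # skip the one-byte key

# A value that needs more than 32 bits is truncated by a uint32 reader.
big_value = 2**32 + 5
big_payload = encode_varint(big_value)
```

Here `read_as_uint32(small_payload)` and `read_as_uint64(small_payload)` both yield 300, while `read_as_uint32(big_payload)` yields 5. Values that fit in 32 bits therefore round-trip in both directions, and an old reader that still uses uint32 would truncate larger values rather than fail to parse.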

@jepett0 jepett0 marked this pull request as ready for review February 15, 2024 08:42

github-actions bot commented Feb 15, 2024

2024-02-15 09:25:46 UTC Pre-commit check for 3eca3bf has started.
2024-02-15 09:25:49 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-15 09:31:34 UTC Build successful.
2024-02-15 09:31:49 UTC Tests are running...
🔴 2024-02-15 10:26:17 UTC Test run completed, no test results found for commit ae3f2a4. Please check build logs.
2024-02-15 10:26:20 UTC Check cancelled

github-actions bot commented Feb 15, 2024

2024-02-15 09:26:55 UTC Pre-commit check for 3eca3bf has started.
2024-02-15 09:26:58 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-15 09:32:25 UTC Build successful.
2024-02-15 09:32:36 UTC Tests are running...
🔴 2024-02-15 10:26:16 UTC Test run completed, no test results found for commit ae3f2a4. Please check build logs.
2024-02-15 10:26:19 UTC Check cancelled

github-actions bot commented Feb 15, 2024

2024-02-15 10:27:43 UTC Pre-commit check for 35cbd89 has started.
2024-02-15 10:27:44 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-15 11:10:40 UTC Build successful.
2024-02-15 11:10:51 UTC Tests are running...
🔴 2024-02-15 12:56:11 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14789 14585 0 18 143 43

github-actions bot commented Feb 15, 2024

2024-02-15 10:29:59 UTC Pre-commit check for 35cbd89 has started.
2024-02-15 10:30:01 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-15 11:08:39 UTC Build successful.
2024-02-15 11:08:51 UTC Tests are running...
🔴 2024-02-15 12:41:30 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67532 56518 0 3 10972 39

…ery was updated

Add info about the views whose metadata was loaded during query compilation to the prepared query proto, and use that list to check (with SchemeCache) whether the schema version of any view was updated and a recompilation is needed.
@jepett0 jepett0 force-pushed the VIEWs.invalidate_query_cache.1 branch from e926c59 to 02df54f Compare February 16, 2024 08:28

github-actions bot commented Feb 16, 2024

2024-02-16 08:29:19 UTC Pre-commit check for 685d6a2 has started.
2024-02-16 08:29:21 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-16 09:11:10 UTC Build successful.
2024-02-16 09:11:20 UTC Tests are running...
🔴 2024-02-16 09:45:54 UTC Test run completed, no test results found for commit 02df54f. Please check build logs.
2024-02-16 09:45:57 UTC Check cancelled

github-actions bot commented Feb 16, 2024

2024-02-16 08:31:56 UTC Pre-commit check for 685d6a2 has started.
2024-02-16 08:31:59 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-16 09:09:19 UTC Build successful.
2024-02-16 09:09:32 UTC Tests are running...
🔴 2024-02-16 09:45:55 UTC Test run completed, no test results found for commit 02df54f. Please check build logs.
2024-02-16 09:45:58 UTC Check cancelled

github-actions bot commented Feb 16, 2024

2024-02-16 09:48:13 UTC Pre-commit check for 60d013b has started.
2024-02-16 09:48:16 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-16 09:50:03 UTC Build successful.
2024-02-16 09:50:15 UTC Tests are running...
🔴 2024-02-16 11:14:57 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67529 56517 0 10 10972 30

github-actions bot commented Feb 16, 2024

2024-02-16 09:49:04 UTC Pre-commit check for 60d013b has started.
2024-02-16 09:49:07 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-16 09:50:55 UTC Build successful.
2024-02-16 09:51:06 UTC Tests are running...
🔴 2024-02-16 11:25:21 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14768 14590 0 15 126 37

gridnevvvit
gridnevvvit previously approved these changes Feb 16, 2024

github-actions bot commented Feb 16, 2024

2024-02-16 15:27:22 UTC Pre-commit check for 164d12d has started.
2024-02-16 15:27:24 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-16 16:09:19 UTC Build successful.
2024-02-16 16:09:29 UTC Tests are running...
🔴 2024-02-16 17:42:20 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67529 56516 0 5 10972 36

github-actions bot commented Feb 16, 2024

2024-02-16 15:28:02 UTC Pre-commit check for 164d12d has started.
2024-02-16 15:28:04 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-16 16:04:52 UTC Build successful.
2024-02-16 16:05:00 UTC Tests are running...
🔴 2024-02-16 17:46:35 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14768 14583 0 14 133 38

@jepett0 jepett0 merged commit b92e8dc into ydb-platform:main Feb 18, 2024
2 of 4 checks passed
jepett0 added a commit to jepett0/ydb that referenced this pull request Mar 6, 2024
…latform#1960)
jepett0 added a commit to jepett0/ydb that referenced this pull request Mar 6, 2024
…latform#1960)
jepett0 added a commit that referenced this pull request Mar 6, 2024
#2479)
@mregrock mregrock mentioned this pull request May 15, 2024
This was referenced Jun 7, 2024
@CyberROFL CyberROFL mentioned this pull request Jun 19, 2024
3 participants