[#1127] YSQL: Collation Support (part 3)
Summary:
So far, YSQL has supported collation by always collation-encoding any non-C collation string constant value that is sent to docdb. Docdb can memcmp the collation-encoded results to get the same comparison semantics: for two strings s1 and s2 with collation-encoded results e(s1) and e(s2), strcoll(s1, s2) and memcmp(e(s1), e(s2)) have the same sign.

Docdb is implemented on rocksdb, which only performs memcmp on keys, not values. If a postgres table column is part of a primary key, it is stored in the key part of the rocksdbs of the tablets for the table. If a postgres table column appears in an index, it is also stored in the key part of the rocksdbs of the tablets for the index. In both cases, because the column is stored in the key part of a rocksdb, collation encoding is needed to ensure correct comparison semantics.

However, if a postgres table column is neither part of a primary key nor used to build any index, then it is a non-key column and is stored in the value part of rocksdb. Rocksdb does not perform memcmp on its value part, so collation encoding is not needed there. For space efficiency, we should store the original string value by removing the sortkey from the collation-encoded string.

This diff implements this space optimization via the following steps:
(1) Added PgDml::GetColumnInfo, which for a given column tells us whether the column is a primary key column. Note that for both a YB base table and a YB index table, a column is either a primary key column (one that composes the primary key) or a value column.
(2) At each bind point, for a value column, changed the collation id used for encoding to InvalidOid. This disables collation encoding, so the original PG character string value is passed to docdb.

Test Plan:
1. Run regression tests with collation disabled (default build).
2. Run regression tests with collation enabled and the default database collation still "C" (FLAGS_TEST_pg_collation_enabled=true).
3. Run regression tests with collation enabled and the default database collation set to "en_US.UTF-8" (FLAGS_TEST_pg_collation_enabled=true and kTestOnlyUseOSDefaultCollation=true).

Reviewers: mihnea, dmitry
Reviewed By: dmitry
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D12962
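
The bind-point decision in step (2) can be illustrated with a minimal sketch. This is not the code from this diff: ColumnInfo, its is_primary field, kInvalidOid, and EncodingCollation are hypothetical stand-ins chosen for illustration, and the only assumption taken from PostgreSQL is that InvalidOid is the zero Oid.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the actual YSQL/pggate types.
using Oid = std::uint32_t;
constexpr Oid kInvalidOid = 0;  // mirrors PostgreSQL's InvalidOid (zero)

// Hypothetical result of a per-column lookup such as PgDml::GetColumnInfo.
struct ColumnInfo {
  bool is_primary;  // true if the column is stored in the key part of rocksdb
};

// Returns the collation id to use for encoding at a bind point.
// Key columns keep their collation so the encoded bytes compare correctly
// under memcmp; value columns get kInvalidOid, which disables collation
// encoding so the original string is stored as-is.
Oid EncodingCollation(const ColumnInfo& column, Oid column_collation) {
  return column.is_primary ? column_collation : kInvalidOid;
}
```

In this sketch a caller would pass the returned id to whatever routine performs the encoding, so a value column skips the sortkey and only the original string reaches docdb.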
Showing 16 changed files with 84 additions and 33 deletions.