[YSQL] Extra Secondary Index Writes on Trigger Insert + Foreign Key Check #18822

karthik-ramanathan-3006 · 2023-08-23T16:31:14Z

Jira Link: DB-7701

Description

Consider the following schema:

CREATE TABLE IF NOT EXISTS trigger_table (
    h integer PRIMARY KEY NOT NULL
);

CREATE FUNCTION trigger_insert() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
  BEGIN
      INSERT INTO trigger_table (h) (SELECT NEW.h);
      RETURN NEW;
  END;
  $$;

CREATE TABLE main_table (
    h integer PRIMARY KEY NOT NULL,
    v1 integer NOT NULL,
    v2 integer NOT NULL,
    v3 integer
);

CREATE INDEX main_table_idx1 ON main_table (v1);
CREATE UNIQUE INDEX main_table_idx2 ON main_table (v2);
ALTER TABLE main_table ADD CONSTRAINT "main_table_fk" FOREIGN KEY (h) REFERENCES trigger_table(h);

CREATE TRIGGER trigger1 BEFORE INSERT ON main_table FOR EACH ROW EXECUTE FUNCTION trigger_insert();

This consists of a main table whose primary key has a foreign key constraint on a secondary table.
Additionally, the main table has a trigger, which inserts a new key into the secondary table.

Inserting a new row into the main table:

INSERT INTO main_table VALUES(0, 0, 0, 0);

When an UPDATE is issued to the non-indexed column (v3) of the main table, we see write requests corresponding to a DELETE + INSERT for each of the secondary indexes.

yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE main_table SET v3 = 1 WHERE h = 0;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Update on main_table  (cost=0.00..4.11 rows=1 width=88) (actual time=5.575..5.576 rows=0 loops=1)
   ->  Index Scan using main_table_pkey on main_table  (cost=0.00..4.11 rows=1 width=88) (actual time=1.769..1.778 rows=1 loops=1)
         Index Cond: (h = 0)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 0.888 ms
         Storage Table Write Requests: 1
         Storage Index Write Requests: 4
         Storage Flush Requests: 1
         Storage Flush Execution Time: 2.427 ms
 Planning Time: 0.192 ms
 Execution Time: 7.172 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 0.888 ms
 Storage Write Requests: 5
 Catalog Read Requests: 1
 Catalog Read Execution Time: 3.390 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 2
 Storage Flush Execution Time: 3.704 ms
 Storage Execution Time: 7.983 ms
 Peak Memory Usage: 24 kB
(21 rows)

This query should ideally have been executed as a single main table write request in a single flush.
Instead we see 4 secondary index writes (a DELETE and an INSERT for each of the two secondary indexes) + 1 main table write across 2 flushes.

Attached is a gist containing the schema:
https://gist.github.com/karthik-ramanathan-3006/8fa30b3d56829e62e228382ada3ae3b3

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.


karthik-ramanathan-3006 · 2023-09-07T19:42:05Z

Found a simpler reproduction for this issue. Consider a table test with the following schema:

-- Create test
CREATE TABLE test(k INT PRIMARY KEY, v1 INT, v2 INT);
CREATE INDEX test_v ON test(v1);

-- Create parent table + FK constraint
CREATE TABLE parent(h INT PRIMARY KEY);
INSERT INTO parent VALUES(1);
ALTER TABLE test ADD CONSTRAINT "test_fk" FOREIGN KEY (k) REFERENCES parent(h);

-- Create a simple trigger on test
CREATE FUNCTION trigger_print_notice() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
    BEGIN
        RAISE NOTICE 'Trigger called with new values ("%", "%", "%")', NEW.k, NEW.v1, NEW.v2;
        RETURN NEW;
    END;
    $$;

CREATE TRIGGER trigger_print BEFORE INSERT ON test FOR EACH ROW EXECUTE FUNCTION trigger_print_notice();

The table test now looks as follows:

yugabyte=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 k      | integer |           | not null |
 v1     | integer |           |          |
 v2     | integer |           |          |
Indexes:
    "test_pkey" PRIMARY KEY, lsm (k HASH)
    "test_v" lsm (v1 HASH)
Foreign-key constraints:
    "test_fk" FOREIGN KEY (k) REFERENCES parent(h)
Triggers:
    trigger_print BEFORE INSERT ON test FOR EACH ROW EXECUTE PROCEDURE trigger_print_notice()

Let us insert a sample row into test to ensure that the BEFORE INSERT trigger fires.

yugabyte=# INSERT INTO test VALUES(1, 1, 1);
NOTICE:  Trigger called with new values ("1", "1", "1")
INSERT 0 1

Performing an update on non-index column v2 leads to unnecessary writes:

yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE test SET v2 = 3 WHERE k = 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Update on test  (cost=0.00..4.11 rows=1 width=80) (actual time=14.941..14.941 rows=0 loops=1)
   ->  Index Scan using test_pkey on test  (cost=0.00..4.11 rows=1 width=80) (actual time=7.278..7.288 rows=1 loops=1)
         Index Cond: (k = 1)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 3.676 ms
         Storage Table Write Requests: 1
         Storage Index Write Requests: 2
         Storage Flush Requests: 1
         Storage Flush Execution Time: 5.465 ms
 Planning Time: 0.341 ms
 Execution Time: 18.260 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 3.676 ms
 Storage Write Requests: 3
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 2
 Storage Flush Execution Time: 8.343 ms
 Storage Execution Time: 12.019 ms
 Peak Memory Usage: 24 kB
(20 rows)

Changing the WHERE clause produces a similar result:

yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE test SET v2 = 3 WHERE v2 = 3;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Update on test  (cost=0.00..102.50 rows=1000 width=80) (actual time=14.522..14.522 rows=0 loops=1)
   ->  Seq Scan on test  (cost=0.00..102.50 rows=1000 width=80) (actual time=4.518..4.533 rows=1 loops=1)
         Remote Filter: (v2 = 3)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 2.782 ms
         Storage Table Write Requests: 1
         Storage Index Write Requests: 2
         Storage Flush Requests: 1
         Storage Flush Execution Time: 8.581 ms
 Planning Time: 0.126 ms
 Execution Time: 17.401 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 2.782 ms
 Storage Write Requests: 3
 Catalog Read Requests: 1
 Catalog Read Execution Time: 6.122 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 2
 Storage Flush Execution Time: 11.013 ms
 Storage Execution Time: 19.917 ms
 Peak Memory Usage: 55 kB
(21 rows)

Summary: A table with an FK constraint and a trigger (which are in no way related to each other) incurs writes to all of its secondary indexes on UPDATE, even when no indexed column is modified.
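One way to check that it is the combination of the two objects that matters is to remove them one at a time and re-run the same kind of UPDATE. The steps below are a sketch of such an isolation run against the `test` table defined above; they were not part of the original report, and no results are claimed for them. If the summary holds, `Storage Index Write Requests` should disappear from the plan in steps 2 and 3. Note that `EXPLAIN (ANALYZE, ...)` actually executes the UPDATE.

```sql
-- Hypothetical isolation steps (assumption: run against the repro schema above).

-- 1. Baseline: FK constraint and BEFORE INSERT trigger are both present.
EXPLAIN (ANALYZE, DIST) UPDATE test SET v2 = 4 WHERE k = 1;

-- 2. Drop only the trigger and repeat the update of the non-indexed column.
DROP TRIGGER trigger_print ON test;
EXPLAIN (ANALYZE, DIST) UPDATE test SET v2 = 5 WHERE k = 1;

-- 3. Restore the trigger, drop only the FK constraint, and repeat.
CREATE TRIGGER trigger_print BEFORE INSERT ON test
    FOR EACH ROW EXECUTE FUNCTION trigger_print_notice();
ALTER TABLE test DROP CONSTRAINT test_fk;
EXPLAIN (ANALYZE, DIST) UPDATE test SET v2 = 6 WHERE k = 1;

-- 4. Restore the FK constraint so the schema matches the repro again.
ALTER TABLE test ADD CONSTRAINT "test_fk" FOREIGN KEY (k) REFERENCES parent(h);
```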

… checks when relevant columns not modified

Summary:

**Background**

Prior to this revision, an UPDATE statement specifying a list of target columns X in its SET clause **always** performed the necessary work to update each of the target columns in the storage layer, irrespective of whether the values of the columns actually changed. The necessary work could include acquiring locks, updating indexes, checking constraints, firing triggers, etc.

**The Optimization**

This revision introduces an optimization that validates that the values of a column are indeed being modified before sending (flushing) the updated value of the column to the storage layer. In particular, the columns whose values are compared are those that can cause extra round trips to the storage layer in the form of:
- Primary Key Updates
- Secondary Index Updates
- Foreign Key Constraints
- Uniqueness Constraints

The matrix of columns that are marked for update and the objects (indexes, constraints) they impact is computed at planning time. This is particularly useful when used in conjunction with prepared statements and ORMs, which tend to specify all columns (both modified and non-modified) as part of the target list.
The decision of whether a column is indeed modified is made on a per-tuple basis at execution time.
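To make the prepared-statement/ORM case concrete, here is a minimal sketch of the statement shape this optimization targets. The table `orm_demo`, its index, and the statement name `save_row` are hypothetical names introduced for illustration. The statement lists every non-key column in its SET clause, while the caller changes only one of them.

```sql
-- Hypothetical schema: one indexed column (email), one non-indexed column (name).
CREATE TABLE orm_demo (id INT PRIMARY KEY, name TEXT, email TEXT);
CREATE INDEX orm_demo_email_idx ON orm_demo (email);
INSERT INTO orm_demo VALUES (1, 'alice', 'alice@example.com');

-- ORM-style "save": all columns appear in the target list, modified or not.
PREPARE save_row (INT, TEXT, TEXT) AS
    UPDATE orm_demo SET name = $2, email = $3 WHERE id = $1;

-- Only name changes; email is re-sent with its existing value.
EXECUTE save_row(1, 'alicia', 'alice@example.com');

DEALLOCATE save_row;
```

Going by the description above, `email` is marked for update at planning time because it is in the target list. The per-tuple comparison at execution time is what would find its value unchanged and allow the write to `orm_demo_email_idx` to be skipped.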
**Example**

As a concrete example, consider a table with the following schema and data:
```
yugabyte=# CREATE TABLE foo (h INT PRIMARY KEY, v1 INT, v2 INT, v3 INT);
yugabyte=# CREATE INDEX foo_v1_idx ON foo (v1);
yugabyte=# CREATE INDEX foo_v2_idx ON foo (v2);
yugabyte=# INSERT INTO foo (SELECT i, i, i % 10, i % 100 FROM generate_series(1, 10000) AS i);
```

Performing an UPDATE on the first 1000 rows (without the optimization) yields:
```
yugabyte=# SET yb_explain_hide_non_deterministic_fields TO true;
yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE foo SET h = v1, v1 = v1, v3 = v3 + 1 WHERE v1 <= 1000;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Update on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=0 loops=1)
   ->  Seq Scan on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=1000 loops=1)
         Remote Filter: (v1 <= 1000)
         Storage Table Read Requests: 1
         Storage Table Rows Scanned: 10000
         Storage Table Write Requests: 2000
         Storage Index Write Requests: 4000
         Storage Flush Requests: 2000
 Storage Read Requests: 1
 Storage Rows Scanned: 10000
 Storage Write Requests: 6000
 Storage Flush Requests: 2001
(12 rows)
```

The values of `h` and `v1` are not modified by the query, yet they result in multiple write requests to both the main table and the secondary indexes. Since an update to a key column (of a table or an index) is executed as a DELETE followed by an INSERT, this query requires a large number of flushes, which makes it very expensive to execute.
With the proposed optimization the query is executed as follows:
```
yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE foo SET h = v1, v1 = v1, v3 = v3 + 1 WHERE v1 <= 1000;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Update on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=0 loops=1)
   ->  Seq Scan on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=1000 loops=1)
         Remote Filter: (v1 <= 1000)
         Storage Table Read Requests: 1
         Storage Rows Scanned: 10000
         Storage Table Write Requests: 1000
 Storage Read Requests: 1
 Storage Rows Scanned: 10000
 Storage Write Requests: 1000
 Storage Flush Requests: 1
(10 rows)
```

**Flags and Feature Status**

This revision introduces the following GUCs to control the behavior of this optimization:

`yb_update_num_cols_to_compare` - The maximum number of columns to be compared. (default: 0)

`yb_update_max_cols_size_to_compare` - The maximum size of an individual column that can be compared. (default: 10240)

This feature is currently turned off as a result of setting `yb_update_num_cols_to_compare` to 0.
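For anyone wanting to try the optimization on the `foo` table from the example, a session along the following lines should work. This is a sketch under two assumptions that the commit message does not state: that both GUCs can be changed at the session level with `SET`, and that any non-zero value of `yb_update_num_cols_to_compare` enables the comparison. The value 50 is arbitrary.

```sql
-- Inspect the current setting; the default of 0 means the feature is off.
SHOW yb_update_num_cols_to_compare;

-- Assumption: session-level SET is permitted, and 50 is an arbitrary non-zero limit.
SET yb_update_num_cols_to_compare TO 50;
SET yb_update_max_cols_size_to_compare TO 10240;  -- same as the stated default

-- Re-run the example query and compare the write/flush counters.
SET yb_explain_hide_non_deterministic_fields TO true;
EXPLAIN (ANALYZE, DIST) UPDATE foo SET h = v1, v1 = v1, v3 = v3 + 1 WHERE v1 <= 1000;

-- Return to the defaults.
RESET yb_update_num_cols_to_compare;
RESET yb_update_max_cols_size_to_compare;
```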
**Debuggability**

Turn on postgres debug2 logs via the following command:
```
./bin/yb-ctl restart --ysql_pg_conf_csv='log_min_messages=debug2'
```

This produces the following debug information:
```
-- At planning time
2024-07-31 10:59:07.124 PDT [76120] DEBUG: Update matrix: rows represent OID of entities, columns represent attnum of cols
2024-07-31 10:59:07.124 PDT [76120] DEBUG: - 10
2024-07-31 10:59:07.124 PDT [76120] DEBUG: 17415 Y

-- At execution time, on a per-tuple basis
2024-07-31 10:59:07.143 PDT [76120] DEBUG: Index/constraint with oid 17415 requires an update
2024-07-31 10:59:07.143 PDT [76120] DEBUG: Relation: 17412 Columns that are inspected and modified: 1 (10)
2024-07-31 10:59:07.143 PDT [76120] DEBUG: No cols in category: Columns that are inspected and unmodified
2024-07-31 10:59:07.143 PDT [76120] DEBUG: Relation: 17412 Columns that are marked for update: 1 (10) 2 (11)
```

**Future Work**

1. Introduce auto-flag infrastructure to safely use row-locking. This is in the context of upgrade safety while the cluster is being upgraded.
2. As a part of the flag infrastructure, ensure that flags/GUC values are immutable during the lifetime of a query.
3. #22994: PGSQL_UPDATEs with no column references should acquire row locks.
4. #23348: Add support for partitioned tables with out of order columns.
5. Support for serializing optimization metadata in plans.
6. Enhance randgen grammar to support ModifyTable (INSERT/UPDATE/DELETE) queries.
7. #23350: PG 15 support.
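The OIDs in the Debuggability log can be mapped back to object names with ordinary catalog lookups. The queries below are a sketch using the two OIDs from the sample log (17412 and 17415); these values are specific to the session that produced the log and will differ on any other cluster.

```sql
-- The relation being updated ("Relation: 17412" in the log).
SELECT 17412::regclass AS relation;

-- If 17415 is an index, this resolves to its name;
-- otherwise the cast simply echoes the number.
SELECT 17415::regclass AS index_name;

-- If 17415 is a constraint instead, it shows up here.
SELECT conname, contype, conrelid::regclass AS on_table
FROM pg_constraint
WHERE oid = 17415;
```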
jenkins: urgent
Jira: DB-7701

Test Plan:
Run the associated pg_regress test as follows:
```
# New tests
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
# Existing tests
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUpdatePrimaryKey'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUniqueConstraint'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressTrigger#testPgRegressTrigger'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressDml#testPgRegressDml'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPushdown#testPgRegressPushdown'
```

Tested scenarios include (but are not limited to):
1. Single row and distributed transactions with and without the feature flag turned on.
2. Relations with a primary key and no secondary indexes or triggers (UPDATEs can take the single row path).
3. Relations with combinations of primary key and secondary indexes.
4. Relations with unconditional before-row triggers.
5. UPDATEs in Colocated databases.
6. UPDATEs covering multiple tuples.
7. Hierarchy of relations with foreign keys.
8. Relations with self referential foreign keys.
9. Relations with overlapping indexes.
10. Relations having columns with uniqueness constraints.
11. Relations having covering indexes.
12. Relations having partial indexes.
13. Relations having index expressions / predicates.
14. Relations with conditional column triggers.
15. Relations having indexes/constraints out of order (i.e. order of columns in relation is different from that of entity).
16. Relations having combination of hash and range indexes.
17. UPDATEs with correlated subqueries.
18. INSERT ON CONFLICT DO UPDATE.
19. UPDATE RETURNING.
20. UPDATEs on temp tables.

Reviewers: mihnea, jason, amartsinchyk
Reviewed By: amartsinchyk
Subscribers: pjain, jason, smishra, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34040

Summary:
66890cb [PLAT-14781] Add runtime config for download metrics as pdf
b2f24f2 [#9797] YSQL: Stubs for ysql major version upgrade RPCs
5891faa [#23332] Docdb: Fix yb-admin to handle snapshot_retention correctly
0614c52 [PLAT-14756] Releases API RBAC cleanup
2cfb999 [docs] Release notes for 2024.1.1.0-b137 (#23185)
d868a03 [PLAT-13495] Do not run dual nic steps on systemd upgrade
10bc447 [PLAT-14798] Adding internal load balancer to devspace
7296df8 [docs] [2.23] Add pg_cron extension docs (#22546)
79902ae [PLAT-14678] - feat : Export All Metrics to pdf
8a0b95c [#23322] YSQL: pg_partman: fix logic of checking existing table
Excluded: 63f471a [#18822] YSQL: Framework to skip redundant sec index updates and fkey checks when relevant columns not modified
3040472 [PLAT-14783] [PGParity] After Edit RR cores scale up on PG Parity enabled cluster, the RR node does not have PGP gflags
e052089 [PLAT-14774] Per process tserver metrics is not working if YSQL is disabled
0c664a1 [#22370] docdb: Cost Based Optimizer changes to take into account backward scans improvement
a060877 [PLAT-13712]fix plan info for Azure VMs
291dd40 Remove 6sense domains from CSP headers (#23354)
75cb273 [#23330] docdb: fixed static columns handling for CQL operations

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: jason, tfoucher
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D36984

… sec index updates and fkey checks when relevant columns not modified

Summary:

**Conflict Resolution for PG15 cherrypick**

- src/postgres/src/include/nodes/plannodes.h:
  - Location: struct ModifyTable definition:
    - My master commit adds two new fields (`yb_update_affected_entities`, `yb_skip_entities`) and subsumes (`no_update_index_list`)
    - YB-PG 15 has added new fields (`ybUseScanTupleInUpdate`, `ybHasWholeRowAttribute`)
    - Merge resolution: Keep newly added fields from both, remove `no_update_index_list`
  - Location: imports
    - My master commit adds the import `nodes/ybbitmatrix.h`
    - YB-PG 15 changes `nodes/relation.h` to `access/relation.h` and adds `nodes/parsenodes.h`, `nodes/pathnodes.h`
    - Merge resolution: Add `nodes/ybbitmatrix.h`, `nodes/parsenodes.h`, `nodes/pathnodes.h` and change `nodes/relation.h` to `access/relation.h`
- src/postgres/src/include/executor/executor.h
  - Location: “prototypes from functions in execIndexing.c”
    - My master commit removes the `no_update_index_list` arg from ExecInsertIndexTuples and ExecDeleteIndexTuplesOptimized and removes the function `ContainsIndexRelation`
    - YB-PG 15 has rearranged the location of function definitions, added a `ResultRelInfo` field to the *IndexTuple functions
    - Merge resolution: Keep functions in rearranged locations, add `ResultRelInfo` to and remove `no_update_index_list` from ExecInsertIndexTuples, remove ExecDeleteIndexTuplesOptimized altogether, remove `ContainsIndexRelation`
- src/postgres/src/backend/utils/misc/guc.c:
  - Location: integer GUCs
    - My master commit adds new GUCs (`yb_update_num_cols_to_compare`, `yb_update_max_cols_size_to_compare`)
    - This caused an adjacency conflict with YB-PG 15 (41c091a) that changed QUERY_TUNING to QUERY_TUNING_OTHER for `yb_parallel_range_rows`.
    - Merge resolution: Add my new GUCs and change QUERY_TUNING_OTHER for `yb_parallel_range_rows`.
- src/postgres/src/include/utils/relcache.h:
  - Location: “Routines to compute/retrieve additional cached information”
    - My master commit adds a new function `YbComputeIndexExprOrPredicateAttrs`.
    - YB-PG 15 adds new functions `RelationGetIdentityKeyBitmap` (upstream PG commit: e7eea52b2d61917fbbdac7f3f895e4ef636e935b), `RelationGetIndexPredicate` and `RelationGetIndexRawAttOptions` (upstream PG commit: 911e70207703799605f5a0e8aad9f06cff067c63) causing adjacency conflicts. Further, 55782d5 moves down `CheckIndexForUpdate` declaration.
    - Merge resolution: Add all new functions, move down `CheckIndexForUpdate` declaration
- src/postgres/src/backend/utils/cache/relcache.c:
  - Location: IsProjectionFunctionalIndex
    - My master commit adds a new function `YbComputeIndexExprOrPredicateAttrs`
    - YB-PG 15 removed a function in the same area: `IsProjectionFunctionalIndex`
    - Merge resolution: Add `YbComputeIndexExprOrPredicateAttrs`, remove `IsProjectionFunctionalIndex`
  - Location: YbRelationGetFKeyReferencedByList
    - Not a merge conflict. In YB-PG 15, `DeconstructFkConstraintRow` has additional outparams to retrieve ON DELETE SET NULL/DEFAULT cols. This info is not needed, hence the params are set to NULL.
    - Not a merge conflict. Fetch the constraint oid from `Form_pg_constraint` instead of fetching it from the HeapTuple directly.
- src/postgres/src/include/commands/trigger.h:
  - Location: “in utils/adt/ri_triggers.c”
    - My master commit adds a new param `yb_skip_entities` to `RI_FKey_pk_upd_check_required` and `RI_FKey_fk_upd_check_required`
    - YB-PG 15 uses TupleTableSlots instead of HeapTuples in the function definition + lint changes
    - Resolution: Added new param and replaced HeapTuples with TupleTableSlots
- src/postgres/src/backend/utils/adt/ri_triggers.c:
  - Same as above
- src/postgres/src/backend/commands/trigger.c:
  - Location: AfterTriggerSaveEvent
    - Same context as above
    - Additionally, YB-PG 15 skips foreign key update checks on partitioned tables (upstream PG commit: ba9a7e392171c83eb3332a757279e7088487f9a2). This behavior is retained in the conflict resolution.
- src/postgres/src/backend/optimizer/util/ybcplan.c:
  - Location: imports
    - My master commit moved around the location of `yb/yql/pggate/ybc_pggate.h` (albeit accidentally)
    - YB-PG 15 (55782d5) moves `ybcplan.h` out of the "YB includes" section to under "postgres.h"
    - Merge resolution: Retain locations of “ybc_pggate.h” and “ybcplan.h” that are already present in YB-PG 15
- src/postgres/src/backend/optimizer/plan/createplan.c:
  - Location: “create_modifytable_plan”
    - My master commit adds a local variable `yb_is_single_row_update_or_delete`
    - YB-PG 15 has rearranged the location of the variable declarations
    - Merge resolution: Retain PG 15’s rearrangement, add my local variable
- src/postgres/src/backend/executor/execIndexing.c:
  - Location: ExecInsertIndexTuples
    - My master commit updated function definition as per `executor/executor.h`
    - In YB-PG 15, the YB-introduced function `ExecInsertIndexTuplesOptimized` was deleted during the pg15 initial merge 55782d5, causing a conflict.
    - Merge resolution: Remove `no_update_index_list` arg from ExecInsertIndexTuples and remove `ExecInsertIndexTuplesOptimized`.
  - Location: ExecInsertIndexTuples
    - PG 15 introduced the notion of hints to the storage in case an index is not modified. Yugabyte does not require this as this computation is already being done (and enforced).
    - Merge resolution: The indexUnchanged hint will always evaluate to false for Yugabyte relations. Added a comment explaining this decision.
  - Location: ExecDeleteIndexTuples
    - My master commit updated function definition as per `executor/executor.h`
    - In YB-PG 15, an extra parameter `resultRelInfo` is added.
    - Merge resolution: Removed `ExecDeleteIndexTuplesOptimized` and subsumed functionality into `ExecDeleteIndexTuples` as there is no longer a difference between the two implementations from a function signature point of view. Added parameter `resultRelInfo` to ExecDeleteIndexTuples
- src/postgres/src/backend/commands/copyfrom.c
  - Location: CopyFrom, CopyMultiInsertBufferFlush
    - Same context as above.
    - Merge resolution: Remove `no_update_index_list` from all invocations of `ExecInsertIndexTuples`.
- src/postgres/src/backend/executor/execReplication.c
  - Location: ExecSimpleRelationInsert, ExecSimpleRelationUpdate
    - Same context as above.
    - Merge resolution: Remove `no_update_index_list` from all invocations of `ExecInsertIndexTuples`.
- src/postgres/src/backend/executor/nodeModifyTable.c
  - Location: imports
    - My master commit adds `executor/ybOptimizeModifyTable.h`
    - YB-PG 15 moved imports from “YB includes” to “Yugabyte includes” and removed extra/unused imports
    - Merge resolution: Imports moved to “Yugabyte includes” and added `executor/ybOptimizeModifyTable.h`
  - Location: YBEqualDatums and YBBuildExtraUpdatedCols
    - Removed these functions as their functionality has moved to `executor/ybOptimizeModifyTable.h`
  - Location: ExecUpdate
    - My master commit introduces changes to compute a list of columns modified by the query for a given tuple
    - In YB-PG 15, ExecUpdate functionality has been broken into ExecUpdatePrologue, ExecUpdateAct, ExecUpdate and ExecUpdateEpilogue.
    - Merge resolution: Moved my changes into ExecUpdateAct and ExecUpdateEpilogue.
  - Location: ExecModifyTable
    - My master commit introduces a function to compute if a tuple in an UPDATE or a DELETE query has the “wholerow” junk attribute.
    - In YB-PG 15, this logic is propagated from planning time via `plan->ybHasWholeRowAttribute`.
    - Merge resolution: Use the logic in YB-PG 15.
  - Location: YBCHasWholeRowJunkAttr
    - Same context as above.
    - Merge resolution: Remove this function
  - Location: ExecInsert
    - Merge resolution: Remove `no_update_index_list` from four invocations of `ExecInsertIndexTuples`.
  - Location: YBExecUpdateAct
    - Not a conflict, but use of `ExecMaterializeSlot` has been changed to `ExecFetchSlotHeapTuple`. The slot needs to be copied out to a HeapTuple in order to determine if columns are modified by the update query.
- src/postgres/src/backend/rewrite/rewriteHandler.c
  - Location: YbAddWholeRowAttrIfNeeded
    - My master commit adds a new function `YbAddWholeRowAttrIfNeeded`
    - Conflict on PG side is upstream PG 41531e42d34f4aca117d343b5e40f3f757dec5fe and ed4653db8ca7a70ba7a4d329a44812893f8e59c2 adding code at end of rewriteValuesRTE.
    - Merge resolution: Remove all changes from YB master, retain YB-PG 15 changes. This removes functionality which will be re-evaluated and added back in a future diff.
- src/include/executor/ybOptimizeModifyTable.h
  - Location: imports
    - Not a merge conflict, added a new include `nodes/execnodes.h`
  - Location: YbComputeModifiedColumnsAndSkippableEntities
    - Added a new arg `ResultRelInfo *` to specify the relation whose modified columns are to be computed. Previously, this info was fetched from EState which contained a single relation.
    - In YB-PG 15, EState has been modified to include a list of relations.
- src/postgres/src/backend/executor/ybOptimizeModifyTable.c
  - Location: YBEqualDatums
    - Not a merge conflict, but initialization of `FunctionCallInfoData` changed in YB-PG 15 for call information to be variable length (reference upstream PG commit `a9c35cf85ca1ff72f16f0f10d7ddee6e582b62b8`).
    - Changed this function to use new initialization.
  - Location: YbComputeModifiedColumnsAndSkippableEntities
    - Added a new arg `ResultRelInfo *` to specify the relation whose modified columns are to be computed. Previously, this info was fetched from EState which contained a single relation.
    - In YB-PG 15, EState has been modified to include a list of relations.
- src/postgres/src/backend/nodes/Makefile
  - Location: OBJS
    - My master commit adds `ybbitmatrix.o` object file.
    - In YB-PG 15, the list of objects was reformatted to have one object/file per line
    - Merge resolution: Added `ybbitmatrix.o` to the head of the list, retaining the new format
- src/postgres/src/backend/executor/Makefile
  - Location: OBJS
    - Resolve as in ad2fedc for ybOptimizeModifyTable.o added by YB master
jenkins: urgent
Jira: DB-7701
Original commit: 63f471a / D34040

Test Plan:
Run the associated pg_regress test as follows:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUpdatePrimaryKey'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUniqueConstraint'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressTrigger#testPgRegressTrigger'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressDml#testPgRegressDml'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPushdown#testPgRegressPushdown'
```

Tested scenarios include (but are not limited to):
1. Single row and distributed transactions with and without the feature flag turned on.
2. Relations with a primary key and no secondary indexes or triggers (UPDATEs can take the single row path).
3. Relations with combinations of primary key and secondary indexes.
4. Relations with unconditional before-row triggers.
5. UPDATEs in Colocated databases.
6. UPDATEs covering multiple tuples.
7. Hierarchy of relations with foreign keys.
8. Relations with self referential foreign keys.
9. Relations with overlapping indexes.
10. Relations having columns with uniqueness constraints.
11. Relations having covering indexes.
12. Relations having partial indexes.
13. Relations having index expressions / predicates.
14. Relations with conditional column triggers.
15. Relations having indexes/constraints out of order (i.e. order of columns in relation is different from that of entity).
16. Relations having combination of hash and range indexes.
17. UPDATEs with correlated subqueries.
18. INSERT ON CONFLICT DO UPDATE.
19. UPDATE RETURNING.
20. UPDATEs on temp tables.

Reviewers: jason, tfoucher
Reviewed By: jason
Subscribers: yql, smishra, jason, pjain
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37350

…timizations are enabled

Summary:
D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. As a part of the implementation, this revision skipped writing out unmodified columns to the main table. For correctness reasons, skipping writes of specific columns to the main table requires the acquisition of row-level locks. (In the event that no columns are modified, one is still required to acquire a row lock on the affected row.) This created a dependency between the update optimization and the row locking feature. The latter is controlled by the autoflag `ysql_skip_row_lock_for_update`.

This revision seeks to remove this dependency by continuing to write out unmodified columns to the main table. There is one notable exception to this behavior: unmodified columns that are a part of the primary key. If the value of the primary key remains unmodified, its columns are not written out to the main table as this would require an extra round trip to the storage layer.

Jira: DB-7701

Test Plan: Run the following tests and ensure that there are no regressions:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: amartsinchyk
Reviewed By: amartsinchyk
Subscribers: smishra, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37545
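The column-selection rule described in the summary above can be sketched as follows. This is a minimal model under stated assumptions, not the code from the revision: the bitmask representation and the names `ColumnSet` and `columns_to_write` are hypothetical, and `target_cols` stands for whatever set of columns the UPDATE would write without the optimization.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation: bit i is set when column i is in the set. */
typedef uint32_t ColumnSet;

/*
 * Sketch of the rule: unmodified columns are still written out to the
 * main table, except for primary key columns when the primary key value
 * is unchanged.
 */
ColumnSet
columns_to_write(ColumnSet target_cols, ColumnSet pk_cols, bool pk_modified)
{
	if (pk_modified)
		return target_cols;          /* key changed: write everything */
	return target_cols & ~pk_cols;   /* key unchanged: drop key columns */
}
```

With four target columns where column 0 is the primary key, an unchanged key leaves three columns to write, while a changed key leaves all four.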

Summary:
- f8049b8 [doc][xcluster][2024.1.2] Semi-automatic transactional xCluster (#23710)
- f2c8470 [docs][yugabyted][2024.1.2] Add Read Replica and xCluster examples (#23289)
- ea3157c [doc][yba][2024.1.2] Export database audit logs (#23605)
- Excluded: 8bae488 [#18822] YSQL: Write out unmodified cols to main table when update optimizations are enabled
- 5012632 [PLAT-15152] Upgrade YBC client and server version to 2.2.0.0-b5
- cccb1e1 Fix YCQL index table workflow in docs (#23774)
- 9a3f5f7 [PLAT-15028] yba installer simplify postgres initdb
- a2b5e72 [docs] Release notes for 2024.1.2.0 (#23679)
- 9a9690d Fix 2024.1.2 build number (#23779)
- aec361b [doc] Voyager 1.8 (#23764)
- a2540e0 [DEVOPS-3185, DEVOPS-3114] Bump up frozen pip modules to latest versions compatible with py3.8
- 88f23dd [docs] [xcluster] Add min version for semi-automatic mode (#23776)
- 1d61e67 Revise the entire assessment page (#23784)
- 142c04a [PLAT-15089] HA sync immediately after promotion
- 315110f [PLAT-15132][PLAT-15153] Allow users to configure logging level for YNP, also minor bug fixes
- 82ff8d1 [PLAT-6779] Handle relative paths in yb_platform_backup.sh
- Excluded: 45c9cf8 [#23263] YSQL, ASH: Instrument more wait states
- Excluded: 9f8acff [#22148] YSQL, QueryDiagnostics: Adding support for Ash data.
- 39670e8 download links (#23790)
- 2674a79 [#23780] YSQL: Modify catalog version mismatch test assertions with Connection Manager enabled
- 3eb31b8 [#23777] yugabyted: update the gflags of pg parity to remove sized based fetching and add bitmap scans.
- fc5accd [docs] Enclose `allowed_preview_flags_csv` CSV parameters in brackets (#23758)

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: jason, tfoucher
Differential Revision: https://phorge.dev.yugabyte.com/D37772

…o main table when update optimizations are enabled

Summary:
**Conflict resolution for PG 15 cherrypick**
- src/postgres/src/backend/executor/nodeModifyTable.c
  - Location: ExecUpdate -> around `else if (IsYBRelation(resultRelationDesc))`
    - YB master 8bae4881dc89205c53ee3c62668c41347394af37 updates the comment explaining what `cols_marked_for_update` does.
    - In YB pg15 be8504df264aff0472e7e91264b095a30d068bf2, this code has moved into the function YBExecUpdateAct.
    - Merge resolution: Keep changes from YB pg15, manually update the comment in YBExecUpdateAct.
  - Location: ExecUpdate -> around `if (updateCxt.crossPartUpdate)`
    - YB master 8bae4881dc89205c53ee3c62668c41347394af37 frees the bitmapset `cols_marked_for_update`.
    - In YB pg15 be8504df264aff0472e7e91264b095a30d068bf2, this code has moved into the function YBExecUpdateAct.
    - Merge resolution: Keep changes from YB pg15, manually free the bitmapset in YBExecUpdateAct.

D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. As a part of the implementation, this revision skipped writing out unmodified columns to the main table. For correctness reasons, skipping writes of specific columns to the main table requires the acquisition of row-level locks. (In the event that no columns are modified, one is still required to acquire a row lock on the affected row.) This created a dependency between the update optimization and the row locking feature. The latter is controlled by the autoflag `ysql_skip_row_lock_for_update`.

This revision seeks to remove this dependency by continuing to write out unmodified columns to the main table. There is one notable exception to this behavior: unmodified columns that are a part of the primary key. If the value of the primary key remains unmodified, its columns are not written out to the main table as this would require an extra round trip to the storage layer.

Jira: DB-7701
Original commit: 8bae488 / D37545

Test Plan: Run the following tests and ensure that there are no regressions:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: jason, tfoucher
Reviewed By: jason
Subscribers: yql, smishra
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37812

… optimization metadata

Summary:
D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. As a part of this framework, metadata on skippable columns is computed at planning time and cached within the ModifyTable node. This carries the implication that the metadata may be cached as part of prepared statements and may be subject to serialization/deserialization.

Consequently, this revision:
- Adds a serialization/deserialization mechanism for update optimization metadata (specifically `YbUpdateAffectedEntities`; `YbSkippableEntities` was already implemented)
- Adds a mechanism for determining equality for update optimization metadata (both `YbUpdateAffectedEntities`, `YbSkippableEntities`)
- Adds tests around prepared statements with update optimizations enabled

Jira: DB-7701

Test Plan: Run the following tests:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```
Additionally, the above test was run manually after adding the following line to test end-to-end serializability/deserializability:
```lang=diff
--- a/src/postgres/src/backend/optimizer/plan/createplan.c
+++ b/src/postgres/src/backend/optimizer/plan/createplan.c
 if (YbIsUpdateOptimizationEnabled() &&
     ((!yb_is_single_row_update_or_delete && plan->operation == CMD_UPDATE) ||
      (plan->operation == CMD_INSERT && plan->onConflictAction == ONCONFLICT_UPDATE)))
 {
@@ -3491,6 +3492,7 @@ create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path)
     RelationClose(rel);
 }
+ plan = stringToNode(nodeToString(plan));
 return plan;
 }
```

Reviewers: amartsinchyk, telgersma
Reviewed By: telgersma
Subscribers: smishra, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37601
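The round-trip check in the test plan above (serialize, deserialize, compare) can be illustrated with a small standalone sketch. It does not use the Postgres node infrastructure: the `EntityMeta` struct, its text format, and all function names below are hypothetical stand-ins chosen for illustration, not the layout of `YbUpdateAffectedEntities`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-entity optimization metadata. */
typedef struct EntityMeta
{
	unsigned int oid;       /* OID of the index or constraint */
	unsigned int attr_mask; /* bit i set when column i is referenced */
} EntityMeta;

/* Write the struct as text; returns true when it fit into buf. */
bool
entity_meta_to_string(const EntityMeta *m, char *buf, size_t len)
{
	int n = snprintf(buf, len, "{ENTITY :oid %u :attrs %u}",
					 m->oid, m->attr_mask);

	return n > 0 && (size_t) n < len;
}

/* Parse the text form back; returns true when both fields were read. */
bool
entity_meta_from_string(const char *buf, EntityMeta *out)
{
	return sscanf(buf, "{ENTITY :oid %u :attrs %u}",
				  &out->oid, &out->attr_mask) == 2;
}

/* Field-by-field equality, the analogue of an equality function for a node. */
bool
entity_meta_equal(const EntityMeta *a, const EntityMeta *b)
{
	return a->oid == b->oid && a->attr_mask == b->attr_mask;
}

/* Round trip: serialize, deserialize, compare with the original. */
bool
entity_meta_round_trips(const EntityMeta *m)
{
	char		buf[64];
	EntityMeta	copy;

	return entity_meta_to_string(m, buf, sizeof(buf)) &&
		entity_meta_from_string(buf, &copy) &&
		entity_meta_equal(m, &copy);
}
```

The equality function is what makes the round trip checkable: without it, a deserialized copy could silently drop a field and the test would have no way to notice.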

Summary:
- aec9a66 [doc][xcluster] Truncate limitation (#23833)
- fe8890d [PLAT-13984] Change default metric graph points count from 100 to 250 + make runtime configurable.
- 7d9b57b [#23869] YSQL: Fix one type of ddl atomicity stress test flakiness
- 7c1bca8 [PLAT-14052][PLAT-15237] :Add advanced date-time option,Restrict CP for K8s
- afce6ad [PLAT-14158] Support multiple transactional xCluster configs per universe on YBA UI
- 33342b3 [PLAT-15101] Add runtime config to turn on/off rollN for k8s
- 361a99a [#23809] CDCSDK: Filter out records corresponding to index tables in colocated DBs
- 8bbdf66 [docs] changed date (#23885)
- ee639f4 [#22104,#23506] YSQL: auto analyze service collects mutation counts from DDL and skips indexes
- 3dbf6da Delete architecture/design/multi-region-xcluster-async-replication.md
- e013578 [#23864] DocDB: Move cluster_uuid to server common flags
- 2b6a2d3 [PLAT-15062][PLAT-15071] Support DB scoped on UI and display schema change mode information
- 8260075 [PLAT-15079] Treat dropped on target tables as unconfigured but preselected
- f5169ca DocDB: Follow redirects for callhome, and fix URL
- eb61ef6 [PLAT-15228] Update package installation command for YBA
- e791c40 [#18822] YSQL: Add serialization/deserialization mechanism for update optimization metadata
- e69d8cb [doc] Backups and DDLs (#23840)
- e72ae64 [PLAT-14552]filter support bundle core files by date
- 8796c83 [PLAT-15274]: BackFill Pitr params for DR config and add not-null constraints in DB.
- dc9cc67 Fixed NPM test:ci build error
- 2e5ebef [PLAT-15287]: Add PITR Column and Restore Window Information to Backup List
- d053e45 [#238989]yugabyted: Node doesn't join using `--join` flag
- da4da45 [#23399] DocDB: Fix StopActiveTxnsPriorTo from using local variables in the call back

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: jason, tfoucher
Differential Revision: https://phorge.dev.yugabyte.com/D38008

Summary:
D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. This revision adds a PG preview flag `ysql_yb_update_optimization_infra` that keeps the feature turned OFF by default.

Additionally, this revision restricts lookups of this flag to the following locations:
- Once, during query rewriting, to determine whether to set the wholerow junk attribute.
- Once, at planning time, to determine if optimization metadata needs to be computed.
- Once, at the beginning of query execution (`ExecInitModifyTable`), to determine if we should go ahead with the optimization. This has three possibilities:
  - The feature was enabled at planning time and is disabled now (during query execution). In this case, the optimization is not performed.
  - The feature was disabled at planning time and is enabled now. In this case, there is no metadata to perform the optimization, so the optimization is skipped.
  - The feature was enabled at planning time and is enabled now. In this case, metadata is available and the optimization is performed.

To turn on the update optimization feature, do the following:
- Append `ysql_yb_update_optimization_infra` to the list of flags in `allowed_preview_flags_csv`.
- Set the tserver gflag `ysql_yb_update_optimization_infra` to true.

Jira: DB-7701

Test Plan: Run the following tests:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: amartsinchyk
Reviewed By: amartsinchyk
Subscribers: ybase, smishra, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37746
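The three possibilities listed in the summary above reduce to a small decision at the start of execution. The sketch below is only a model of that decision, assuming the planning-time outcome is remembered as "metadata was computed"; the struct and function names are hypothetical and are not taken from the YugabyteDB source.

```c
#include <stdbool.h>

/* Hypothetical per-statement state; the field names are illustrative. */
typedef struct UpdateOptState
{
	bool metadata_computed; /* flag was on at planning time */
	bool enabled_at_exec;   /* flag value read once when execution starts */
} UpdateOptState;

/* Decide, once per statement, whether the optimization is applied. */
bool
should_apply_update_optimization(const UpdateOptState *s)
{
	/* Flag was off at planning time: there is no metadata to act on. */
	if (!s->metadata_computed)
		return false;

	/* Metadata exists; apply it only if the flag is still on. */
	return s->enabled_at_exec;
}
```

Reading the flag once per phase, rather than on every tuple, means a flag change in the middle of a statement cannot leave some rows optimized and others not.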

Summary:
- 5d3e83e [PLAT-15199] Change TP API URLs according to latest refactoring
- a50a730 [doc][yba] YBDB compatibility (#23984)
- 0c84dbe [#24029] Update the callhome diagnostics not to send gflags details.
- b53ed3a [PLAT-15379][Fix PLAT-12510] Option to use UTC when dealing with cron exp. in backup schedule
- f0eab8f [PLAT-15278]: Fix DB Scoped XCluster replication restart
- 344bc76 Revert "[PLAT-15379][Fix PLAT-12510] Option to use UTC when dealing with cron exp. in backup schedule"
- 3628ba7 [PLAT-14459] Swagger fix
- bb93ebe [#24021] YSQL: Add --TEST_check_catalog_version_overflow
- 9ab7806 [#23927] docdb: Add gflag for minimum thread stack size
- Excluded: 8c8adc0 [#18822] YSQL: Gate update optimizations behind preview flag
- 5e86515 [#23768] YSQL: Fix table rewrite DDL before slot creation
- 123d496 [PLAT-14682] Universe task should only unlock itself and make unlock aware of the lock config
- de9d4ad [doc][yba] CIS hardened OS support (#23789)
- e131b20 [#23998] DocDB: Update usearch and other header-only third-party dependencies
- 1665662 Automatic commit by thirdparty_tool: update usearch to commit 240fe9c298100f9e37a2d7377b1595be6ba1f412.
- 3adbdae Automatic commit by thirdparty_tool: update fp16 to commit 98b0a46bce017382a6351a19577ec43a715b6835.
- 9a819f7 Automatic commit by thirdparty_tool: update hnswlib to commit 2142dc6f4dd08e64ab727a7bbd93be7f732e80b0.
- 2dc58f4 Automatic commit by thirdparty_tool: update simsimd to tag v5.1.0.
- 9a03432 [doc][ybm] Azure private link host (#24086)
- 039c9a2 [#17378] YSQL: Testing for histogram_bounds in pg_stats
- 09f7a0f [#24085] DocDB: Refactor HNSW wrappers
- 555af7d [#24000] DocDB: Shutting down shared exchange could cause TServer to hang
- 5743a03 [PLAT-15317]Alert emails are not in the correct format.
- 8642555 [PLAT-15379][Fix PLAT-12510] Option to use UTC when dealing with cron exp. in backup schedule
- 253ab07 [PLAT-15400][PLAT-15401][PLAT-13051] - Connection pooling ui issues and other ui issues
- 57576ae [#16487] YSQL: Fix flakey TestPostgresPid test
- bc8ae45 Update ports for CIS hardened (#24098)
- 6fa33e6 [#18152, #18729] Docdb: Fix test TestPgIndexSelectiveUpdate
- cc6d2d1 [docs] added and updated cves (#24046)
- Excluded: ed153dc [#24055] YSQL: fix pg_hint_plan regression with executing prepared statement

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: jason, jenkins-bot
Differential Revision: https://phorge.dev.yugabyte.com/D38322

…ehind preview flag

Summary:
**Conflict Resolution for PG15 cherrypick**
- src/postgres/src/include/executor/ybOptimizeModifyTable.h:
  - Location: declaration of function `YbComputeModifiedColumnsAndSkippableEntities`
    - My master commit removes the param `ModifyTable *plan`, and adds the param `ModifyTableState *mtstate`.
    - YB-PG 15 (commit df51270) adds the param `ResultRelInfo *`.
    - Merge resolution: Remove param `ModifyTable *plan`, add params `ModifyTableState *mtstate`, `ResultRelInfo *`. Lint appropriately.
- src/postgres/src/backend/executor/ybOptimizeModifyTable.c:
  - Location: definition of function `YbComputeModifiedColumnsAndSkippableEntities`
    - Same conflict and resolution as above.
- src/postgres/src/backend/executor/nodeModifyTable.c:
  - Location: ExecInitModifyTable
    - My master commit adds logic to compute the field `mtstate->yb_is_update_optimization_enabled`.
    - This causes an adjacency conflict with YB-PG 15 (55782d5).
    - Merge resolution: Keep YB-PG 15 side of changes, add a newline, add logic + comment to compute the field `mtstate->yb_is_update_optimization_enabled` from master commit, remove the rest of the master commit hunk.
  - Location: ExecModifyTable around `else if (relkind == RELKIND_RELATION ...`
    - My master commit changes the params to function `YBCHasWholeRowJunkAttr`.
    - This causes a conflict with YB-PG 15 because this function call no longer exists. Refer to the resolution notes in df51270.
    - Merge resolution: Keep YB-PG 15 side of changes.
  - Location: declaration of YBCHasWholeRowJunkAttr
    - Same context as above.
    - Merge resolution: Keep YB-PG 15 side of changes.
  - Location: ExecUpdate
    - My master commit has changed the params to function `YbComputeModifiedColumnsAndSkippableEntities`. Refer to the above.
    - In YB-PG 15, this function call has moved to function `YBExecUpdateAct`.
    - Merge resolution:
      - Keep YB-PG 15 side of changes.
      - In `YBExecUpdateAct`, change param `plan` to `context->mtstate` in function call to `YbComputeModifiedColumnsAndSkippableEntities`.

D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. This revision adds a PG preview flag `ysql_yb_update_optimization_infra` that keeps the feature turned OFF by default.

Additionally, this revision restricts lookups of this flag to the following locations:
- Once, during query rewriting, to determine whether to set the wholerow junk attribute.
- Once, at planning time, to determine if optimization metadata needs to be computed.
- Once, at the beginning of query execution (`ExecInitModifyTable`), to determine if we should go ahead with the optimization. This has three possibilities:
  - The feature was enabled at planning time and is disabled now (during query execution). In this case, the optimization is not performed.
  - The feature was disabled at planning time and is enabled now. In this case, there is no metadata to perform the optimization, so the optimization is skipped.
  - The feature was enabled at planning time and is enabled now. In this case, metadata is available and the optimization is performed.

To turn on the update optimization feature, do the following:
- Append `ysql_yb_update_optimization_infra` to the list of flags in `allowed_preview_flags_csv`.
- Set the tserver gflag `ysql_yb_update_optimization_infra` to true.

Jira: DB-7701
Original commit: 8c8adc0 / D37746

Test Plan: Run the following tests:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: jason, tfoucher, telgersma
Reviewed By: telgersma
Subscribers: yql, smishra, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D38388

…iew flag

Summary:
D34040 introduced a framework to skip redundant secondary index updates and foreign key checks. This revision adds a PG preview flag `ysql_yb_update_optimization_infra` that keeps the feature turned OFF by default.

Additionally, this revision restricts lookups of this flag to the following locations:
- Once, during query rewriting, to determine whether to set the wholerow junk attribute.
- Once, at planning time, to determine if optimization metadata needs to be computed.
- Once, at the beginning of query execution (`ExecInitModifyTable`), to determine if we should go ahead with the optimization. This has three possibilities:
  - The feature was enabled at planning time and is disabled now (during query execution). In this case, the optimization is not performed.
  - The feature was disabled at planning time and is enabled now. In this case, there is no metadata to perform the optimization, so the optimization is skipped.
  - The feature was enabled at planning time and is enabled now. In this case, metadata is available and the optimization is performed.

To turn on the update optimization feature, do the following:
- Append `ysql_yb_update_optimization_infra` to the list of flags in `allowed_preview_flags_csv`.
- Set the tserver gflag `ysql_yb_update_optimization_infra` to true.

Jira: DB-7701
Original commit: 8c8adc0 / D37746

Test Plan: Run the following tests:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: amartsinchyk
Reviewed By: amartsinchyk
Subscribers: yql, smishra, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D38289

Summary:
D34040 introduced an optimization to skip redundant index updates and foreign key checks. D37746 gated this feature behind a preview flag (feature is OFF by default) named `yb_update_optimization_infra`.

This revision:
- Promotes the preview flag to an auto flag of the same name. The auto flag is tagged as kLocalVolatile (has no impact on on-disk data).
- Introduces an additional gflag `ysql_yb_skip_redundant_update_ops` to allow the feature to be enabled/disabled.

The feature will be enabled by default when:
- (brownfield) The cluster is upgraded to a version that has the above autoflag promoted.
- (greenfield) The cluster is of a version that has this change.

The feature can be disabled:
- Clusterwide, by setting the gflag `ysql_yb_skip_redundant_update_ops` to false.
- On a per-session basis, by setting the postgres GUC `yb_skip_redundant_update_ops` to false.

Jira: DB-7701

Test Plan: Run Jenkins

Reviewers: smishra, amartsinchyk
Reviewed By: smishra
Subscribers: ybase, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D37749
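One way to read the interaction between the auto flag, the gflag, and the session GUC described above is sketched below. This is an interpretation, not the shipped logic: it assumes the feature is active only when the auto flag is promoted and the effective GUC value is true, and that a session-level setting takes precedence over the clusterwide gflag in either direction. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the settings named in the summary above. */
typedef struct SkipUpdateSettings
{
	bool autoflag_promoted;      /* auto flag yb_update_optimization_infra */
	bool gflag_value;            /* ysql_yb_skip_redundant_update_ops */
	bool session_override_set;   /* session set yb_skip_redundant_update_ops */
	bool session_override_value; /* value chosen by the session, if set */
} SkipUpdateSettings;

/* Assumed rule: auto flag AND effective GUC value (session beats gflag). */
bool
skip_redundant_update_ops_enabled(const SkipUpdateSettings *s)
{
	bool guc = s->session_override_set ? s->session_override_value
									   : s->gflag_value;

	return s->autoflag_promoted && guc;
}
```

Under this reading, a cluster that is mid-upgrade (auto flag not yet promoted) never applies the optimization, regardless of the gflag or any session setting.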

Summary:
- 79a00fd [PLAT-15307]fix sensitive info leaks via Gflags
- cd26c93 [DOC-487] Voyager 1.8.2 changes (#24177)
- fa91de7 [docs] Apache Hudi integration with YSQL (#23888)
- 586d337 Updating DynamoDB comparison (#24216)
- aad5695 [#18822] YSQL: Promote autoflag to skip redundant update operations
- fa38152 Fix UBI image: Add -y option to install of hostname
- 6baf188 [#23998] Update third-party dependencies and enable SimSIMD in Usearch
- d57db29 Automatic commit by thirdparty_tool: update usearch to commit 191d9bb46fe5e2a44d1505ce7563ed51c7e55868.
- aab1a8b Automatic commit by thirdparty_tool: update simsimd to tag v5.4.3-yb-1.
- 161c0c8 [PLAT-15279] Adding unix timestamp to the core dump
- 17c45ff [#24217] YSQL: fill definition of a shell type requires catalog version increment
- 037fac0 [DB-13062] yugabyted: added banner and get started component
- 2eedabd [doc] Read replica connection load balancing support in JDBC Smart driver (#24006)
- 62a6a32 [#21467, #24153] Docdb: Add Read sequences as of time - sequences support for clone part 2
- 12de78e [PLAT-14954] added support for systemd-timesyncd
- 4a07eb8 [#23988] YSQL: Skip a table for schema comparison if its schema does not change
- d3fd39f [doc][ybm] Add reasoning behind no access to yugabyte user #21105 (#23930)
- 556ba8a [PLAT-15074] Install node agents on nodes for the marked universes for on-prem providers
- 9beb6dc [#22710][#22707] yugabyted: Update voyager migrations list landing page. (#22834)
- 6128137 [PLAT-15545] Simplify the frozen universe message for end user in YBA
- 4e36b78 JDBC Driver version update to 42.3.5-yb-8 (#24241)
- 254c979 [PLAT-15519]: Update xCluster sync to remove tables from YBA DB

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: tfoucher, fizaa, telgersma
Differential Revision: https://phorge.dev.yugabyte.com/D38624

…date operations

Summary:
**Note for 2024.2: The optimization to skip redundant index updates and foreign key checks is turned OFF by default in 2024.2.** This has been accomplished by setting the gflag `ysql_yb_skip_redundant_update_ops` to false. To enable this optimization, set the gflag to true.

**Original Summary**
D34040 introduced an optimization to skip redundant index updates and foreign key checks. D37746 gated this feature behind a preview flag (feature is OFF by default) named `yb_update_optimization_infra`.

This revision:
- Promotes the preview flag to an auto flag of the same name. The auto flag is tagged as kLocalVolatile (has no impact on on-disk data).
- Introduces an additional gflag `ysql_yb_skip_redundant_update_ops` to allow the feature to be enabled/disabled.

The feature will be enabled by default when:
- (brownfield) The cluster is upgraded to a version that has the above autoflag promoted.
- (greenfield) The cluster is of a version that has this change.

The feature can be disabled:
- Clusterwide, by setting the gflag `ysql_yb_skip_redundant_update_ops` to false.
- On a per-session basis, by setting the postgres GUC `yb_skip_redundant_update_ops` to false.

Jira: DB-7701
Original commit: aad5695 / D37749

Test Plan: Run Jenkins

Reviewers: smishra, amartsinchyk
Reviewed By: smishra
Subscribers: yql, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D38598

…s in 2024.2 by default

Summary:
Timeline:
- D34040 introduced an optimization to skip redundant index updates and foreign key checks.
- D37746 gated this feature behind a preview flag (feature is OFF by default) named `yb_update_optimization_infra`.
- D37749 introduced a feature enablement flag `ysql_yb_skip_redundant_update_ops`, and promoted the above preview flag to an auto flag.
- D38598 backported the changes of D37749 to 2024.2, while keeping the feature enablement flag turned OFF (the flag is ON by default in master).

This revision turns on the feature by default in 2024.2 by turning ON the feature enablement flag `ysql_yb_skip_redundant_update_ops`.

Jira: DB-7701

Test Plan: Run Jenkins

Reviewers: smishra, amartsinchyk
Reviewed By: smishra
Subscribers: yql, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D38936

karthik-ramanathan-3006 added the kind/enhancement and area/ysql labels Aug 23, 2023

yugabyte-ci added the priority/medium label Aug 23, 2023

karthik-ramanathan-3006 changed the title ~~[YSQL] Extra Secondary Index Lookups on Trigger Insert + Foreign Key Check~~ [YSQL] Extra Secondary Index Writes on Trigger Insert + Foreign Key Check Aug 23, 2023

karthik-ramanathan-3006 mentioned this issue Sep 7, 2023

[YSQL] Trigger + FK Constraint discards Single Row Optimization during UPDATE TABLE #19042

Open

1 task

yugabyte-ci assigned karthik-ramanathan-3006 Oct 4, 2023

karthik-ramanathan-3006 mentioned this issue Aug 29, 2024

[YSQL] Add more regress tests for update optimizations #23733

Open

1 task

sushantrmishra closed this as completed Sep 22, 2024
