[DocDB] Clone sequences for a given database as part of Clone Database #21467

yugabyte-ci · 2024-03-14T04:55:58Z

Currently, clone doesn't support sequences in a YSQL database. That is, cloning a database that contains a sequence fails with the following error:
ERROR: Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000


yamen-haddad · 2024-09-18T01:11:31Z

The clone fails in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot so that it can be consumed by the restore operation.
However, because the snapshot repacked by clone is part of a snapshot schedule, it contains the sequences_data table entry, i.e. table 0000ffff0000300080000000000004eb. The repacking step doesn't expect the sequences_data table to be inside the snapshot schedule and thus throws the error above.
Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the sequences_data table to get the correct value at restore time. Therefore, we should ignore the sequences_data table and its related entries in the SnapshotInfo while repacking.
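To make the proposed filtering concrete, here is a minimal sketch in Python. It is not the actual DocDB code: `SnapshotEntries` and `drop_sequences_data` are hypothetical names, and the real SnapshotInfo is a protobuf with far more structure. Only the table ID is taken from the error message above.

```python
from dataclasses import dataclass, field

# Table ID of sequences_data as it appears in the error message above.
SEQUENCES_DATA_TABLE_ID = "0000ffff0000300080000000000004eb"


@dataclass
class SnapshotEntries:
    """Hypothetical, simplified stand-in for the entries of a SnapshotInfo."""
    tables: dict = field(default_factory=dict)   # table_id -> table name
    tablets: dict = field(default_factory=dict)  # tablet_id -> owning table_id


def drop_sequences_data(entries: SnapshotEntries) -> SnapshotEntries:
    """Return a copy without the sequences_data table and its tablets."""
    tables = {
        table_id: name
        for table_id, name in entries.tables.items()
        if table_id != SEQUENCES_DATA_TABLE_ID
    }
    tablets = {
        tablet_id: table_id
        for tablet_id, table_id in entries.tablets.items()
        if table_id != SEQUENCES_DATA_TABLE_ID
    }
    return SnapshotEntries(tables=tables, tablets=tablets)
```

The repacking step would then only see user tables and their tablets, so it never has to look up the sequences_data table.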

Summary: Currently, clone doesn't support sequences in a YSQL database. That is, cloning a database that has a sequence fails with the following error:
```
ERROR: Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000
```
The clone was failing in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot to be consumed in the restore operation. However, as the snapshot repacked by clone is part of a snapshot schedule, it contains the `sequences_data` table entry, i.e. table `0000ffff0000300080000000000004eb`. The repacking step doesn't expect the `sequences_data` table to be inside the snapshot and thus throws the error above.

Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the `sequences_data` table to get the correct value at restore time.

The diff fixes the issue by skipping the repacking of the `sequences_data` table and related tablets from the SnapshotInfo. The assumption is that we never need to repack `sequences_data`. In reality, the `sequences_data` table only exists in snapshots that are part of a snapshot schedule.

Jira: DB-10350, DB-10673, DB-12578

Test Plan:
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/0
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/1

Also enabled the following PITR tests for clone, and all of them passed after this diff:
PgsqlSequenceUndoDeletedData
PgsqlSequenceUndoInsertedData
PgsqlSequenceUndoCreateSequence

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDeletedData/3
3 and 4 are the parameters for clone (non-colocation and colocation respectively).

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: slingam, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D38186

yamen-haddad · 2024-09-24T03:54:33Z

After the first part of supporting sequences in clone, users should be able to clone a database that has sequences in it.

The only limitation we have today is that users cannot clone to a point in time before a drop sequence operation was performed. For example, if the user drops a table that has a serial column, the user cannot clone to a time before the drop table.

The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence. This is because the sequence is dropped and we cannot perform a read as of time on the system_postgres.sequences_data table.

We are discussing two paths forward:

- Unblock the clone by setting last_value to 1 or to the midpoint of the sequence's range. This unblocks users who are using the clone to read deleted data. The downside is that they might get errors when trying to insert data using the cloned sequence, as the generated next values might already have been used. The errors will disappear once last_value reaches a value that has not been used.
- Add support for reading as of time from the system_postgres.sequences_data table.
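The downside of the first option can be illustrated with a small simulation. This is only a sketch in Python under simplifying assumptions: the sequence is modeled as a plain counter with increment 1, and `simulate_inserts` is a hypothetical helper, not part of YugabyteDB.

```python
from itertools import count


def simulate_inserts(existing_keys, restart_value, attempts):
    """Model inserts that take their key from a sequence reset to restart_value.

    Each attempt consumes one sequence value. It fails with a duplicate key
    if the value is already present in the cloned table, and succeeds otherwise.
    """
    keys = set(existing_keys)
    outcomes = []
    sequence = count(restart_value)  # assumed increment of 1
    for _ in range(attempts):
        candidate = next(sequence)
        if candidate in keys:
            outcomes.append((candidate, "duplicate key"))
        else:
            keys.add(candidate)
            outcomes.append((candidate, "inserted"))
    return outcomes
```

With keys 1 to 3 already present in the clone and the sequence reset to 1, the first three attempts collide and later ones succeed, which matches the behaviour described in the first option.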

Summary:
8916c1d [#21467,#21783, #23667] Docdb: Add sequences support for clone part 1
b4a0b45 [PLAT-14944] Enable db scoped replication if runtime config is enabled
b062c44 Update link for crd file (#24083)
d3f0da5 [PLAT-14898] Record reason for failing to enable node agent
b135dfc update images (#24104)
846e35f [PLAT-14597] Fetch xCluster replication/DR configs with and without extra table info
1bc9e06 [#24091] xClusterDDLRepl: Pass automatic_ddl_mode to pollers
afc424d [#23824] YSQL: Fix crash for when a RowComparisonExpression is used on a reordered primary key index
76a5c97 [#23943] YSQL: Fix Bitmap Scan GCC11 crash (follow-up)
fd23b15 [#24040] YSQL: Simplify the PatchStatus function
47a4723 [PLAT-15441] Pin golang package version in build.sh to prevent incompatible versions to be installed
1edfa4c [PLAT-12224][PLAT-15238] Add metric for connection pooling

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: jason, tfoucher
Subscribers: telgersma
Differential Revision: https://phorge.dev.yugabyte.com/D38365

… for clone part 1

Summary: Original commit: 8916c1d / D38186

Currently, clone doesn't support sequences in a YSQL database. That is, cloning a database that has a sequence fails with the following error:
```
ERROR: Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000
```
The clone was failing in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot to be consumed in the restore operation. However, as the snapshot repacked by clone is part of a snapshot schedule, it contains the `sequences_data` table entry, i.e. table `0000ffff0000300080000000000004eb`. The repacking step doesn't expect the `sequences_data` table to be inside the snapshot and thus throws the error above.

Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the `sequences_data` table to get the correct value at restore time.

The diff fixes the issue by skipping the repacking of the `sequences_data` table and related tablets from the SnapshotInfo. The assumption is that we never need to repack `sequences_data`. In reality, the `sequences_data` table only exists in snapshots that are part of a snapshot schedule.

Jira: DB-10350, DB-10673, DB-12578

Test Plan:
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/0
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/1

Also enabled the following PITR tests for clone, and all of them passed after this diff:
PgsqlSequenceUndoDeletedData
PgsqlSequenceUndoInsertedData
PgsqlSequenceUndoCreateSequence

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDeletedData/3
3 and 4 are the parameters for clone (non-colocation and colocation respectively).

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: ybase, slingam
Differential Revision: https://phorge.dev.yugabyte.com/D38375

…port for clone part 2

Summary: Currently, if we try to clone to a time before a drop sequence happened on the original table, clone fails with the following error:
```
ERROR: Unable to find relation for sequence 16384
```
The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence. This is because the sequence is dropped and we cannot perform a read as of time on the `system_postgres.sequences_data` table.

This diff adds the ability to **read** the sequences_data table as of a point in time in the past using the `yb_read_time` GUC variable. Previously, yb_read_time didn't cover reading sequences in the past, as sequence operations use an independent YBSession and different RPCs than the `Perform` RPC used in most other read/write operations. This diff extends the yb_read_time GUC to cover the `ReadSequenceTuple` RPC.

An example of usage:
```
CREATE SEQUENCE seq_1;
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
db1=# SELECT (EXTRACT (EPOCH FROM CURRENT_TIMESTAMP)*1000000)::decimal(38,0);
     numeric
------------------
 1727315135780060
db1=# SELECT nextval('seq_1');
 nextval
---------
       1
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
        100 |       0 | t
db1=# SET yb_read_time TO 1727315135780060;
SET
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
```

**Upgrade/Rollback safety:**
Only adding a new optional field `read_time` to `PgReadSequenceTupleRequestPB`.
- If an old message is sent to a service with the new format, the read_time field is set to 0 (the default) and the read request will be executed as of the latest time (the default behaviour before this diff).
- If a new message is sent to a service with an older message format, the read_time field will be ignored and the read happens as of the current time.

Knowing that this message only affects pure read requests, no undesired inconsistency is introduced.

Jira: DB-10350, DB-13040

Test Plan:
./yb_build.sh --cxx-test integration-tests_sequence_utility-itest --gtest_filter SequencesUtilTest.ReadSequencesAsOfTime

Also enabled the following PITR tests for clone, and all of them passed after this diff:
PgsqlSequenceUndoDropSequence
PgsqlSequenceVerifyPartialRestore
PgsqlSequencePartialCleanupAfterRestore

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDropSequence/3
3 and 4 are the parameters for clone (non-colocation and colocation respectively).

Reviewers: hsunder, asrivastava, mlillibridge
Reviewed By: asrivastava
Subscribers: yql, slingam, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D38392
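The read_time semantics described in this commit message (0 means "read the latest value", any other value means "read as of that time") can be sketched with a small in-memory model. This is an illustrative Python sketch only: `SequenceHistory` is a hypothetical class, timestamps are assumed to be microseconds since the epoch as in the psql example, and the real implementation reads versioned rows from DocDB rather than from a list.

```python
import bisect


class SequenceHistory:
    """Hypothetical in-memory model of one versioned sequences_data row."""

    def __init__(self):
        self._times = []  # write times in microseconds, ascending
        self._rows = []   # (last_value, is_called) written at each time

    def write(self, time_us, last_value, is_called):
        """Record a new version; writes must arrive in ascending time order."""
        if self._times and time_us < self._times[-1]:
            raise ValueError("writes must be in ascending time order")
        self._times.append(time_us)
        self._rows.append((last_value, is_called))

    def read(self, read_time_us=0):
        """Return the row visible at read_time_us; 0 means the latest row."""
        if not self._times:
            return None
        if read_time_us == 0:
            return self._rows[-1]
        # Index of the last version written at or before read_time_us.
        idx = bisect.bisect_right(self._times, read_time_us)
        return self._rows[idx - 1] if idx > 0 else None
```

In this model, a request that omits read_time behaves exactly like a request from before the field existed, which is the property the upgrade/rollback note relies on.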

yamen-haddad · 2024-10-01T22:13:44Z

Closing this, as the two revisions landed in this thread should cover all the sequence cases.
@Arjun-yb, please reopen if any issue is observed.

yamen-haddad · 2024-10-01T22:19:02Z

Reopening as I forgot to backport to 2024.2

Summary:
79a00fd [PLAT-15307]fix sensitive info leaks via Gflags
cd26c93 [DOC-487] Voyager 1.8.2 changes (#24177)
fa91de7 [docs] Apache Hudi integration with YSQL (#23888)
586d337 Updating DynamoDB comparison (#24216)
aad5695 [#18822] YSQL: Promote autoflag to skip redundant update operations
fa38152 Fix UBI image: Add -y option to install of hostname
6baf188 [#23998] Update third-party dependencies and enable SimSIMD in Usearch
d57db29 Automatic commit by thirdparty_tool: update usearch to commit 191d9bb46fe5e2a44d1505ce7563ed51c7e55868.
aab1a8b Automatic commit by thirdparty_tool: update simsimd to tag v5.4.3-yb-1.
161c0c8 [PLAT-15279] Adding unix timestamp to the core dump
17c45ff [#24217] YSQL: fill definition of a shell type requires catalog version increment
037fac0 [DB-13062] yugabyted: added banner and get started component
2eedabd [doc] Read replica connection load balancing support in JDBC Smart driver (#24006)
62a6a32 [#21467, #24153] Docdb: Add Read sequences as of time - sequences support for clone part 2
12de78e [PLAT-14954] added support for systemd-timesyncd
4a07eb8 [#23988] YSQL: Skip a table for schema comparison if its schema does not change
d3fd39f [doc][ybm] Add reasoning behind no access to yugabyte user #21105 (#23930)
556ba8a [PLAT-15074] Install node agents on nodes for the marked universes for on-prem providers
9beb6dc [#22710][#22707] yugabyted: Update voyager migrations list landing page. (#22834)
6128137 [PLAT-15545] Simplify the frozen universe message for end user in YBA
4e36b78 JDBC Driver version update to 42.3.5-yb-8 (#24241)
254c979 [PLAT-15519]: Update xCluster sync to remove tables from YBA DB

Test Plan: Jenkins: rebase: pg15-cherrypicks
Reviewers: tfoucher, fizaa, telgersma
Differential Revision: https://phorge.dev.yugabyte.com/D38624

…e - sequences support for clone part 2

Summary: Original commit: 62a6a32 / D38392

Currently, if we try to clone to a time before a drop sequence happened on the original table, clone fails with the following error:
```
ERROR: Unable to find relation for sequence 16384
```
The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence. This is because the sequence is dropped and we cannot perform a read as of time on the `system_postgres.sequences_data` table.

This diff adds the ability to **read** the sequences_data table as of a point in time in the past using the `yb_read_time` GUC variable. Previously, yb_read_time didn't cover reading sequences in the past, as sequence operations use an independent YBSession and different RPCs than the `Perform` RPC used in most other read/write operations. This diff extends the yb_read_time GUC to cover the `ReadSequenceTuple` RPC.

An example of usage:
```
CREATE SEQUENCE seq_1;
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
db1=# SELECT (EXTRACT (EPOCH FROM CURRENT_TIMESTAMP)*1000000)::decimal(38,0);
     numeric
------------------
 1727315135780060
db1=# SELECT nextval('seq_1');
 nextval
---------
       1
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
        100 |       0 | t
db1=# SET yb_read_time TO 1727315135780060;
SET
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
```

**Upgrade/Rollback safety:**
Only adding a new optional field `read_time` to `PgReadSequenceTupleRequestPB`.
- If an old message is sent to a service with the new format, the read_time field is set to 0 (the default) and the read request will be executed as of the latest time (the default behaviour before this diff).
- If a new message is sent to a service with an older message format, the read_time field will be ignored and the read happens as of the current time.

Knowing that this message only affects pure read requests, no undesired inconsistency is introduced.

Jira: DB-10350, DB-13040

Test Plan:
./yb_build.sh --cxx-test integration-tests_sequence_utility-itest --gtest_filter SequencesUtilTest.ReadSequencesAsOfTime

Also enabled the following PITR tests for clone, and all of them passed after this diff:
PgsqlSequenceUndoDropSequence
PgsqlSequenceVerifyPartialRestore
PgsqlSequencePartialCleanupAfterRestore

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDropSequence/3
3 and 4 are the parameters for clone (non-colocation and colocation respectively).

Reviewers: hsunder, asrivastava, mlillibridge
Reviewed By: asrivastava
Subscribers: ybase, slingam, yql
Differential Revision: https://phorge.dev.yugabyte.com/D38617

yugabyte-ci added the area/docdb (YugabyteDB core features), jira-originated, kind/enhancement (This is an enhancement of an existing feature), and priority/low (Low priority) labels · Mar 14, 2024

yugabyte-ci assigned yamen-haddad · Mar 14, 2024

yugabyte-ci added the priority/medium (Medium priority issue) label and removed the priority/low (Low priority) label · May 16, 2024

yamen-haddad added the priority/high (High Priority) label and removed the priority/medium (Medium priority issue) label · Sep 10, 2024

yugabyte-ci added the 2024.2_blocker label Sep 11, 2024

yamen-haddad closed this as completed Oct 1, 2024

yamen-haddad reopened this Oct 1, 2024

yamen-haddad closed this as completed Oct 14, 2024
