[DocDB] Clone sequences for a given database as part of Clone Database #21467

Closed
yugabyte-ci opened this issue Mar 14, 2024 · 4 comments
Labels: 2024.2_blocker, area/docdb (YugabyteDB core features), jira-originated, kind/enhancement (This is an enhancement of an existing feature), priority/high (High Priority)

Comments

@yugabyte-ci
Contributor

yugabyte-ci commented Mar 14, 2024

Jira Link: DB-10350

Currently, clone doesn't support sequences in a YSQL database. That is, if we try to clone a database that has a sequence, cloning fails with the following error:
ERROR: Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000

@yugabyte-ci yugabyte-ci added area/docdb YugabyteDB core features jira-originated kind/enhancement This is an enhancement of an existing feature priority/low Low priority labels Mar 14, 2024
@yugabyte-ci yugabyte-ci added priority/medium Medium priority issue and removed priority/low Low priority labels May 16, 2024
@yamen-haddad yamen-haddad added priority/high High Priority and removed priority/medium Medium priority issue labels Sep 10, 2024
@yamen-haddad
Member

The clone is failing in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot so that it can be consumed by the restore operation.
However, because the snapshot repacked by clone is part of a snapshot schedule, it contains the sequences_data table entry, i.e. table 0000ffff0000300080000000000004eb. The repacking step doesn't expect the sequences_data table to be inside the snapshot and thus throws the error above.
Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the sequences_data table to get the correct value at restore time. Therefore, we should skip the sequences_data table and its related entries in the SnapshotInfo while repacking.
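
The skip described above can be sketched roughly as follows. This is a simplified Python model, not the actual repacking code: the `Entry` class, its fields, and `entries_to_repack` are hypothetical names made up for illustration, and the table ID is simply the one quoted in the error message.

```python
from dataclasses import dataclass

# Table ID quoted in the error message in this thread; treated here as the
# sequences_data entry, as described in the comment above.
SEQUENCES_DATA_TABLE_ID = "0000ffff0000300080000000000004eb"


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one entry of a SnapshotInfo."""
    kind: str       # "TABLE" or "TABLET"
    entry_id: str   # ID of the table or tablet
    table_id: str   # owning table; equals entry_id for TABLE entries


def entries_to_repack(entries):
    """Return the entries to repack, dropping sequences_data and its tablets."""
    return [e for e in entries if e.table_id != SEQUENCES_DATA_TABLE_ID]


snapshot = [
    Entry("TABLE", "user_table_1", "user_table_1"),
    Entry("TABLET", "tablet_a", "user_table_1"),
    Entry("TABLE", SEQUENCES_DATA_TABLE_ID, SEQUENCES_DATA_TABLE_ID),
    Entry("TABLET", "tablet_seq", SEQUENCES_DATA_TABLE_ID),
]
kept = entries_to_repack(snapshot)
print([e.entry_id for e in kept])
```

Only the user table and its tablet remain in `kept`; the sequence values themselves would then come from the Backup/Restore flow rather than from the snapshot.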

yamen-haddad added a commit that referenced this issue Sep 23, 2024
Summary:
Currently, clone doesn't support sequences in a YSQL database. That is, if we try to clone a database that has a sequence, cloning fails with the following error:
```
ERROR:  Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000
```
The clone was failing in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot so that it can be consumed by the restore operation.
However, because the snapshot repacked by clone is part of a snapshot schedule, it contains the `sequences_data` table entry, i.e. table `0000ffff0000300080000000000004eb`. The repacking step doesn't expect the `sequences_data` table to be inside the snapshot and thus throws the error above.
Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the `sequences_data` table to get the correct value at restore time. This diff fixes the issue by skipping the repacking of the `sequences_data` table and its related tablets in the SnapshotInfo. The assumption is that we never need to repack `sequences_data`. In practice, the `sequences_data` table only exists in snapshots that are part of a snapshot schedule.
Jira: DB-10350, DB-10673, DB-12578

Test Plan:
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/0
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/1

Also enabled the following PITR tests for clone; all of them pass after this diff:
PgsqlSequenceUndoDeletedData
PgsqlSequenceUndoInsertedData
PgsqlSequenceUndoCreateSequence

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDeletedData/3
The /3 and /4 parameters are the clone variants (non-colocation and colocation, respectively).

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: slingam, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D38186
@yamen-haddad
Member

After the first part of supporting sequences in clone, users should be able to clone a database that has sequences in it.

The only limitation we have today is that users cannot clone to a point in time before a sequence was dropped. For example, if the user drops a table that has a serial column, the user cannot clone to a time before the DROP TABLE.

The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence: once the sequence is dropped, we cannot perform a read as of time on the system_postgres.sequences_data table.

We are discussing two paths forward:

  • Unblock the clone by setting last_value to 1 or to the midpoint of the sequence's range. This unblocks users who are using clone to read deleted data. The downside is that they might get errors when inserting data using the cloned sequence, because the generated next values might already have been used. The errors disappear once last_value reaches a value that has not been used.
  • Add support for reading as of time from the system_postgres.sequences_data table.
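
To make the downside of the first option concrete, here is a small Python simulation. It is a deliberately simplified model (no sequence caching, a unique-key column fed directly by the sequence), and `simulate_inserts` is a hypothetical helper written for this illustration, not anything in YugabyteDB.

```python
def simulate_inserts(used_ids, start, attempts):
    """Model inserts that draw keys from a sequence reset to `start`.

    used_ids: key values already present in the cloned table.
    Returns a list of (value, outcome) pairs, one per attempted insert.
    """
    used = set(used_ids)
    next_value = start
    outcomes = []
    for _ in range(attempts):
        if next_value in used:
            outcomes.append((next_value, "duplicate key"))
        else:
            used.add(next_value)
            outcomes.append((next_value, "ok"))
        # A sequence value is consumed even when the insert fails,
        # because sequence increments are not rolled back.
        next_value += 1
    return outcomes


# The cloned table already holds ids 1..3, but the sequence restarts at 1.
results = simulate_inserts(used_ids=[1, 2, 3], start=1, attempts=5)
for value, outcome in results:
    print(value, outcome)
```

In this model the first three inserts fail and the fourth succeeds, which matches the description above: errors stop once the sequence passes the values that were already used.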

timothy-e pushed a commit that referenced this issue Sep 24, 2024
Summary:
 8916c1d [#21467,#21783, #23667] Docdb: Add sequences support for clone part 1
 b4a0b45 [PLAT-14944] Enable db scoped replication if runtime config is enabled
 b062c44 Update link for crd file (#24083)
 d3f0da5 [PLAT-14898] Record reason for failing to enable node agent
 b135dfc update images (#24104)
 846e35f [PLAT-14597] Fetch xCluster replication/DR configs with and without extra table info
 1bc9e06 [#24091] xClusterDDLRepl: Pass automatic_ddl_mode to pollers
 afc424d [#23824] YSQL: Fix crash for when a RowComparisonExpression is used on a reordered primary key index
 76a5c97 [#23943] YSQL: Fix Bitmap Scan GCC11 crash (follow-up)
 fd23b15 [#24040] YSQL: Simplify the PatchStatus function
 47a4723 [PLAT-15441] Pin golang package version in build.sh to prevent incompatible versions to be installed
 1edfa4c [PLAT-12224][PLAT-15238] Add metric for connection pooling

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: jason, tfoucher

Subscribers: telgersma

Differential Revision: https://phorge.dev.yugabyte.com/D38365
yamen-haddad added a commit that referenced this issue Sep 25, 2024
… for clone part 1

Summary:
Original commit: 8916c1d / D38186
Currently, clone doesn't support sequences in a YSQL database. That is, if we try to clone a database that has a sequence, cloning fails with the following error:
```
ERROR:  Clone operation aborted: Not found (yb/tablet/tablet_metadata.cc:781): Failed to list snapshot: c0d01ed0-0344-49cb-9e69-386b8e75b558: Table <unknown_table_name> (0000ffff0000300080000000000004eb) not found in Raft group 00000000000000000000000000000000
```
The clone was failing in the "Generate SnapshotInfo as of time" step. More specifically, clone repacks the SnapshotInfo of the suitable snapshot so that it can be consumed by the restore operation.
However, because the snapshot repacked by clone is part of a snapshot schedule, it contains the `sequences_data` table entry, i.e. table `0000ffff0000300080000000000004eb`. The repacking step doesn't expect the `sequences_data` table to be inside the snapshot and thus throws the error above.
Cloning sequences is one of the situations where clone diverges from PITR. In clone, we use the Backup/Restore flow to restore sequences, while in PITR we use the snapshot of the `sequences_data` table to get the correct value at restore time. This diff fixes the issue by skipping the repacking of the `sequences_data` table and its related tablets in the SnapshotInfo. The assumption is that we never need to repack `sequences_data`. In practice, the `sequences_data` table only exists in snapshots that are part of a snapshot schedule.
Jira: DB-10350, DB-10673, DB-12578

Test Plan:
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/0
./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestWithColocatedDBParam.CloneWithSequences/1

Also enabled the following PITR tests for clone; all of them pass after this diff:
PgsqlSequenceUndoDeletedData
PgsqlSequenceUndoInsertedData
PgsqlSequenceUndoCreateSequence

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDeletedData/3
The /3 and /4 parameters are the clone variants (non-colocation and colocation, respectively).

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, slingam

Differential Revision: https://phorge.dev.yugabyte.com/D38375
yamen-haddad added a commit that referenced this issue Oct 1, 2024
…port for clone part 2

Summary:
Currently, if we try to clone to a time before a drop sequence happened on the original table, clone fails with the following error:
```
ERROR:  Unable to find relation for sequence 16384
```

The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence: once the sequence is dropped, we cannot perform a read as of time on the `system_postgres.sequences_data` table.

This diff adds the ability to **read** the `sequences_data` table as of a point in time in the past using the `yb_read_time` GUC variable. Previously, `yb_read_time` did not cover reading sequences in the past, because sequence operations use an independent YBSession and different RPCs from the `Perform` RPC used by most other read/write operations. This diff extends the `yb_read_time` GUC to cover the `ReadSequenceTuple` RPC. An example of usage:

```
db1=# CREATE SEQUENCE seq_1;
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f

db1=# SELECT (EXTRACT (EPOCH FROM CURRENT_TIMESTAMP)*1000000)::decimal(38,0);
     numeric
------------------
 1727315135780060

db1=# SELECT nextval('seq_1');
 nextval
---------
       1

db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
        100 |       0 | t

db1=# SET yb_read_time TO 1727315135780060;
SET
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
```
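
The value given to `yb_read_time` in the session above is a Unix timestamp in microseconds, produced there by the `EXTRACT (EPOCH ...)` query. A minimal Python sketch of the same conversion done on the client side; the helper name `to_yb_read_time` is hypothetical and not part of YugabyteDB:

```python
from datetime import datetime, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_yb_read_time(ts: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the Unix epoch."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    delta = ts - _EPOCH
    # Integer arithmetic avoids the rounding that float seconds would introduce.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


# The timestamp captured in the session above, expressed in UTC.
captured = datetime(2024, 9, 26, 1, 45, 35, 780060, tzinfo=timezone.utc)
read_time = to_yb_read_time(captured)
print(f"SET yb_read_time TO {read_time};")
```

The printed statement carries the same microsecond value that the session passes to `SET yb_read_time`.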
**Upgrade/Rollback safety:**
This diff only adds a new optional field, `read_time`, to `PgReadSequenceTupleRequestPB`.
- If an old-format message is sent to a service that uses the new format, `read_time` takes its default value of 0 and the read request is executed as of the latest time (the default behaviour before this diff).
- If a new-format message is sent to a service that uses the older format, the `read_time` field is ignored and the read happens as of the current time.
Because this message only affects pure read requests, no undesired inconsistency is introduced.
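
The two compatibility cases can be modelled with a short Python sketch. The class and function names below are hypothetical stand-ins, not the real protobuf or service code; the only behaviour taken from the description above is that an unset `read_time` is 0 and that an older service ignores the field.

```python
from dataclasses import dataclass


@dataclass
class ReadSequenceTupleRequest:
    """Hypothetical stand-in for PgReadSequenceTupleRequestPB."""
    seq_oid: int
    read_time: int = 0  # optional field; 0 means "not set"


def new_service_read_point(request, now_micros):
    """New service: honour read_time when set, otherwise read at the latest time."""
    return request.read_time if request.read_time != 0 else now_micros


def old_service_read_point(request, now_micros):
    """Old service: does not know about read_time, so always reads at the latest time."""
    return now_micros


now = 1727315200000000
old_request = ReadSequenceTupleRequest(seq_oid=16384)
new_request = ReadSequenceTupleRequest(seq_oid=16384, read_time=1727315135780060)

print(new_service_read_point(old_request, now))  # falls back to `now`
print(new_service_read_point(new_request, now))  # reads at the requested time
print(old_service_read_point(new_request, now))  # field ignored, reads at `now`
```

In both mixed-version cases the read falls back to the current time, which is the behaviour that existed before the field was added.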

Jira: DB-10350, DB-13040

Test Plan:
./yb_build.sh --cxx-test integration-tests_sequence_utility-itest --gtest_filter SequencesUtilTest.ReadSequencesAsOfTime

Also enabled the following PITR tests for clone; all of them pass after this diff:
PgsqlSequenceUndoDropSequence
PgsqlSequenceVerifyPartialRestore
PgsqlSequencePartialCleanupAfterRestore

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDropSequence/3

The /3 and /4 parameters are the clone variants (non-colocation and colocation, respectively).

Reviewers: hsunder, asrivastava, mlillibridge

Reviewed By: asrivastava

Subscribers: yql, slingam, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D38392
@yamen-haddad
Member

Closing this, as the two revisions that landed in this thread should cover all sequence cases.
@Arjun-yb, please reopen if any issue is observed.

@yamen-haddad
Member

Reopening, as I forgot to backport to 2024.2.

@yamen-haddad yamen-haddad reopened this Oct 1, 2024
timothy-e pushed a commit that referenced this issue Oct 2, 2024
Summary:
 79a00fd [PLAT-15307]fix sensitive info leaks via Gflags
 cd26c93 [DOC-487] Voyager 1.8.2 changes (#24177)
 fa91de7 [docs] Apache Hudi integration with YSQL  (#23888)
 586d337 Updating DynamoDB comparison (#24216)
 aad5695 [#18822] YSQL: Promote autoflag to skip redundant update operations
 fa38152 Fix UBI image: Add -y option to install of hostname
 6baf188 [#23998] Update third-party dependencies and enable SimSIMD in Usearch
 d57db29 Automatic commit by thirdparty_tool: update usearch to commit 191d9bb46fe5e2a44d1505ce7563ed51c7e55868.
 aab1a8b Automatic commit by thirdparty_tool: update simsimd to tag v5.4.3-yb-1.
 161c0c8 [PLAT-15279] Adding unix timestamp to the core dump
 17c45ff [#24217] YSQL: fill definition of a shell type requires catalog version increment
 037fac0 [DB-13062] yugabyted: added banner and get started component
 2eedabd [doc] Read replica connection load balancing support in JDBC Smart driver (#24006)
 62a6a32 [#21467, #24153] Docdb: Add Read sequences as of time - sequences support for clone part 2
 12de78e [PLAT-14954] added support for systemd-timesyncd
 4a07eb8 [#23988] YSQL: Skip a table for schema comparison if its schema does not change
 d3fd39f [doc][ybm] Add reasoning behind no access to yugabyte user #21105 (#23930)
 556ba8a [PLAT-15074] Install node agents on nodes for the marked universes for on-prem providers
 9beb6dc [#22710][#22707] yugabyted: Update voyager migrations list landing page. (#22834)
 6128137 [PLAT-15545] Simplify the frozen universe message for end user in YBA
 4e36b78 JDBC Driver version update to 42.3.5-yb-8 (#24241)
 254c979 [PLAT-15519]: Update xCluster sync to remove tables from YBA DB

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: tfoucher, fizaa, telgersma

Differential Revision: https://phorge.dev.yugabyte.com/D38624
yamen-haddad added a commit that referenced this issue Oct 14, 2024
…e - sequences support for clone part 2

Summary:
Original commit: 62a6a32 / D38392
Currently, if we try to clone to a time before a drop sequence happened on the original table, clone fails with the following error:
```
ERROR:  Unable to find relation for sequence 16384
```

The reason for the failure is that ysql_dump is unable to get the last_value of the dropped sequence: once the sequence is dropped, we cannot perform a read as of time on the `system_postgres.sequences_data` table.

This diff adds the ability to **read** the `sequences_data` table as of a point in time in the past using the `yb_read_time` GUC variable. Previously, `yb_read_time` did not cover reading sequences in the past, because sequence operations use an independent YBSession and different RPCs from the `Perform` RPC used by most other read/write operations. This diff extends the `yb_read_time` GUC to cover the `ReadSequenceTuple` RPC. An example of usage:

```
db1=# CREATE SEQUENCE seq_1;
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f

db1=# SELECT (EXTRACT (EPOCH FROM CURRENT_TIMESTAMP)*1000000)::decimal(38,0);
     numeric
------------------
 1727315135780060

db1=# SELECT nextval('seq_1');
 nextval
---------
       1

db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
        100 |       0 | t

db1=# SET yb_read_time TO 1727315135780060;
SET
db1=# SELECT * FROM seq_1;
 last_value | log_cnt | is_called
------------+---------+-----------
          1 |       0 | f
```
**Upgrade/Rollback safety:**
This diff only adds a new optional field, `read_time`, to `PgReadSequenceTupleRequestPB`.
- If an old-format message is sent to a service that uses the new format, `read_time` takes its default value of 0 and the read request is executed as of the latest time (the default behaviour before this diff).
- If a new-format message is sent to a service that uses the older format, the `read_time` field is ignored and the read happens as of the current time.
Because this message only affects pure read requests, no undesired inconsistency is introduced.

Jira: DB-10350, DB-13040

Test Plan:
./yb_build.sh --cxx-test integration-tests_sequence_utility-itest --gtest_filter SequencesUtilTest.ReadSequencesAsOfTime

Also enabled the following PITR tests for clone; all of them pass after this diff:
PgsqlSequenceUndoDropSequence
PgsqlSequenceVerifyPartialRestore
PgsqlSequencePartialCleanupAfterRestore

Example to run one of them:
./yb_build.sh --cxx-test yb-admin-snapshot-schedule-test --gtest_filter ColocationAndRestoreType/YbAdminSnapshotScheduleTestWithYsqlColocationRestoreParam.PgsqlSequenceUndoDropSequence/3

The /3 and /4 parameters are the clone variants (non-colocation and colocation, respectively).

Reviewers: hsunder, asrivastava, mlillibridge

Reviewed By: asrivastava

Subscribers: ybase, slingam, yql

Differential Revision: https://phorge.dev.yugabyte.com/D38617