[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417

bmatican · 2021-05-12T20:13:48Z

Jira Link: DB-2166
For edit, it would be tricky to expose modifying the filter, as we might expand to capture tables not previously captured, for which we wouldn't have old snapshots. Given that, let's just have an ability to change interval and retention.

For delete, we need to confirm that all relevant GC work is done appropriately

TS side snapshots are removed
hidden tablets are removed, if needed

The text was updated successfully, but these errors were encountered:

Summary: This diff adds the ability to delete snapshot schedules. There is a tricky part in detecting deleted schedules on the TServer side. Suppose at some point TSTabletManager does not know schedule1. Since it knows schedules received by heartbeat there are 2 possible options: 1) It is a new schedule that was not yet received by heartbeat. 2) It is deleted schedule. So to distinguish between those 2 cases it uses the following logic. Each time the heartbeat response is received and the schedules list is updated, we also increment the `snapshot_schedules_version_` field. All missing schedules are added to the special map, along with the current value of `snapshot_schedules_version_`. So when we again find the schedule as missing, we could compare the current `snapshot_schedules_version_` and version that we had when the schedule was first found as missing. So if the master does not know this schedule also it means that it is an old schedule that was deleted. But the following could happen: Heartbeat processed by the master, but response not yet processed by tserver. The new schedule is created and sent to the tablet. Then the tablet receives the response to this heart, and it would not contain such a schedule. To avoid interpreting such schedule as deleted we wait that `snapshot_schedules_version_` to be incremented twice, before marking schedule as deleted at tserver. Test Plan: ybd --gtest_filter YbAdminSnapshotScheduleTest.Delete Reviewers: skedia, bogdan Reviewed By: bogdan Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11868

Summary: This diff adds the ability to delete snapshot schedules. There is a tricky part in detecting deleted schedules on the TServer side. Suppose at some point TSTabletManager does not know schedule1. Since it knows schedules received by heartbeat there are 2 possible options: 1) It is a new schedule that was not yet received by heartbeat. 2) It is deleted schedule. So to distinguish between those 2 cases it uses the following logic. Each time the heartbeat response is received and the schedules list is updated, we also increment the `snapshot_schedules_version_` field. All missing schedules are added to the special map, along with the current value of `snapshot_schedules_version_`. So when we again find the schedule as missing, we could compare the current `snapshot_schedules_version_` and version that we had when the schedule was first found as missing. So if the master does not know this schedule also it means that it is an old schedule that was deleted. But the following could happen: Heartbeat processed by the master, but response not yet processed by tserver. The new schedule is created and sent to the tablet. Then the tablet receives the response to this heart, and it would not contain such a schedule. To avoid interpreting such schedule as deleted we wait that `snapshot_schedules_version_` to be incremented twice, before marking schedule as deleted at tserver. Original commit: D11868 / 7453192 Test Plan: ybd --gtest_filter YbAdminSnapshotScheduleTest.Delete Jenkins: rebase: 2.6 Reviewers: skedia, bogdan Reviewed By: bogdan Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D12000

sanketkedia · 2021-09-27T09:30:37Z

This is a good intro task to get familiar with the basics like how to add a client-side interface, define protobufs for communication between client and server, make RPC calls, understand the translation of table rows to rocksdb key-value pairs, perform a simple sys catalog write, etc. delete_schedule has already been implemented. For edit_schedule, typically the following changes are needed:

Define the format of the edit_schedule command and add it in yb-admin. The format should be such that it takes in the id of the snapshot schedule that the user wants to edit, allows for updating the interval and/or the retention. To add the command look at this similar code
Perform client-side validation checks such as the id should exist, the format should be valid, etc, and error out accordingly. For verifying if the snapshot schedule id is valid, you can invoke the ListSnapshotSchedules RPC. See here for how to invoke the ListSnapshotSchedules RPC.
Define and Invoke the EditSnapshotSchedule RPC to send the request to the server (intercepted by the catalog manager of the master). You have to define this RPC as well as the corresponding request and response protobufs. For defining the protobufs look at this proto file
At the server-side, update the corresponding entry in the sys catalog to reflect the new interval and retention. Look at this code that performs a similar write to the sys catalog
Update the in-memory schedules_ structure to reflect the new values. You can use this function to update by loading the new entry from the sys catalog
Write tests that validate that snapshots are getting created once every new interval. Also, old snapshots should get garbage collected every retention period. You can add a test here

cc @bmatican @spolitov

jordans6 · 2022-02-24T01:13:15Z

Hey, @bmatican, if no one is working on this task currently I'd love to pick it up!

bmatican · 2022-02-24T18:13:05Z

@jordans6 Glad to hear you're excited to be contributing to YB. That sounds great, hopefully @sanketkedia 's message above helps as a starting point!

If you need anymore help, feel free to join our public slack and ask questions on the #contributors channel!

jordans6 · 2022-02-24T21:21:56Z

Awesome, I'm looking forward to contributing! Would you be able to send me a link to the slack channel? The one in the readme gives me the following error message:
"This link is no longer active. To join this workspace, you’ll need to ask the person who originally invited you for a new link."

SrivastavaAnubhav · 2022-02-24T21:27:58Z

Hi @jordans6, here's a link for the slack: https://communityinviter.com/apps/yugabyte-db/register
(I'll fix the README link in a bit.)

jordans6 · 2022-02-24T21:35:47Z

Thanks, I've joined the slack now.

druzac · 2022-05-19T22:06:41Z

Is an interval of 0 or retention_duration of 0 valid? create_snapshot_schedule currently accepts both. I suppose the semantics for a 0 interval are as soon as possible, and the semantics of a 0 retention_duration are an infinite TTL?

sanketkedia · 2022-05-19T23:28:51Z

That's good point @druzac I would say we can just disallow 0 as the interval or retention instead of getting into the confusion and having to explain to users as to what means what?

sanketkedia · 2022-05-20T17:43:51Z

cc @vkulichenko

Summary: Added edit_snapshot_schedule command to yb-admin, along with an RPC in master to edit a snapshot schedule. Moved validation of schedule invariants on the CreateSnapshotSchedule path from the command line tool to the server request handling code. Also added checks for 0 interval and 0 retention snapshot schedules to the CreateSnapshotSchedule path. Refactored MasterSnapshotCoordinator and SnapshotScheduleState to de-duplicate serialization to docdb. Test Plan: YbAdminSnapshotScheduleTest.EditInterval YbAdminSnapshotScheduleTest.EditRetention YbAdminSnapshotScheduleTest.EditSnapshotScheduleCheckOptions YbAdminSnapshotScheduleTest.EditIntervalZero YbAdminSnapshotScheduleTest.EditRetentionZero YbAdminSnapshotScheduleTest.EditRepeatedInterval YbAdminSnapshotScheduleTest.EditRepeatedRetention YbAdminSnapshotScheduleTest.EditIntervalLargerThanRetention YbAdminSnapshotScheduleTest.CreateIntervalZero YbAdminSnapshotScheduleTest.CreateRetentionZero YbAdminSnapshotScheduleTest.CreateIntervalLargerThanRetention YbAdminSnapshotScheduleTest.EditIntervalAndRetention Also played around with a test cluster on the command line. Reviewers: skedia Reviewed By: skedia Subscribers: bogdan, ybase Differential Revision: https://phabricator.dev.yugabyte.com/D17087

bmatican added the area/docdb YugabyteDB core features label May 12, 2021

bmatican assigned spolitov May 12, 2021

bmatican mentioned this issue May 14, 2021

[docdb] PITR: Tracking issue #7120

Closed

bmatican assigned sanketkedia Jun 8, 2021

sanketkedia added the good first issue This is a good issue to start contributing! label Sep 27, 2021

bmatican assigned SrivastavaAnubhav and unassigned spolitov, sanketkedia and SrivastavaAnubhav Oct 18, 2021

bmatican mentioned this issue Dec 3, 2021

[Docs] Add reference to yb-admin delete_snapshot_schedule #10779

Closed

bmatican assigned jordans6 Feb 24, 2022

lingamsandeep assigned druzac and unassigned jordans6 May 5, 2022

yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 9, 2022

druzac closed this as completed Aug 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417

[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417

bmatican commented May 12, 2021 •

edited by yugabyte-ci

Loading

sanketkedia commented Sep 27, 2021 •

edited

Loading

jordans6 commented Feb 24, 2022

bmatican commented Feb 24, 2022

jordans6 commented Feb 24, 2022

SrivastavaAnubhav commented Feb 24, 2022

jordans6 commented Feb 24, 2022

druzac commented May 19, 2022

sanketkedia commented May 19, 2022

sanketkedia commented May 20, 2022

[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417

[docdb] PITR: Implement delete_schedule and edit_schedule APIs #8417

Comments

bmatican commented May 12, 2021 • edited by yugabyte-ci Loading

sanketkedia commented Sep 27, 2021 • edited Loading

jordans6 commented Feb 24, 2022

bmatican commented Feb 24, 2022

jordans6 commented Feb 24, 2022

SrivastavaAnubhav commented Feb 24, 2022

jordans6 commented Feb 24, 2022

druzac commented May 19, 2022

sanketkedia commented May 19, 2022

sanketkedia commented May 20, 2022

bmatican commented May 12, 2021 •

edited by yugabyte-ci

Loading

sanketkedia commented Sep 27, 2021 •

edited

Loading