Fix test testDropPrimaryDuringReplication and clean up ReplicationCheckpoint validation #8889
Commits on Aug 2, 2023
-
Fix test testDropPrimaryDuringReplication and clean up ReplicationCheckpoint validation.
This test is now occasionally failing with replicas having 0 documents. This occurs in a couple of ways:
1. After the old primary is dropped, the new primary does not publish a checkpoint to replicas unless it indexes docs from the translog after flipping to primary mode. If there is nothing to index, it will not publish a checkpoint, but the other replica may never have synced with the original primary and would be left out of date.
   - This PR fixes this by force publishing a checkpoint after the new primary flips to primary mode.
2. The replica receives a checkpoint post failover and cancels its sync with the former primary that is still active, recognizing a primary term bump. However, this cancellation is async, and immediately starting a new replication event could fail because the replica is still replicating.
   - This PR fixes this by attempting to process the latest received checkpoint on failure, if the shard is not failed and still behind.
This PR also introduces a few changes to ensure the accuracy of the ReplicationCheckpoint tracked on primary & replicas.
- Ensure the checkpoint stored in SegmentReplicationTarget is the checkpoint passed from the primary and not locally computed. This ensures checks for primary term are accurate and not using a locally computed operationPrimaryTerm.
- Introduce a refresh listener for both primary & replica to update the ReplicationCheckpoint and store it in replicationTracker post refresh, rather than redundantly computing it when accessed.
- Remove the unnecessary onCheckpointPublished method used to start replication timers manually. This will happen automatically on primaries once the local checkpoint is updated.
Signed-off-by: Marc Handalian <[email protected]>
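The retry behavior described in point 2 of the commit message above can be sketched roughly as follows. This is a minimal, single-threaded illustration and not the OpenSearch implementation: the `Checkpoint` and `ReplicaSketch` classes, their fields, and the "term first, then version" ordering are all assumptions made for the example.

```java
/** Hypothetical, simplified stand-in for a replication checkpoint. */
final class Checkpoint {
    final long primaryTerm;
    final long segmentInfosVersion;

    Checkpoint(long primaryTerm, long segmentInfosVersion) {
        this.primaryTerm = primaryTerm;
        this.segmentInfosVersion = segmentInfosVersion;
    }

    /** Assumed ordering for this sketch: compare primary term first, then version. */
    boolean isAheadOf(Checkpoint other) {
        if (primaryTerm != other.primaryTerm) {
            return primaryTerm > other.primaryTerm;
        }
        return segmentInfosVersion > other.segmentInfosVersion;
    }
}

/** Hypothetical replica-side state machine; names are illustrative only. */
final class ReplicaSketch {
    private Checkpoint local;          // what the replica has synced so far
    private Checkpoint latestReceived; // last checkpoint sent by a primary
    private boolean failed;            // shard marked as failed
    private boolean replicating;       // a replication event is in flight

    ReplicaSketch(Checkpoint local) {
        this.local = local;
    }

    void markFailed() {
        failed = true;
    }

    /** Remembers the checkpoint and reports whether a new sync was started. */
    boolean onNewCheckpoint(Checkpoint received) {
        latestReceived = received;
        if (replicating || failed || !received.isAheadOf(local)) {
            return false; // cannot start now; the checkpoint is kept for later
        }
        replicating = true;
        return true;
    }

    /** Called when a replication event completes successfully. */
    void onReplicationDone(Checkpoint synced) {
        replicating = false;
        local = synced;
    }

    /**
     * Called when a replication event fails or is cancelled. Returns the
     * checkpoint to retry with, or null if no retry is needed: the retry
     * happens only if the shard is not failed and is still behind.
     */
    Checkpoint onReplicationFailure() {
        replicating = false;
        if (failed || latestReceived == null || !latestReceived.isAheadOf(local)) {
            return null;
        }
        replicating = true;
        return latestReceived;
    }
}
```

In this sketch, a checkpoint that arrives while a sync is still being cancelled is not dropped; it is picked up again from the failure path once the in-flight event has ended.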
Commit cc01343
-
Handle NoSuchFileException when attempting to delete decref'd files.
To avoid divergent logic with remote store, we always incref/decref the segmentinfos.files(true), which includes the segments_n file. A decref to 0 will attempt to delete the file from the store, and it's possible this segments_n file does not yet exist. This change ignores a NoSuchFileException thrown while attempting to delete.
Signed-off-by: Marc Handalian <[email protected]>
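As a rough illustration of the behavior this commit describes, the sketch below reference-counts file names and tolerates a missing file when the count reaches zero. The `RefCountedFilesSketch` class and its methods are hypothetical and only stand in for the store's real bookkeeping; the only library behavior relied on is that `java.nio.file.Files.delete` throws `NoSuchFileException` for a path that does not exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reference counter for files in a directory; illustrative only. */
final class RefCountedFilesSketch {
    private final Path dir;
    private final Map<String, Integer> refCounts = new HashMap<>();

    RefCountedFilesSketch(Path dir) {
        this.dir = dir;
    }

    /** Adds one reference to the named file (the file need not exist yet). */
    void incRef(String name) {
        refCounts.merge(name, 1, Integer::sum);
    }

    /**
     * Drops one reference. When the count reaches zero, attempts to delete
     * the file and returns true. A file that was never written, such as a
     * segments_n file that does not exist yet, is silently ignored.
     */
    boolean decRef(String name) {
        Integer count = refCounts.get(name);
        if (count == null) {
            throw new IllegalStateException("file is not referenced: " + name);
        }
        if (count > 1) {
            refCounts.put(name, count - 1);
            return false;
        }
        refCounts.remove(name);
        try {
            Files.delete(dir.resolve(name));
        } catch (NoSuchFileException e) {
            // Nothing to delete: the file was referenced but never created.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

Catching only `NoSuchFileException` keeps other I/O failures visible, so a genuine deletion problem is still reported to the caller.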
Commit 6aa232b
-
Commit e3a5366
-
Clean up IndexShardTests.testCheckpointRefreshListenerWithNull
Signed-off-by: Marc Handalian <[email protected]>
Commit 95ab3b0
-
Remove unnecessary catch for NoSuchFileException.
Signed-off-by: Marc Handalian <[email protected]>
Commit 54bbbd3
-
Add another test for non segrep.
Signed-off-by: Marc Handalian <[email protected]>
Commit ad3d7bb
-
Commit 804c203
Commits on Aug 3, 2023
-
re-compute replication checkpoint on primary promotion.
Signed-off-by: Marc Handalian <[email protected]>
Commit d0e15b8