[BUG] Dual Replication - Failover to remote replica from remote primary fails when the replication group contains a docrep index #13158

Closed · shourya035 opened this issue Apr 11, 2024 · Fixed by #13159
Labels: bug, Storage:Remote

Describe the bug

When a replication group has at least one docrep shard copy, failover from a remote primary to a remote replica fails with "no retention lease for tracked shard".

During the dual replication phase, the RetentionLeases generated on the primary shard are synced over to the docrep copies through RetentionLeaseBackgroundSyncAction, but the replication call to remote-enabled replica copies is blocked. When the primary shard copy fails over to another remote-enabled replica, the invariant() check fails.
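
For illustration only, here is a minimal, self-contained sketch of the behaviour described above (not the actual OpenSearch implementation); the ShardCopy record, remoteEnabled flag, and leaseSyncTargets method are hypothetical stand-ins. It models how the background lease sync reaches only the docrep copies, so a remote-enabled replica never receives the leases it would need once promoted.

import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model of the lease-sync filtering described above;
// these are illustrative stand-ins, not the actual OpenSearch types.
class RetentionLeaseSyncSketch {

    record ShardCopy(String allocationId, boolean remoteEnabled, boolean docrep) {}

    // During dual replication, the background retention lease sync is assumed to
    // reach only the non-remote (docrep) copies; remote-enabled replicas are
    // filtered out and never receive the primary's peer-recovery retention leases.
    static List<ShardCopy> leaseSyncTargets(List<ShardCopy> replicationGroup) {
        return replicationGroup.stream()
            .filter(copy -> !copy.remoteEnabled())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ShardCopy> group = List.of(
            new ShardCopy("docrep-replica-1", false, true),
            new ShardCopy("remote-replica-1", true, false)
        );
        // Prints only the docrep copy: the remote replica is skipped, which is why
        // it holds no retention leases when it is later promoted to primary.
        System.out.println(leaseSyncTargets(group));
    }
}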

The code flow is as follows. During a failover, the activatePrimaryMode() method of ReplicationTracker is invoked:

replicationTracker.activatePrimaryMode(getLocalCheckpoint());
if (indexSettings.isSegRepEnabledOrRemoteNode()) {
    // force publish a checkpoint once in primary mode so that replicas not caught up to previous primary
    // are brought up to date.
    checkpointPublisher.publish(this, getLatestReplicationCheckpoint());
}
postActivatePrimaryMode();

This enables the primaryMode flag on the ReplicationTracker instance, updates the global and local checkpoints, creates a retention lease for the primary itself, and runs the invariant() checks:

primaryMode = true;
updateLocalCheckpoint(shardAllocationId, checkpoints.get(shardAllocationId), localCheckpoint);
updateGlobalCheckpointOnPrimary();
addPeerRecoveryRetentionLeaseForSolePrimary();
assert invariant();

The invariant() method checks for retention leases against all replicated shard copies. During dual replication, all docrep shard copies are marked as replicated:

if (primaryMode && indexSettings.isSoftDeleteEnabled() && hasAllPeerRecoveryRetentionLeases) {
    // all tracked shard copies have a corresponding peer-recovery retention lease
    for (final ShardRouting shardRouting : routingTable.assignedShards()) {
        final CheckpointState cps = checkpoints.get(shardRouting.allocationId().getId());
        if (cps.tracked && cps.replicated) {
            assert retentionLeases.contains(getPeerRecoveryRetentionLeaseId(shardRouting))
                : "no retention lease for tracked shard [" + shardRouting + "] in " + retentionLeases;
            assert PEER_RECOVERY_RETENTION_LEASE_SOURCE.equals(
                retentionLeases.get(getPeerRecoveryRetentionLeaseId(shardRouting)).source()
            ) : "incorrect source ["
                + retentionLeases.get(getPeerRecoveryRetentionLeaseId(shardRouting)).source()
                + "] for ["
                + shardRouting
                + "] in "
                + retentionLeases;
        }
    }
}

Since the retention leases were never synced over to the remote-enabled replica from the previous primary, the assertion trips here once that replica is promoted.

We need to re-create retention leases for the docrep shard copies and hold off on invoking this assertion until those leases are created.
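
The following is a hedged sketch of that direction, not the actual change in #13159. It assumes the newly promoted primary can enumerate its tracked copies, recreates a peer-recovery lease for each tracked docrep copy during activation, and lets the lease invariant take effect only after those leases exist. The class and method names (PrimaryActivationSketch, TrackedCopy, and so on) are hypothetical.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed direction, not the actual fix in #13159.
class PrimaryActivationSketch {

    record TrackedCopy(String allocationId, boolean docrep, boolean tracked, boolean replicated) {}

    private final Map<String, String> retentionLeases = new HashMap<>();
    private boolean leasesRecreated = false;

    // On failover, recreate peer-recovery retention leases for the tracked docrep
    // copies before the lease invariant is allowed to take effect.
    void activatePrimaryMode(List<TrackedCopy> replicationGroup) {
        for (TrackedCopy copy : replicationGroup) {
            if (copy.tracked() && copy.docrep()) {
                retentionLeases.put(copy.allocationId(), "peer_recovery");
            }
        }
        leasesRecreated = true;
        assert invariant(replicationGroup);
    }

    // The invariant is a no-op until the leases have been recreated, mirroring the
    // proposal above to hold off on the assertion until the leases exist.
    boolean invariant(List<TrackedCopy> replicationGroup) {
        if (!leasesRecreated) {
            return true;
        }
        for (TrackedCopy copy : replicationGroup) {
            if (copy.tracked() && copy.replicated()) {
                assert retentionLeases.containsKey(copy.allocationId())
                    : "no retention lease for tracked shard [" + copy.allocationId() + "]";
            }
        }
        return true;
    }
}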

Related component

Storage:Remote

To Reproduce

N/A

Expected behavior

Failover from a remote primary to both docrep and remote replicas should work seamlessly during the dual replication phase.

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.
