[Backport 2.x] Fix flaky test SegmentReplicationTargetServiceTests#testShardAlreadyReplicating #13265

Merged
merged 1 commit into from
Apr 17, 2024

mch2 (Member) commented Apr 17, 2024

Manual backport of #13248 to 2.x.

This PR includes two additional changes to bring this file in sync with main.

59a62e7 updated localNode definition on 2.x but not on main.

e942483 unmuted this test on main but was not backported.

The rest is the same.

…eplicating (opensearch-project#13248)

This test is flaky because its second invocation incorrectly passes a checkpoint with a higher primary term.
That cancels the first replication and starts another. The test sometimes passes because it asserts only on processLatestReceivedCheckpoint.
If the cancellation completes before the second replication event is attempted, the test fails; otherwise it passes.

Fixed the test by keeping the primary term the same while moving the checkpoint ahead. Also added an assertion that replication is not started with the exact ahead checkpoint,
instead of asserting only on processLatestReceivedCheckpoint. A test for the higher primary term case already exists: "testShardAlreadyReplicating_HigherPrimaryTermReceived".

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit 1fcb79d)
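
For readers unfamiliar with the behavior the commit message describes, here is a rough sketch of the decision it implies when a checkpoint arrives while a shard is already replicating. This is a toy model, not the OpenSearch implementation: `ReplicationDecisionSketch`, `Checkpoint`, `Action`, and `onNewCheckpoint` are hypothetical names invented for illustration, and "version" stands in for whatever ordering the real checkpoint uses within a primary term.

```java
// Toy model only. Every type and name below is hypothetical and is NOT part of OpenSearch.
final class ReplicationDecisionSketch {

    /** Simplified checkpoint: a primary term plus an increasing version (an assumption). */
    static final class Checkpoint {
        final long primaryTerm;
        final long version;

        Checkpoint(long primaryTerm, long version) {
            this.primaryTerm = primaryTerm;
            this.version = version;
        }
    }

    enum Action { START, CANCEL_AND_RESTART, DEFER, IGNORE }

    /**
     * Decides what to do with a newly received checkpoint.
     * 'ongoing' is the checkpoint of the replication in progress, or null if none is running.
     */
    static Action onNewCheckpoint(Checkpoint ongoing, Checkpoint received) {
        if (ongoing == null) {
            return Action.START;
        }
        // Higher primary term: the in-flight replication is cancelled and a new one starts.
        // This is the path the flaky test exercised by accident.
        if (received.primaryTerm > ongoing.primaryTerm) {
            return Action.CANCEL_AND_RESTART;
        }
        // Same primary term but ahead: no new replication starts now; the checkpoint is
        // left for later processing. This is the path the fixed test targets.
        if (received.primaryTerm == ongoing.primaryTerm && received.version > ongoing.version) {
            return Action.DEFER;
        }
        return Action.IGNORE;
    }

    public static void main(String[] args) {
        Checkpoint ongoing = new Checkpoint(1, 10);
        System.out.println(onNewCheckpoint(ongoing, new Checkpoint(2, 11))); // CANCEL_AND_RESTART
        System.out.println(onNewCheckpoint(ongoing, new Checkpoint(1, 11))); // DEFER
    }
}
```

Under this model, the original test's second checkpoint took the `CANCEL_AND_RESTART` branch, so its outcome depended on how quickly the cancellation finished. Keeping the primary term equal keeps the test on the `DEFER` branch, where asserting that replication was not started with the ahead checkpoint is deterministic.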
Contributor

❌ Gradle check result for 15c5baf:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@mch2
Member Author

mch2 commented Apr 17, 2024

❌ Gradle check result for 15c5baf:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

504 Gateway Time-out

Contributor

❌ Gradle check result for 15c5baf:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for 15c5baf: SUCCESS


codecov bot commented Apr 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.03%. Comparing base (0dd892c) to head (15c5baf).
Report is 152 commits behind head on 2.x.

Additional details and impacted files
@@             Coverage Diff              @@
##                2.x   #13265      +/-   ##
============================================
- Coverage     71.28%   71.03%   -0.26%     
- Complexity    60145    60528     +383     
============================================
  Files          4957     5004      +47     
  Lines        282799   285354    +2555     
  Branches      41409    41716     +307     
============================================
+ Hits         201591   202693    +1102     
- Misses        64189    65526    +1337     
- Partials      17019    17135     +116     


dblock merged commit 648b2ac into opensearch-project:2.x on Apr 17, 2024
50 of 51 checks passed

2 participants