Enhance SnapshotResiliencyTests (#49514) #49541

original-brownbear · 2019-11-25T11:37:38Z

A few enhancements to SnapshotResiliencyTests:

1. Run requests from random nodes in more places to increase coverage. This is motivated in particular by #49060 (Use ClusterState as Consistency Source for Snapshot Repositories), where the additional cluster state updates make it more worthwhile to cover all kinds of network failures.
2. Fix an issue with restarting only the master node in one test. Doing so breaks the test at a very low frequency, which becomes noticeably higher in #49060 because of the additional cluster state updates between request and response.
3. Improve the cluster formation checks: they now also check the term when forming the cluster and make sure that all nodes are connected to all other nodes. Previously the data nodes were at times not connected to other data nodes, which was shaken out by adding the `client()` method.
4. Make sure the cluster left behind by the test makes sense by running the repository cleanup action on it. This also increases coverage of the repository cleanup action and lays the groundwork for making it part of more resiliency tests.
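For illustration only, the stricter cluster formation check described above (same master, same term, and every node connected to every other node) could be sketched roughly as follows. This is a simplified, self-contained model, not the code in this PR: `ClusterFormationSketch`, `NodeView`, and `isClusterFormed` are hypothetical names and do not correspond to the real Elasticsearch test classes.

```java
import java.util.List;
import java.util.Set;

final class ClusterFormationSketch {

    // Hypothetical, simplified view of what one test node believes about the cluster.
    static final class NodeView {
        final String id;               // this node's id
        final String masterId;         // id of the node it considers master (null if none)
        final long term;               // cluster state term it has applied
        final Set<String> connectedTo; // ids of the nodes it holds connections to

        NodeView(String id, String masterId, long term, Set<String> connectedTo) {
            this.id = id;
            this.masterId = masterId;
            this.term = term;
            this.connectedTo = connectedTo;
        }
    }

    // Returns true only if all nodes agree on master and term
    // and every node is connected to every other node.
    static boolean isClusterFormed(List<NodeView> nodes) {
        if (nodes.isEmpty()) {
            return false;
        }
        String expectedMaster = nodes.get(0).masterId;
        long expectedTerm = nodes.get(0).term;
        if (expectedMaster == null) {
            return false;
        }
        for (NodeView node : nodes) {
            // Agreement on the master alone is not enough; the term must match too.
            if (!expectedMaster.equals(node.masterId) || node.term != expectedTerm) {
                return false;
            }
            // Full mesh: data nodes must also be connected to other data nodes.
            for (NodeView other : nodes) {
                if (!other.id.equals(node.id) && !node.connectedTo.contains(other.id)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

In this model, a node that agrees on the master but is on a stale term, or a data node missing a connection to another data node, makes the check fail, which is the kind of situation item 3 is meant to catch.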

backport of #49514

elasticmachine · 2019-11-25T11:37:40Z

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

original-brownbear added the :Distributed/Snapshot/Restore and backport labels Nov 25, 2019

original-brownbear merged commit 2502ff3 into elastic:7.x Nov 25, 2019

original-brownbear deleted the 49514-7.x branch November 25, 2019 12:31
