
Add retry and timeout in unit test #1504

Closed · wants to merge 1 commit

Conversation

@ianco ianco (Member) commented Jan 15, 2021

Signed-off-by: Ian Costanzo [email protected]

Increases likelihood of test passing from roughly 50% to 90%
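The PR's approach can be sketched as a generic retry-with-timeout wrapper (illustrative only; the helper names are hypothetical and the actual change lives in the plenum test suite). Note the arithmetic is roughly consistent: if a single run fails independently about 50% of the time, three attempts all fail about 12.5% of the time, close to the reported ~90% pass rate.

```python
import functools
import time

def retry(times=3, timeout_s=30.0, exceptions=(AssertionError,)):
    """Re-run a flaky test up to `times` times; treat an attempt that
    ran longer than `timeout_s` seconds as a hard failure."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(times):
                start = time.monotonic()
                try:
                    return fn(*args, **kwargs)
                except exceptions as exc:
                    last_exc = exc
                    if time.monotonic() - start > timeout_s:
                        raise  # a hung attempt is a real failure, not flakiness
            raise last_exc
        return wrapper
    return decorator

@retry(times=3)
def test_flaky_view_change():
    ...  # the seeded consensus simulation would run here
```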

@sovbot sovbot (Contributor) commented Jan 15, 2021

Can one of the admins verify this patch?

@WadeBarnes WadeBarnes (Member) commented:

[ci] please test

@WadeBarnes WadeBarnes (Member) commented:

(ci) please test

@WadeBarnes WadeBarnes (Member) commented:

(ci) test this please

@Toktar Toktar (Contributor) left a comment

Hello Ian,
thanks for highlighting the issue with the failing tests. I apologize if the logic of these tests is not obvious, but these are simulation tests for checking consensus. A random seed is selected on each new run, so if you simply restart a failing test, you may be ignoring a real critical issue in the consensus logic (or in view change, in this case). If you don't have time to figure out what is wrong, could you please add the seed on which the test fails to the list of exceptions?
I suggest we not merge into the master branch a rule that ignores tests that can find real bugs in the code.
As far as I understand, this was the start of creating the simulation tests:
#1242
#1246
Here is an article about seeds in simulation or randomization testing: https://sciprincess.wordpress.com/2019/03/14/how-to-select-a-seed-for-simulation-or-randomization/
Tomorrow I'll ask the author of these tests for a better link describing this approach.
UPD: The link to an article from our expert: https://alexwlchan.net/2016/06/hypothesis-intro/
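Toktar's point is that the seed makes a random scenario replayable: the same seed regenerates the exact pool and vote sequence, so a failure can be debugged instead of rerun away. A minimal sketch of the pattern (illustrative function names, not the plenum harness):

```python
import random

def build_scenario(seed):
    # Derive ALL randomness from one recorded seed so the run is replayable.
    rng = random.Random(seed)
    pool_size = rng.randint(4, 8)
    votes = [rng.choice(["commit", "abort"]) for _ in range(pool_size)]
    return pool_size, votes

seed = random.randrange(1_000_000)
print(f"test_new_view_combinations({seed})")  # log the seed, as the suite does
assert build_scenario(seed) == build_scenario(seed)  # same seed, same scenario
```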

@ianco ianco (Member, Author) commented Jan 20, 2021

> (quoting @Toktar's review comment above)

@Toktar The unit test creates a simulated pool with a random number of nodes and then creates a random number of votes, so there is not a single "seed" per se that causes the test to fail. We suspect it is a timing issue, hence the retries.

There are too many layers of logic for me to dig into it, unfortunately :-( If you can get someone to take a look, they just need to retry the test several times and they should see some failures. For me and Wade, this test fails about half the time (or about 10% of the time with the retries).

@Toktar Toktar (Contributor) commented Jan 22, 2021

@ianco In our case the seed is a random parameter, and it's the number you can see next to the name of a failed test in the logs. It looks like:
...
test_new_view_combinations(440868)
...
The failures reproduce locally for me, and it really is a consensus problem in the ViewChange protocol, though with medium risk.
I created an issue describing the failing case: #1506
I'll also send a PR that skips the tests for the problem seeds. Could you please extend this exception list if you find a new one?
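The exception list Toktar describes could look like the following `unittest`-style sketch (issue #1506 and seed 440868 come from this thread; all other names are hypothetical, and the real list would live in the plenum test suite):

```python
import random
import unittest

# Seeds known to trigger the ViewChange consensus issue tracked in #1506.
# Kept as an explicit list so the underlying bug isn't silently forgotten.
KNOWN_BAD_SEEDS = {440868}

class TestNewViewCombinations(unittest.TestCase):
    def test_new_view_combinations(self):
        seed = random.randrange(1_000_000)
        if seed in KNOWN_BAD_SEEDS:
            self.skipTest(f"seed {seed} hits known ViewChange issue #1506")
        # ... run the seeded consensus simulation here ...
```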

@ianco ianco (Member, Author) commented Jan 22, 2021

@Toktar thanks for the notes, I'll re-test and try to track the failing seeds.

@ianco ianco closed this Feb 8, 2021