Unreliable test cases in CI #931

Closed
KtorZ opened this issue Oct 29, 2019 · 10 comments

Comments

@KtorZ
Member

KtorZ commented Oct 29, 2019

| Release | Operating System | Cause |
| --- | --- | --- |
| next | Windows & OSX & Linux | Code v Configuration |

Context

https://buildkite.com/input-output-hk/cardano-wallet/builds/3377#97b21363-a48b-4a51-9787-eb22ff210771/6-7571

https://buildkite.com/input-output-hk/cardano-wallet/builds/3308#c9de0620-ac75-4a16-95b1-7bf2b37dd666/6-7555

https://buildkite.com/input-output-hk/cardano-wallet/builds/3307#7eb3b6dd-7a93-4989-a033-fcc042bf26d5/6-7564

https://buildkite.com/input-output-hk/cardano-wallet/builds/3283#88ba9e84-0cb0-43ea-98f9-2e8d9cf40953/6-7560

https://buildkite.com/input-output-hk/cardano-wallet/builds/3524

https://buildkite.com/input-output-hk/cardano-wallet/builds/3525

Steps to Reproduce

Unclear. The tests sometimes fail.

Expected behavior

The test is not flaky and can be trusted.

Actual behavior

🤷‍♂️


Resolution Plan

PR

| Number | Base |
| --- | --- |
| #? | develop |

QA

KtorZ changed the title from "Unreliable test failure in Jormungandr.NetworkSpec" to "Unreliable test case in Jormungandr.NetworkSpec" on Oct 29, 2019
@rvl
Contributor

rvl commented Nov 1, 2019

I noticed this message in the logs of the first buildkite build (3377) listed in this ticket:

    thread 'Oct 29 10:23:27.014leadership1 ' panicked at 'CRITnot yet implemented: The system just failed to compute an appropriate instant.
    This could be due to a system suspension or hibernation, in order not to miss out on future
    leader elections please prevent your system from suspending or hibernating.
     ', system recorded a 0s delay. This could be due to a system suspension or hibernation, in order not to miss out on future leader elections please prevent your system from suspending or hibernating.jormungandr/src/leadership/mod.rs, :epoch244:: 175380849
    , note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
    reason: second time provided was later than self, slot_system_time: 2019-10-29T10:23:27+00:00, now: 2019-10-29T10:23:27.014508974+00:00, task: leadership
    not yet implemented: The system just failed to compute an appropriate instant.
    This could be due to a system suspension or hibernation, in order not to miss out on future
    leader elections please prevent your system from suspending or hibernating.

That would probably explain the ExitFailure 66.

KtorZ changed the title from "Unreliable test case in Jormungandr.NetworkSpec" to "Unreliable test cases in CI" on Nov 2, 2019
@rvl
Contributor

rvl commented Nov 5, 2019

@jonathanknowles
Member

@jonathanknowles
Member

@KtorZ
Member Author

KtorZ commented Nov 11, 2019

We could try increasing the epoch length a bit. We've set it to 3 seconds, and that seems to be causing issues, as explained in this comment:

If the end of epoch/start of new epoch happens between node start time and leadership task start/process time
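To make the timing argument concrete, here is a minimal sketch of the condition described above. It is only an illustration: `epochIndex` and `crossesEpochBoundary` are hypothetical helpers, not code from cardano-wallet or Jörmungandr, and times are assumed to be whole seconds since the start of the chain.

```haskell
-- Hypothetical helpers for illustration; not taken from cardano-wallet
-- or Jörmungandr. Times are whole seconds since the start of the chain.

-- | Index of the epoch that contains time @t@, for epochs lasting
-- @epochLen@ seconds.
epochIndex :: Integer -> Integer -> Integer
epochIndex epochLen t = t `div` epochLen

-- | True when the node start time and the leadership task start time
-- fall in different epochs, i.e. an epoch boundary lies between them.
crossesEpochBoundary :: Integer -> Integer -> Integer -> Bool
crossesEpochBoundary epochLen nodeStart leadershipStart =
    epochIndex epochLen nodeStart /= epochIndex epochLen leadershipStart
```

With the 3-second epochs mentioned above, `crossesEpochBoundary 3 2 4` is `True`: a node starting at t=2 whose leadership task starts at t=4 straddles the boundary at t=3. With a longer epoch (say 30 seconds, a value assumed here purely for illustration), the same window stays inside one epoch, which is why a longer epoch should make the failure less likely rather than impossible.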

iohk-bors bot added a commit that referenced this issue Nov 12, 2019
1003: Attempt to fix intermittent startup issues with Jörmungandr r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->

#931 

# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->

- [x] I have increased the epoch length to give Jörmungandr more time to start before an epoch changes. 

- [x] I have also changed some hard-coded `/` path separators to use the platform-agnostic version `</>`. 
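
As a rough illustration of the second point, the sketch below builds a path with `</>` from `System.FilePath` instead of concatenating with a literal `/`. The directory and file names are made up for the example and are not the ones changed in this PR.

```haskell
import System.FilePath (splitDirectories, (</>))

-- Hypothetical directory and file names, for illustration only.
-- Instead of: stateDir ++ "/jormungandr/config.yaml"
configPath :: FilePath -> FilePath
configPath stateDir = stateDir </> "jormungandr" </> "config.yaml"

-- Splitting the result gives back the same components whichever
-- separator the current platform uses.
configComponents :: FilePath -> [FilePath]
configComponents = splitDirectories . configPath
```

On Windows, `</>` inserts `\`, and on Linux and OSX it inserts `/`, so the same code yields a native path on each platform.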

# Comments

<!-- Additional comments or screenshots to attach if any -->

Not sure if this will help, but we can try 🤷‍♂️ 

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Acknowledge any changes required to the Wiki
-->


1010: Refactor test for 1004 regression r=KtorZ a=piotr-iohk

# Issue Number

#1004

# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->

- [ ] I have refactored the script testing #1004 to make it clearer (it was not clear to me until I figured out that it is a tx to a self address). Added a clearer description and an additional check for the _in_ledger_ tx
- [ ] updated `start_node` script


# Comments

<!-- Additional comments or screenshots to attach if any -->



Co-authored-by: KtorZ <[email protected]>
Co-authored-by: Piotr Stachyra <[email protected]>
@KtorZ KtorZ added this to the Usability & Compatibility milestone Nov 15, 2019
@KtorZ
Member Author

KtorZ commented Nov 22, 2019

I believe this one can be closed in favor of another item "re-enable code coverage calculations in CI".

@piotr-iohk, thoughts?

@piotr-iohk
Contributor

Sounds good.

"re-enable code coverage calculations in CI"

Something to plan for recovery week I suppose?

@KtorZ
Member Author

KtorZ commented Nov 22, 2019

Possibly, although I'd just create a JIRA story for that one. This is important enough for us to spend proper time on it.

@KtorZ
Member Author

KtorZ commented Nov 22, 2019

@piotr-iohk
Contributor

Although it is not a product feature, I understand it goes to Jira so we can book a relevant amount of time for this, and also let product know how important it is... 🤷‍♂️

4 participants