Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DirectChainSpec flakiness #493

Merged
merged 4 commits into from
Sep 21, 2022
Merged

Conversation

abailly-iohk
Copy link
Contributor

@abailly-iohk abailly-iohk commented Sep 15, 2022

This PR scratches an itch that's been bugging me for some time, eg, it fixes (EDIT: DOES NOT FIX) #492. What happens (I think) is that when seeding some UTxO from faucet we check the Utxo is visible from the Cardano client, but this does not mean the UTxO is already visible from the TinyWallet which needs to sync up its internal state using the ChainSync protocol, thus leading into a race condition.
There could be alternative solutions but the one provided here is reasonably simple and it seems to me to make sense to have a way to observe the UTxO from the Chain component given they are always expected to be isomorphic to the L2 ones.

It helps tracing at which point this issue happens
@github-actions
Copy link

github-actions bot commented Sep 15, 2022

Transactions Costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using arbitrary values and results are not fully deterministic and comparable to previous runs.

Metadata
Generated at 2022-09-19 13:21:23.587592338 UTC
Max. memory units 14000000
Max. CPU units 10000000000
Max. tx size (kB) 16384

Cost of Init Transaction

Parties Tx size % max Mem % max CPU Min fee ₳
1 4833 11.84 4.71 0.49
2 5037 11.72 4.61 0.50
3 5241 16.30 6.45 0.56
5 5651 17.41 6.79 0.59
10 6678 29.37 11.46 0.76
46 14063 99.90 38.61 1.85

Cost of Commit Transaction

Currently only one UTxO per commit allowed (this is about to change soon)

UTxO Tx size % max Mem % max CPU Min fee ₳
1 5768 19.81 7.99 0.62

Cost of CollectCom Transaction

Parties Tx size % max Mem % max CPU Min fee ₳
1 13130 21.44 8.60 0.96
2 13450 37.36 15.13 1.15
3 13851 56.68 23.15 1.38
4 14023 76.75 31.47 1.61

Cost of Close Transaction

Parties Tx size % max Mem % max CPU Min fee ₳
1 9327 8.19 3.36 0.65
2 9490 8.97 3.81 0.67
3 9692 10.19 4.44 0.69
5 10016 11.75 5.34 0.72
10 10879 16.09 7.77 0.81
30 14222 32.68 17.18 1.16
52 14421 33.04 12.54 1.14

Cost of Contest Transaction

Parties Tx size % max Mem % max CPU Min fee ₳
1 9363 8.60 3.51 0.66
2 9563 9.82 4.14 0.68
3 9658 9.72 4.24 0.68
5 10059 12.25 5.53 0.73
10 10875 16.25 7.82 0.82
30 14208 32.28 17.02 1.16
43 16365 43.11 23.15 1.39

Cost of Abort Transaction

Some variation because of random mixture of still initial and already committed outputs.

Parties Tx size % max Mem % max CPU Min fee ₳
1 13495 22.50 9.35 0.99
2 13609 34.02 14.28 1.13
3 13927 52.63 22.47 1.35
4 14610 87.17 39.31 1.78
5 13934 81.23 34.74 1.67

Cost of FanOut Transaction

Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.

UTxO Tx size % max Mem % max CPU Min fee ₳
1 13421 10.17 4.43 0.85
2 13517 11.78 5.34 0.88
3 13491 12.92 6.07 0.89
5 13692 17.02 8.25 0.95
10 13806 24.25 12.49 1.04
50 15184 84.50 47.45 1.84
59 15569 98.60 55.53 2.03

@abailly-iohk abailly-iohk force-pushed the abailly-iohk/fix-direct-chain-flakiness branch from f926b24 to a95651a Compare September 15, 2022 16:35
@abailly-iohk abailly-iohk requested review from ch1bo, KtorZ and ffakenz and removed request for ch1bo and KtorZ September 15, 2022 16:52
@github-actions
Copy link

github-actions bot commented Sep 15, 2022

Unit Test Results

253 tests  ±0   247 ✔️ ±0   16m 14s ⏱️ +10s
  91 suites ±0       6 💤 ±0 
    5 files   ±0       0 ±0 

Results for commit f43c546. ± Comparison against base commit b0ebec9.

♻️ This comment has been updated with latest results.

This update is larger than the scope of this PR
@abailly-iohk
Copy link
Contributor Author

This PR is somewhat related to #366 but with a much limited scope. In the latter issue it is said that in case of a failed posting, tx should simply be dropped. This PR provide an alternative to dropping and retrying by giving the node the ability to understand what the chain component knows and decide to wait for a UTxO to appear before posting a transaction.

@abailly-iohk
Copy link
Contributor Author

For the record: I tried the alternative approach of retrying postTx on CannotSpendInput exceptions and does not work well. Not too sure why, though, but tests are back to being flaky so I would rather stick to the solution that works.
/cc @ffakenz

@abailly-iohk
Copy link
Contributor Author

The reason why retrying submission does not work in the tests is that the thrown CannotSpendInput exception actually kills the Chain component which is no longer able anymore to sync up with the chain, so retrying is useless. Short of a generic mechanism similar to #366 there's no option but to wait for the on-chain state to be what we expect it to be.

Copy link
Contributor

@ffakenz ffakenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌶️

Copy link
Collaborator

@ch1bo ch1bo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not feeling well for this change as it is only motivated by a single test and seemingly no production code is requiring getUTxO.

Does this mean that only the DirectChainSpec is prone to this race-condition?

Furthermore, @abailly-iohk, you mentioned this https://github.com/input-output-hk/hydra-poc/actions/runs/3053589859/jobs/4924406067#step:8:102 as the a motivating flakiness for this change, but that test does not even use any hydra-node component at all - definitely not the Hydra.Chain.Direct?

hydra-node/src/Hydra/Chain/Direct/Handlers.hs Outdated Show resolved Hide resolved
@abailly-iohk
Copy link
Contributor Author

Well, I could say that if there is a need for a (ETE) test to observe the UTxO, this might be an indication there is somewhere a "real" need for this that's awaiting. Driving features from tests is also a good way to uncover missing parts.

I have only observed the flakiness in these tests but the ETE tests might be subject to it too. Perhaps the fact we have more nodes running, in a more involved setup, give the internal wallet enough time to catch-up?

And yes, the motivating example is unrelated, it's a problem related to CI resources which this PR does not solve.

@ch1bo ch1bo merged commit 97a5650 into master Sep 21, 2022
@ch1bo ch1bo deleted the abailly-iohk/fix-direct-chain-flakiness branch September 21, 2022 12:05
v0d1ch pushed a commit that referenced this pull request Sep 26, 2022
ch1bo added a commit that referenced this pull request Sep 27, 2022
…-direct-chain-flakiness"

This reverts commit 97a5650, reversing
changes made to b0ebec9.
v0d1ch pushed a commit that referenced this pull request Sep 28, 2022
…-direct-chain-flakiness"

This reverts commit 97a5650, reversing
changes made to b0ebec9.
v0d1ch pushed a commit that referenced this pull request Sep 28, 2022
…-direct-chain-flakiness"

This reverts commit 97a5650, reversing
changes made to b0ebec9.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants