Light Client model-based testing #529

andrey-kuprianov · 2020-08-21T10:51:30Z

See #414. Also see the Light Client model-based testing guide for a high-level description of how model-based testing will work.

This PR touches quite a lot of the repository, so I would be grateful for multiple reviews. It is mostly about building the infrastructure that enables model-based testing, so it should now be easy to extend to other types of tests (bisection), or to other crates/repos when needed.

One particular issue is that it uncovered a number of discrepancies between the model and the implementation of the LightClient; please see the changes to Lightclient_A_1.tla / Blockchain_A_1.tla. I have fixed them by adjusting the model, but it would be nice if @josef-widder or @konnov could take a look.

In general, I hope this is a first step in the right direction: keeping the specs live and in sync with the implementation, and testing the implementation thoroughly and, to some extent, exhaustively.

Referenced an issue explaining the need for the change
Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGELOG.md

… commands (e.g. apalache, jsonatr)

* defined structure for model-based single-step light client tests
* added a simple driver for such tests modeled after original one in tests/lite.rs
* add the first auto-generated model-based test
* add symlinks to the TLA+ model

The important change in the test structure is that each input block now carries its own expected verdict, instead of a single expected result for the whole input array; the latter is unclear to interpret in case of failures. Besides, such fine-grained verdicts make it possible to differentiate between the different types of errors returned by the light client.
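The per-block verdict idea can be sketched roughly as follows. This is only an illustration, not the PR's actual test format: the verdict names are taken from the TLA+ changes discussed in this PR, while `Step`, its fields, and `first_mismatch` are hypothetical names, and the verifier is passed in as a closure so that no real light client API is assumed.

```rust
/// Verdict names as they appear in this PR's TLA+ changes.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Success,
    NotEnoughTrust,
    FailedTrustingPeriod,
}

/// One input block together with the verdict expected for it.
/// The struct and its fields are hypothetical, for illustration only.
struct Step {
    height: u64,
    expected: Verdict,
}

/// Run `verify` on every step and return the index of the first step
/// whose actual verdict differs from the expected one, if any.
fn first_mismatch<F>(steps: &[Step], verify: F) -> Option<usize>
where
    F: Fn(&Step) -> Verdict,
{
    steps.iter().position(|step| verify(step) != step.expected)
}

fn main() {
    let steps = vec![
        Step { height: 2, expected: Verdict::Success },
        Step { height: 3, expected: Verdict::NotEnoughTrust },
    ];
    // A toy verifier that accepts everything disagrees with the second step.
    match first_mismatch(&steps, |_| Verdict::Success) {
        Some(i) => println!("mismatch at step {} (height {})", i, steps[i].height),
        None => println!("all verdicts match"),
    }
}
```

With one verdict per block, a failure points at a specific step instead of at the whole input array.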

…fy_single

…implementation In the current model (Blockchain_A_1.tla) the check InTrustingPeriod treats the edge case now = header.time + TRUSTING_PERIOD as valid, while the implementation treats it as invalid. Changed the model to follow the implementation.

shonfeder · 2020-08-21T19:43:14Z

testgen/src/command.rs

+        self.dir = Some(dir.to_owned());
+        self
+    }
+    /// Execute the command as a child process, and extract its status, stdout, stderr.


Could you say a few words about what this gives us that we wouldn't get by just using https://doc.rust-lang.org/std/process/struct.Output.html provided by command.output()?

Yes, you are right, this would shorten the code a bit. I was simply not aware of it at the time of writing. But I don't think it's too critical: in the end, it's the same functionality.

Ah! My concern in that case is mostly not about code length. Rather, I'm concerned with (1) supporting readability by using the commonly available idioms and constructs rather than requiring readers to learn bespoke constructs that just duplicate existing functionality, and (2) supporting maintainability by preventing the unnecessary addition of 80 lines of code to the code base. IME, both of those considerations matter enough that it is worth doing the right thing here.

Naturally, I'll defer to others who are more in tune with the standards for this repo if they feel differently. But I have a strong view on this, and if I were the main reviewer I would insist on the change.
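For reference, a minimal sketch of the standard-library route mentioned above: `Command::output()` spawns the child, waits for it, and returns an `Output` with `status`, `stdout`, and `stderr`. The helper name `run_in` is hypothetical and is not the wrapper from this PR; the example also assumes a POSIX `sh` is available on the PATH.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, Output};

/// Run `program` with `args`, optionally inside `dir`, and wait for it to exit.
/// `run_in` is a hypothetical helper name, not the API added in this PR.
fn run_in(program: &str, args: &[&str], dir: Option<&Path>) -> io::Result<Output> {
    let mut cmd = Command::new(program);
    cmd.args(args);
    if let Some(dir) = dir {
        cmd.current_dir(dir);
    }
    // `output()` captures stdout and stderr and returns them with the exit status.
    cmd.output()
}

fn main() -> io::Result<()> {
    // Assumes a POSIX `sh` on PATH; writes one line to each output stream.
    let out = run_in("sh", &["-c", "echo out; echo err 1>&2"], None)?;
    println!("success: {}", out.status.success());
    println!("stdout: {}", String::from_utf8_lossy(&out.stdout).trim());
    println!("stderr: {}", String::from_utf8_lossy(&out.stderr).trim());
    Ok(())
}
```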

shonfeder · 2020-08-21T19:46:39Z

Is it possible that this could be broken into smaller PRs? Some reasonable seeming divisions that occur to me while surveying the changes:

One PR for the command.rs and apalache.rs modules
One PR for changes to existing test harness
One PR for changes to the existing tests
One PR for changes to the TLA specs

I'm not actually sure if things could decouple that way, but it would enable much better quality reviews and digestion of the work you've done here. Even if it were only possible to break it into two or three parts, it would be a big help.

shonfeder · 2020-08-21T19:47:21Z

Also, it is probably worth fixing the clippy checks before a review :)

andrey-kuprianov · 2020-08-21T19:57:12Z

@shonfeder I do like your suggestion about breaking it into smaller PRs:) Not sure though I will be able to do it now, as I am on vacation till Sept 1...

As for the clippy check that fails, what they propose is actually incorrect. I.e. if you apply this you get a compiler error. I guess this points to a bug in clippy.

shonfeder · 2020-08-21T20:23:36Z

> As for the clippy check that fails, what they propose is actually incorrect. I.e. if you apply this you get a compiler error. I guess this points to a bug in clippy.

I think the proper thing to do in such cases is to silence the linter check for the affected block of code https://github.com/rust-lang/rust-clippy#allowingdenying-lints (and I guess file an issue with clippy if it is also broken behavior, but that's orthogonal :).
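A minimal sketch of silencing a single lint for one item is shown below. The thread does not name the lint that misfired, so `clippy::needless_range_loop` is used purely as a stand-in, and `sum_indexed` is a hypothetical function.

```rust
// The lint name is an assumption for illustration: the actual failing lint
// is not named in this thread.
#[allow(clippy::needless_range_loop)]
fn sum_indexed(values: &[u64]) -> u64 {
    let mut total = 0;
    // Clippy would normally suggest iterating over `values` directly here.
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

fn main() {
    println!("{}", sum_indexed(&[1, 2, 3])); // prints 6
}
```

Scoping the attribute to the affected function keeps the lint active for the rest of the crate.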

josef-widder

I only looked at the blockchain and lightclient TLA files. It looks like some changes (I commented on them) were made to match the code, but as a result the TLA+ files may differ from the English spec. We should create an issue that highlights the changes needed in the English spec to keep everything consistent.

josef-widder · 2020-08-25T11:19:05Z

docs/spec/lightclient/verification/Blockchain_A_1.tla

@@ -75,7 +75,7 @@ LBT == [header |-> BT, Commits |-> {NT}]

 (* the header is still within the trusting period *)
 InTrustingPeriod(header) ==
-    now <= header.time + TRUSTING_PERIOD
+    now < header.time + TRUSTING_PERIOD


Does this change come from the implementation or the English spec? Are those consistent?

This comes from the implementation, as otherwise I get a testing error: the spec behavior is different from what we have in the implementation. The English spec contains the condition trusted.Header.Time > now - trustingPeriod, so I guess this means that the English spec and the implementation agree, but the TLA+ spec is wrong in that respect; could you please verify that this is true?

So now we do model-based testing as conformance test between English spec, TLA+, and the implementation. Amazing 😉
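The boundary case at issue can be illustrated with a small sketch. It is not the implementation's code: times are plain `u64` seconds here, which is an assumption, and both function names are hypothetical.

```rust
/// Model check before this PR: the boundary instant still counts as valid.
fn in_trusting_period_old(now: u64, header_time: u64, trusting_period: u64) -> bool {
    now <= header_time.saturating_add(trusting_period)
}

/// Model check after this PR: strict inequality, so the boundary is invalid.
/// This matches the English spec's `trusted.Header.Time > now - trustingPeriod`,
/// rearranged so that the subtraction cannot underflow on unsigned values.
fn in_trusting_period(now: u64, header_time: u64, trusting_period: u64) -> bool {
    now < header_time.saturating_add(trusting_period)
}

fn main() {
    let (header_time, trusting_period) = (100, 50);
    let boundary = header_time + trusting_period; // now = header.time + TRUSTING_PERIOD
    println!("old model at boundary: {}", in_trusting_period_old(boundary, header_time, trusting_period)); // true
    println!("new model at boundary: {}", in_trusting_period(boundary, header_time, trusting_period)); // false
}
```

The two versions differ only at the single instant `now = header.time + TRUSTING_PERIOD`, which is exactly the case a generated test has to hit to expose the discrepancy.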

josef-widder · 2020-08-25T11:20:43Z

docs/spec/lightclient/verification/Lightclient_A_1.tla

+         THEN "SUCCESS"
+         ELSE "NOT_ENOUGH_TRUST"


I think OK and CANNOT_VERIFY are used in the English spec. Are SUCCESS and NOT_ENOUGH_TRUST used in the implementation? If yes, I guess we should adapt the English spec.

I've taken the verdicts from the implementation. There is another issue I've run into, namely that according to the TLA+ constraints the verdict FAILED_TRUSTING_PERIOD can never be produced. At least it's impossible to obtain a test with such a verdict from Apalache, and my quick manual check of the TLA+ constraints tells me it's indeed logically impossible. Could you please verify that? If it's true, it may make sense to remove this verdict. I would actually update the English spec as well, as I find the implementation names a bit more informative.

I will have a look with @konnov

I don't see why FAILED_TRUSTING_PERIOD cannot be produced. It is produced in line 115. However, it is not propagated into the state. Would you like to see it in the state?

This PR is closed; let's discuss it on another PR: 546

josef-widder · 2020-08-25T11:24:03Z

docs/spec/lightclient/verification/Blockchain_A_1.tla

@@ -110,7 +110,7 @@ FaultAssumption(pFaultyNodes, pNow, pBlockchain) ==
 (* Can a block be produced by a correct peer, or an authenticated Byzantine peer *)
 IsLightBlockAllowedByDigitalSignatures(ht, block) == 
    \/ block.header = blockchain[ht] \* signed by correct and faulty (maybe)
-    \/ block.Commits \subseteq Faulty /\ block.header.height = ht \* signed only by faulty
+    \/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time > 0 \* signed only by faulty


I guess faulty processes may be allowed to write any time, also negative. Or is the time stored in some unsigned domain in the implementation?

It is indeed stored and serialized/deserialized as an unsigned integer, so signed integers do not make sense and only add confusion, I am afraid.

I guess in the English spec we implicitly also assume positive times. Perhaps we should clarify it there explicitly.

konnov

Nice trick with the history variable. However, instead of directly modifying Lightclient_A_1.tla, I would recommend writing another module that extends Lightclient_A_1 and monitors the light client with the history variable.

andrey-kuprianov · 2020-08-28T19:26:29Z

It probably does make a lot of sense to split this PR into multiple ones, as @shonfeder proposed; I will do that when I am back from my vacation on Sept. 2. I am just noting it here so that you do not spend your time reviewing this one.

ebuchman · 2020-08-28T22:36:16Z

> It probably does make a lot of sense to split this PR into multiple ones, as @shonfeder proposed; I will do that when I am back from my vacation on Sept. 2. I am just noting it here so that you do not spend your time reviewing this one.

Let's close this in the meantime then. Thanks for breaking it up!

shonfeder · 2020-09-08T17:00:19Z

Thank you for the PR refactor! Will dig into reviewing whatever is still open ASAP. 🙏

#547)

* #414: testgen tester -- utilities to run multiple tests with logs and reports
* #547 add missing file updates from #529
* fix merge typo
* TestEnv: change path parameters into AsRef<Path>
* change TestEnv::full_path to return PathBuf
* apply simplifications suggested by Romain
* apply simplification from Romain
* account for WOW Romain's suggestion on RefUnwindSafe
* address Romain's suggestion on TestEnv::cleanup
* cargo clippy
* update CHANGELOG.md

* #414: testgen tester -- utilities to run multiple tests with logs and reports
* 14: add testgen commands, in particular to call apalache and jsonatr
* #547 add missing file updates from #529
* fix merge typo
* remove ref to time: it's in another PR
* fix clippy warning
* TestEnv: change path parameters into AsRef<Path>
* change TestEnv::full_path to return PathBuf
* apply simplifications suggested by Romain
* apply simplification from Romain
* account for WOW Romain's suggestion on RefUnwindSafe
* address Romain's suggestion on TestEnv::cleanup
* cargo clippy
* update CHANGELOG.md
* addressed Romains's suggestions

* #414: testgen tester -- utilities to run multiple tests with logs and reports
* 14: add testgen commands, in particular to call apalache and jsonatr
* #414: add model-based test driver to LightClient tests
* add talk abstract
* #547 add missing file updates from #529
* fix merge typo
* remove ref to time: it's in another PR
* fix clippy warning
* fix merging typo
* cargo fmt

andrey-kuprianov added 30 commits July 31, 2020 11:33

#414: factor out common parts from tests/lite.rs into tests/lite_tests

d9c6b5e

#414: lite_tests: add Command/CommandRun wrapper for running external…

0ac86a3

… commands (e.g. apalache, jsonatr)

#414: add history variables to Lightclient_A_1.tla

e435ae3

#414: add use default index in testgen vote

f0e7599

#414: more structure for test cases

08450a7

#414: simplify running external commands

04ab1aa

#414: add simple TLA+ test

f7b3165

#414: run apalache on TLA+ test; fix Command spawn errors

cb478f2

#414: fix small typos

d65a72e

#414: refactor: add commonly used test utils

a720487

#414: more refactoring

49ef36d

#414: add tests/utils/jsonatr.rs

b80159c

#414: add tests/utils/jsonatr-lib

6a3831b

#414: full model-based test TLA+ -> Apalache -> Jsonatr -> lite::veri…

e36ba00

…fy_single

#414: check for tendermint-testgen in PATH

368975c

#414: testgen: add absent votes in the commit

da11fc1

#414: testgen: add time command; make timestamps abstract

8ff0f93

#414: make lite test timestamps abstract

4715e18

#414: LightClient_A_1: add time to history variables

d223773

#414: Lightclient_A_1.tla: store whole state sequence in history

0df4320

#414: more TLA+ lite tests

e9a099f

#414: Blockchain_A_1.tla: disallow generating headers with negative time

f04a163

#414 adjust static test to the new format

b860976

#414 adapt jsonatr-lib to history variables in the lite model

119088c

#414 testgen header: handle the case of empty validators

9932a1d

#414 lite-model-based: account for apalache run w/o counterexample

249a24f

#414: auto-run model-based test batches

0ff0b99

#414: add first test batch

2f03aef

andrey-kuprianov requested review from konnov, adizere, ebuchman, liamsi, shonfeder, josef-widder and Shivani912 August 21, 2020 10:51

andrey-kuprianov self-assigned this Aug 21, 2020

#414: remove redundant files

dbee1c9

shonfeder reviewed Aug 21, 2020

josef-widder reviewed Aug 25, 2020

konnov reviewed Aug 25, 2020

ebuchman closed this Aug 28, 2020

andrey-kuprianov added a commit that referenced this pull request Sep 4, 2020

#547 add missing file updates from #529

29807a4

andrey-kuprianov commented Aug 21, 2020 (edited)

shonfeder Aug 21, 2020

andrey-kuprianov Aug 21, 2020

shonfeder Aug 21, 2020 (edited)

shonfeder commented Aug 21, 2020

shonfeder commented Aug 21, 2020

andrey-kuprianov commented Aug 21, 2020 (edited)

shonfeder commented Aug 21, 2020 (edited)

josef-widder left a comment

josef-widder Aug 25, 2020

andrey-kuprianov Aug 26, 2020

josef-widder Aug 27, 2020

josef-widder Aug 27, 2020

josef-widder Aug 25, 2020

andrey-kuprianov Aug 26, 2020 (edited)

josef-widder Aug 27, 2020

konnov Sep 11, 2020

andrey-kuprianov Sep 11, 2020

josef-widder Aug 25, 2020

andrey-kuprianov Aug 26, 2020

josef-widder Aug 27, 2020

konnov left a comment

andrey-kuprianov commented Aug 28, 2020

ebuchman commented Aug 28, 2020

shonfeder commented Sep 8, 2020
