Changes to TLA+ specs for model-based testing #546

andrey-kuprianov · 2020-09-02T09:43:32Z

Part 1 of #529. See also #414. Changes to the TLA+ specs needed for model-based testing

introduced history variable into the LightClient spec
fixed a couple of divergences between the spec and the implementation, that resulted in failing model-based tests.
Referenced an issue explaining the need for the change
Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGELOG.md

andrey-kuprianov · 2020-09-02T09:47:03Z

Here is the relevant reviews from the original PR: #529 (review)

andrey-kuprianov · 2020-09-11T08:52:42Z

The discussion from the original PR:

I think OK and CANNOT_VERIFY are used in the English spec. Are SUCCESS and NOT_ENOUGH_TRUST used in the implementation? If yes, we should adapt I guess the English spec.

@andrey-kuprianov

I've taken the verdicts from the implementation. There is another issue I've got, namely that according to the TLA+ constraints the verdict FAILED_TRUSTING_PERIOD cannot ever be produced. At least it's impossible to obtain a test with such verdict from Apalache, and my quick manual check on the TLA+ constraints tells me it's indeed logically impossible. Could you please verify that? In case it's true, it may be makes sense to remove such a verdict. And I would actually update the English spec as well, as I find the implementation names a bit more informative

@konnov

I don't see why FAILED_TRUSTING_PERIOD cannot be produced. It is produced in line 115. However, it is not propagated into state. Do you like to see it in state?

andrey-kuprianov · 2020-09-11T08:54:14Z

The problem I have is with the following model-based test:

\* This test never produces a counterexample; so the model should be corrected
TestFailedTrustingPeriod ==
   \E s \in DOMAIN history :
      history[s].verdict = "FAILED_TRUSTING_PERIOD"

andrey-kuprianov · 2020-09-11T08:55:18Z

@konnov The model is such that this verdict is logically excluded

konnov · 2020-09-11T09:29:12Z

@konnov The model is such that this verdict is logically excluded

We have to investigate this. Can it be that this verdict is overwritten by another block later? Since you are only checking the history post-factum, this can happen.

andrey-kuprianov · 2020-09-11T09:41:24Z

I am pretty sure it is excluded by the combination of logical constraints in the model... yes, let's investigate.

andrey-kuprianov · 2020-09-11T11:28:00Z

docs/spec/lightclient/verification/Blockchain_A_1.tla

@@ -110,7 +110,7 @@ FaultAssumption(pFaultyNodes, pNow, pBlockchain) ==
 (* Can a block be produced by a correct peer, or an authenticated Byzantine peer *)
 IsLightBlockAllowedByDigitalSignatures(ht, block) == 
    \/ block.header = blockchain[ht] \* signed by correct and faulty (maybe)
-    \/ block.Commits \subseteq Faulty /\ block.header.height = ht \* signed only by faulty
+    \/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time > 0 \* signed only by faulty


Suggested change

\/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time > 0 \* signed only by faulty

\/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time >= 0 \* signed only by faulty

andrey-kuprianov · 2020-09-11T11:31:00Z

docs/spec/lightclient/verification/Lightclient_A_1.tla

-  /\ thdr.time <= uhdr.time
+  /\ thdr.time < uhdr.time
+     \* the untrusted block is not from the future
+  /\ uhdr.time <= now


Suggested change

/\ uhdr.time <= now

/\ uhdr.time < now + CLOCK_DRIFT

shonfeder

Just reading the PR for my own edification and to start to get some exposure to our TLA specs.

I offered one naive suggestion about wording asked one question to help my reading of the code :)

shonfeder · 2020-09-11T16:31:09Z

docs/spec/lightclient/verification/Blockchain_A_1.tla

@@ -75,7 +75,7 @@ LBT == [header |-> BT, Commits |-> {NT}]

 (* the header is still within the trusting period *)
 InTrustingPeriod(header) ==
-    now <= header.time + TRUSTING_PERIOD
+    now < header.time + TRUSTING_PERIOD


I see the <= actually appears to contradict the english spec which says

- *LightStore* always contains a verified header whose age is less than the trusting period,

So it seems clear that what we ant is "less than", but perhaps this inconsistency was due in part to the wording and terminology here. E.g., in this paragraph

In order to ensure liveness, LightStore always needs to contain a
verified (or initially trusted) header whose time is within the
trusting period.

"trust period" is spoken of as a duration that the header needs to be within not as a limit point that the header time needs to be less than. I wonder if calling this TRUST_PERIOD_LIMIT might help prevent such confusion i the future. I suspect a problem with the current wording may arise due to "trusting period" sounds like a span within which things are valid, when it's actually meant to be the outer limit beyond which things are invalid. iiuc, the trust period is that span of time up to but not including the limit.

You are right: <= does contradict the English spec, so this is the TLA+ modeling artifact. That there can be ambiguities in the English spec, that's of course possible -- English is not a formal language after all;) I still think that our English spec does an amazing job of clarifying the requirements at the semi-formal (and as close to formal as possible) level. But exactly because of this non-strictness of the English language the TLA+ spec should, in my opinion, be the ultimate source of truth for all parties.

One of the things we were discussing with @josef-widder and @konnov is that now, when we have the three artifacts: the English spec, the TLA+ spec, and the implementation, and also now that we have the MBT, which serves as the "missing link" helping to synchronize them, we need to establish a process (in the VDD sense) that will help us to resolve the conflicts.

In this PR I've fixed the TLA+ spec, because it was the most easy way for me to go, but it's not clear at all whether that's the right thing to do. We need to figure out a joint understanding between the research and the development teams on how to proceed in such cases.

In this case, given the larger context, I don't believe it matters much whether we use < or <= here since we are dealing with time and already assume clocks can drift some amount. The important thing is that TRUST_PERIOD is a fraction of the UNBONDING_PERIOD. This ensures that light clients have the duration of UNBONDING_PERIOD - TRUST_PERIOD to try and submit evidence to the chain and get faulty validators punished. And since all these periods are on the order of days/weeks, the < vs. <= here shouldn't make a difference in the end.

Also while UNBONDING_PERIOD is a global consensus parameter (all nodes agree on), the TRUST_PERIOD is not, it's a local subjective parameter that clients can set depending on their risk tolerance. So again, the plank second difference between < and <= shouldn't matter there.

Of course that might be an unsatisfactory answer - I don't mean to be flippant about it and we should certainly be consistent and precise, but it's important to keep the practical context in mind so we know what details are most important to focus on. Of course this is just input from my feeble human mind so possibly machine will prove that it does matter, but thought it was worth providing the extra context here.

Thanks for the feedback! I totally agree, from the practical perspective these differences between < and <= do not matter at all in this particular context, so this case we could fix however we like, and it would make no real difference. There are two things to keep in mind though:

scientifically, we can't be approximately precise. And I would like to think about our specs as being precise, so we should clarify and fix any ambiguities.

this particular example is only a manifestation of a general problem about the artifacts we are having, and I am sure we will encounter more instances of it, with probably higher level of severity. So it's better if we are prepared for that.

Also I tend to perceive us having this problem as rather a good thing, demonstrating that from now on we will have a living specification, which is in sync with the implementation.

This is only the input from another feeble human mind, just a bit different;) And I am sure fixing this will not take much of either our focus or time.

shonfeder · 2020-09-11T16:54:08Z

docs/spec/lightclient/verification/Lightclient_A_1.tla

+   - verdict: the light client verdict for the `current` block in this state
+*)
+VARIABLE
+  history


Just a question for my learning, I don't see the history function actually getting used anywhere except on https://github.com/informalsystems/tendermint-rs/pull/546/files#diff-04b72e322e067c19d149fbabcfe5a830R272 where is is UNCHANGED. Could you maybe point out to me where we actually apply the function to make use of the records in its codomain?

It's used here:

LightTests.tla to describe the TLA+ tests,

and here apalache-tendermint.json to extract the information from the Apalache counterexample.

It was part of another PR, which is already merged.

Also, as we discussed with @josef-widder and @konnov, the way I introduce the history variable needs to be changed, because this feature is not needed for all users of the spec. So we need to figure out how to modularize the spec, and to allow to add the history as an extension.

Ah, got it. Thanks very much for the pointers!

andrey-kuprianov · 2020-09-22T09:56:41Z

@josef-widder @konnov Here is the logical explanation why "FAILED_TRUSTING_PERIOD" verdict is impossible:

For it to be issued the following should hold:

ValidAndVerifiedPre(trusted, untrusted)
~BC!InTrustingPeriod(untrusted.header)

Denoting trusted.header.time as t, and untrusted.header.time as u, we get these constraints

now < t + TRUSTING_PERIOD
t < u
u <= now
now >= u + TRUSTING_PERIOD

which, combined, give this inequality:

u <= now - TRUSTING_PERIOD < t < u <= now

which is impossible.

andrey-kuprianov · 2020-09-22T10:02:32Z

To be more precise, this is impossible taking into account the changes to the specs I propose, because those changes make strict inequalities out of two non-strict ones. Without those changes this verdict would be possible with a single valuation, namely:

u = now - TRUSTING_PERIOD = t = u

konnov · 2020-09-22T10:21:54Z

Looks like we have to add clock drift, to make sense of error condition.

andrey-kuprianov · 2020-09-22T10:38:39Z

yes, as we've discussed here

Also, this will make a lot of sense from the implementation and testing points of view, as the implementation does have it

andrey-kuprianov · 2020-09-24T08:16:35Z

@konnov @josef-widder I've implemented our discussions, in particular factored out building the history chain from the main LightClient spec into the testing spec. Could you please review the changes to *.tla files in this PR?

For now this is still shown as the changes to the TLA files from docs/spec, to facilitate the review. After you review them, I will rename them, as we discussed, as Blockchain_0_0_2_draft.tla and Lightclient_0_0_2_draft.tla, place into the local LightClient folder for MBT-purposes, and open an issue against tendermint/spec repo.

Thanks!

josef-widder · 2020-09-24T13:22:33Z

Files names should be like Blockchain_002_draft.tla.

konnov

The changes look good to me. We should follow up with adding a local clock and CLOCK_DRIFT in tendermint/170.

docs/spec/lightclient/verification/Lightclient_A_1.tla

josef-widder

Looks great. Thanks!

#414 changes to TLA+ specs for model-based testing

26fb58e

andrey-kuprianov added light-client Issues/features which involve the light client verification tests labels Sep 2, 2020

andrey-kuprianov requested review from konnov and josef-widder September 2, 2020 09:43

andrey-kuprianov self-assigned this Sep 2, 2020

andrey-kuprianov changed the title ~~#414 changes to TLA+ specs for model-based testing~~ Changes to TLA+ specs for model-based testing Sep 2, 2020

Lightclient_A_1.tla: add check for header from the future

2a2ebee

andrey-kuprianov mentioned this pull request Sep 11, 2020

Light Client model-based testing #529

Closed

5 tasks

andrey-kuprianov commented Sep 11, 2020

View reviewed changes

andrey-kuprianov linked an issue Sep 11, 2020 that may be closed by this pull request

Develop a set of model-based tests for the LightClient #414

Closed

shonfeder reviewed Sep 11, 2020

View reviewed changes

andrey-kuprianov mentioned this pull request Sep 14, 2020

Sort validators by voting power #556

Closed

5 tasks

andrey-kuprianov added 5 commits September 23, 2020 11:58

Merge branch 'master' into andrey/mbt-spec-changes

b479d59

#414 fix testgen tester: wrong dir was removed

85b8814

#414 move history out of LightClient.tla into LightTests.tla

97844a6

#414 adapt apalache test runner to changes in specs

45ab864

#414 minor changes to model_based.rs

ef474ff

andrey-kuprianov added 5 commits September 24, 2020 10:56

#579 fix validators sorting according to v.0.34 requirements

7edec1e

#579 update CHANGELOG.md

9c55866

Merge branch 'andrey/valset_ordering' into andrey/mbt-spec-changes

702c87b

#579 sort validators correctly also in testgen

43e4051

Merge branch 'andrey/valset_ordering' into andrey/mbt-spec-changes

1f02ef0

josef-widder mentioned this pull request Sep 24, 2020

tla+ add local time to light client verifier tendermint/spec#170

Closed

konnov previously approved these changes Sep 24, 2020

View reviewed changes

docs/spec/lightclient/verification/Lightclient_A_1.tla Outdated Show resolved Hide resolved

docs/spec/lightclient/verification/Lightclient_A_1.tla Outdated Show resolved Hide resolved

docs/spec/lightclient/verification/Lightclient_A_1.tla Outdated Show resolved Hide resolved

andrey-kuprianov added 3 commits September 24, 2020 15:59

#414 account for Igor's naming suggestions

6ab98d3

#414 copy and rename into Blockchain/Lightclient_002_draft.tla

c1a421b

Merge branch 'master' into andrey/mbt-spec-changes

3d8c480

andrey-kuprianov dismissed konnov’s stale review via 3d8c480 September 24, 2020 14:32

josef-widder previously approved these changes Sep 24, 2020

View reviewed changes

#414 restore original Blockchain/Lightclient_A_1.tla

65bd05a

andrey-kuprianov dismissed josef-widder’s stale review via 65bd05a September 24, 2020 16:21

josef-widder self-requested a review September 24, 2020 16:25

josef-widder approved these changes Sep 24, 2020

View reviewed changes

andrey-kuprianov merged commit 856084e into master Sep 24, 2020

andrey-kuprianov deleted the andrey/mbt-spec-changes branch September 24, 2020 16:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changes to TLA+ specs for model-based testing #546

Changes to TLA+ specs for model-based testing #546

andrey-kuprianov commented Sep 2, 2020 •

edited

Loading

andrey-kuprianov commented Sep 2, 2020

andrey-kuprianov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

konnov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

andrey-kuprianov Sep 11, 2020

andrey-kuprianov Sep 11, 2020

shonfeder left a comment

shonfeder Sep 11, 2020

andrey-kuprianov Sep 11, 2020

ebuchman Sep 12, 2020 •

edited

Loading

andrey-kuprianov Sep 12, 2020

shonfeder Sep 11, 2020

andrey-kuprianov Sep 11, 2020

andrey-kuprianov Sep 11, 2020 •

edited

Loading

shonfeder Sep 11, 2020

andrey-kuprianov commented Sep 22, 2020 •

edited

Loading

andrey-kuprianov commented Sep 22, 2020

konnov commented Sep 22, 2020

andrey-kuprianov commented Sep 22, 2020

andrey-kuprianov commented Sep 24, 2020

josef-widder commented Sep 24, 2020

konnov left a comment

josef-widder left a comment

	\/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time > 0 \* signed only by faulty
	\/ block.Commits \subseteq Faulty /\ block.header.height = ht /\ block.header.time >= 0 \* signed only by faulty

Changes to TLA+ specs for model-based testing #546

Changes to TLA+ specs for model-based testing #546

Conversation

andrey-kuprianov commented Sep 2, 2020 • edited Loading

andrey-kuprianov commented Sep 2, 2020

andrey-kuprianov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

konnov commented Sep 11, 2020

andrey-kuprianov commented Sep 11, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shonfeder left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ebuchman Sep 12, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrey-kuprianov Sep 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrey-kuprianov commented Sep 22, 2020 • edited Loading

andrey-kuprianov commented Sep 22, 2020

konnov commented Sep 22, 2020

andrey-kuprianov commented Sep 22, 2020

andrey-kuprianov commented Sep 24, 2020

josef-widder commented Sep 24, 2020

konnov left a comment

Choose a reason for hiding this comment

josef-widder left a comment

Choose a reason for hiding this comment

andrey-kuprianov commented Sep 2, 2020 •

edited

Loading

ebuchman Sep 12, 2020 •

edited

Loading

andrey-kuprianov Sep 11, 2020 •

edited

Loading

andrey-kuprianov commented Sep 22, 2020 •

edited

Loading