forked from cosmos/ibc-rs
-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Disclosure log to record ICS3 liveness problems (cosmos#83)
* Added disclosure log with 2 items. * Fixed deadlock confusion; more details for each record * More explanation for the second record in the disclosure log * Apply suggestions from code review Co-authored-by: Anca Zamfir <[email protected]> * More thorough explanation and aligned record 2 with the default TLA+ spec. * Aligned the trace w/ the most recent version of the spec. * Apply suggestions from code review Co-authored-by: Anca Zamfir <[email protected]> * Update docs/disclosure-log.md Co-authored-by: Anca Zamfir <[email protected]> * Removed explicit reference to proofs (at h-1 vs. h) from item 2. * Update docs/disclosure-log.md Co-authored-by: Anca Zamfir <[email protected]> Co-authored-by: Anca Zamfir <[email protected]>
- Loading branch information
1 parent
50d6e66
commit e1b7894
Showing
1 changed file
with
96 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Disclosure Log for IBC Protocols | ||
|
||
This document is a record of all the bugs or issues we uncovered while specifying & formally verifying the IBC protocols. | ||
|
||
|
||
### 1. ICS3 liveness problem due to ICS018 relayer algorithm | ||
|
||
The algorithm for relaying connection handshake datagrams of type `ConnOpenTry`does not handle the situation when both chains are in state `INIT`. | ||
The current relayer algorithm in [ICS018](https://github.com/cosmos/ics/tree/master/spec/ics-018-relayer-algorithms) specifies that the `ConnOpenTry` datagram should be relayed only if one of the chains is in state `INIT` and the other chain is uninitialized (see the snippet below); this is not enough for guaranteeing liveness of the connection handshake protocol (ICS04). | ||
|
||
``` | ||
if (localEnd.state === INIT && remoteEnd === null) | ||
``` | ||
|
||
The correct code should include both the cases when a single chain is in state `INIT`, as well as the case when both chains are in state `INIT`, as specified here: [Relayer.tla](https://github.com/informalsystems/ibc-rs/blob/master/docs/spec/relayer/Relayer.tla#L174) | ||
This fix only concerns the relayer algorithm ICS018. | ||
|
||
##### Channel handshake (ICS4) liveness problem | ||
|
||
The same issue (and fix) seems to exist for the channel handshake datagrams. | ||
|
||
|
||
### 2. ICS3 liveness problem due to `UpdateClient` semantics | ||
|
||
This problem is not specific to the connection handshake protocol (ICS3) itself, but is a bug in the way the relayers use the `UpdateClient` action. | ||
We classify this under ICS3, however, since this is the context where we discovered the problem. | ||
The TLA+ spec we have for depicting this liveness problem is for the ICS3 protocol. | ||
|
||
##### Problem statement | ||
|
||
Related issues: [#71](https://github.com/informalsystems/ibc-rs/issues/71) and [#61](https://github.com/informalsystems/ibc-rs/issues/61). | ||
The problem is more thorougly described in #61, but for completeness sake we restated it here in a compact form. | ||
|
||
The liveness property that a correct relayer should provide is eventual delivery. | ||
Assuming some source chain `A`, destination chain `B`, and an IBC item `X` (e.g., connection, channel, or packet) on chain `A`, we can define this property as follows: | ||
|
||
> For any IBC item `X` on chain `A` destined for chain `B`, eventually, a correct relayer will submit item `X` to chain `B`. | ||
This is difficult to guarantee in practice, however. | ||
Intuitively, the difficulty arises because of a combination of two factors: | ||
|
||
1. __Proof requirement:__ For chain `B` to accept item `X`, it must verify the authenticity of this item; this is done via the light client that chain `B` maintains. | ||
Given an item `X` and a commitment proof for `X` constructed at height `h-1`, the light client requires the consensus state at height `h` that includes that commitment root required for verification. | ||
|
||
2. __Concurrency:__ Different relayers may update the same light client. | ||
Suppose a relayer `r1` wants to submit a consensus state at height `h`. | ||
In the meantime, however, another relayer `r2` may update this same light client to height `h'`. | ||
Assume `h'` is bigger than `h`. | ||
If the light client disallows updates with heights smaller than the current height `h'` then `r1`'s update fails . | ||
Consequently, the relayer will be unable to submit consensus state at height `h`. | ||
|
||
To ensure eventual delivery, relayer `r1` would need to retry submitting item `X`, that is: resubmit the consensus state at a larger height (e.g., at `h'`) followed by the message that includes the proof for `X` (e.g., at `h'-1`). | ||
This retry mechanism was adoped as a solution for the [current relayer implementation](https://github.com/informalsystems/ibc-rs/blob/master/docs/architecture/adr-002-ibc-relayer.md#ibc-client-consensus-state-vs-relayer-light-client-states-vs-chain-states). | ||
Note that it is also possible for relayer `r2` to have submitted the same item `X` successfully; in this case, the liveness problem does not actually surface. | ||
|
||
|
||
##### TLA+ trace | ||
|
||
To obtain an execution in TLA+ that depicts the above liveness problem, it is sufficient to enable the `Concurrency` flag in the L2 default TLA+ spec for ICS3. | ||
This spec is located in [spec/connection-handshake/L2-tla/](./spec/connection-handshake/L2-tla/). | ||
In this spec we make a few simplifications compared to the real system, most importantly: to verify an item at height `h`, a light client can use the consensus state at the same height `h` (no need for smaller height `h-1`). | ||
Below we summarize the parameters as well as the sequence of actions that lead to the liveness problem. | ||
|
||
###### Parameters: | ||
|
||
- `MaxBufLen <- 2` | ||
- `MaxHeight <- 8` | ||
- `Concurrency <- TRUE` | ||
- Behavior spec: Temporal formula `Spec` | ||
- Check for `Deadlock`, Invariants `TypeInvariant` and `ConsistencyProperty`, as well as Property `Termination` | ||
|
||
###### Trace: | ||
|
||
Both chains `A` and `B` start at height `1`, and the light client on each chain has consensus state for height `1`. | ||
|
||
1. The environment submits a `ICS3MsgInit` message to chain `A`. | ||
|
||
2. Chain `A` processes the `ICS3MsgInit`, advances to height `2`, and prepares a `ICS3MsgTry` message destined for chain `B`. | ||
The proof in this message is for height `2`. | ||
|
||
3. The environment triggers the `AdvanceChainHeight` action of chain `B`, so this chain transitions from height `1` to height `2`. | ||
|
||
4. The environment triggers the `AdvanceChainHeight` action of chain `A`, so this chain transitions from height `2` to height `3`. | ||
|
||
5. The environment triggers the `AdvanceChainHeight` action of chain `A`, so this chain transitions from height `3` to height `4`. | ||
|
||
6. __Concurrency:__ The environment triggers the `UpdateClient` action on chain `B`: the light client on this chain is updated with height `4` (that is, the latest height of chain `A`), and chain `B` also transitions from height `2` to height `3`. | ||
|
||
7. The environment passes (i.e., relays) the `ICS3MsgTry` message to chain `B`. | ||
Recall that this message has proofs for height `2`; consenquently, the environment also attempts to trigger `UpdateClient` action on chain `B` for consensus state at height `2`. | ||
This action does not enable because the light client on `B` has a more recent consensus state for height `4`. | ||
|
||
8. Chain `B` attempts to process the `ICS3MsgTry` but is unable to verify its authenticity, since the light client on this chain does not have the required consensus state at height `2`. | ||
Chain `B` drops this message. | ||
|
||
From this point on, the model stutters, i.e., is unable to progress further in the connection handshake protocol. |